fix(run): inject API token for LB cross-endpoint calls #347
flash run dispatches LB routes to a remote worker via lb_execute, which bypasses create_resource_from_manifest, the path that injects RUNPOD_API_KEY in flash deploy. _inject_runtime_template_vars only injected the token for QB endpoints, so an LB endpoint making a cross-endpoint call (e.g. the pipeline example) ran without a token and failed with HTTP 401 "no token provided". Inject RUNPOD_API_KEY for any endpoint with makes_remote_calls=True regardless of type, keeping flash run and flash deploy symmetric. The injection is idempotent: the existing-key guard skips it when the manifest path already populated env. Fixes SLS-336.
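The guard described above can be illustrated with a minimal sketch. This is not the code in runpod_flash: `inject_api_key` and its parameters are hypothetical names chosen for illustration, and the token values are placeholders. It only assumes what the description states: the key is added for any endpoint that makes remote calls, and an already-present key is left alone.

```python
from typing import Optional


def inject_api_key(env: dict, makes_remote_calls: bool, api_key: Optional[str]) -> dict:
    """Return a copy of env with RUNPOD_API_KEY added when needed and absent.

    Hypothetical stand-in for the injection step; not the library's API.
    """
    result = dict(env)
    # Endpoint type is deliberately not consulted: QB and LB are treated alike.
    if makes_remote_calls and api_key and "RUNPOD_API_KEY" not in result:
        result["RUNPOD_API_KEY"] = api_key
    return result


# flash run path (assumed): env arrives without a token, so one is injected.
run_env = inject_api_key({}, makes_remote_calls=True, api_key="token-from-local-env")

# flash deploy path (assumed): the manifest already populated the key,
# so the existing-key guard skips the injection.
deploy_env = inject_api_key(
    {"RUNPOD_API_KEY": "token-from-manifest"},
    makes_remote_calls=True,
    api_key="token-from-local-env",
)
```

Because the guard checks for the key rather than the code path, running the injection after the manifest path is harmless, which is what keeps the two commands symmetric.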
Promptless prepared a documentation update related to this change. Triggered by flash PR #347 (fix: inject API token for LB cross-endpoint calls, SLS-336). Since this fix makes cross-endpoint (pipeline) calls work under [...]
Review: Note that cross-endpoint calls work locally under flash dev/run
Pull request overview
Fixes a flash run vs flash deploy behavior gap where load-balanced (LB) endpoints that make cross-endpoint calls were missing RUNPOD_API_KEY, causing inter-endpoint requests to fail with 401 "no token provided".
Changes:
- Updated ServerlessResource._inject_runtime_template_vars() to inject RUNPOD_API_KEY for any endpoint type when _check_makes_remote_calls() is true, not just QB endpoints.
- Added a unit regression test ensuring LB deployments inject RUNPOD_API_KEY when remote calls are enabled.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/runpod_flash/core/resources/serverless.py | Hoists API key injection out of the QB-only branch so LB endpoints provisioned via flash run also receive RUNPOD_API_KEY when needed. |
| tests/unit/resources/test_serverless.py | Adds regression coverage for LB deploy path injecting RUNPOD_API_KEY when makes_remote_calls=True. |
Live smoke test: passed
Ran [...] All four provisioned [...]
Summary
Under flash run, an LB endpoint that calls another endpoint (the pipeline/orchestration pattern) failed with 401 "no token provided" on the inter-endpoint call. flash deploy worked, so this was a live↔deploy asymmetry.
Root cause
flash run dispatches an LB route to a remote worker via lb_execute → ResourceManager.get_or_deploy_resource → _do_deploy → _inject_runtime_template_vars. That method only injected RUNPOD_API_KEY in its QB branch; the LB branch injected only FLASH_MODULE_PATH. So the LB worker running the route had no token and its cross-endpoint call returned 401.
flash deploy works because it builds resources through create_resource_from_manifest (runtime/resource_provisioner.py), which injects RUNPOD_API_KEY for any resource with makes_remote_calls=True. lb_execute bypasses that path entirely.
Fix
Hoist the RUNPOD_API_KEY injection out of the QB-only branch in _inject_runtime_template_vars so it runs for any endpoint where _check_makes_remote_calls() is true, regardless of type. LB endpoints still also get FLASH_MODULE_PATH. The injection is idempotent: the existing "RUNPOD_API_KEY" not in env_dict guard skips it when the manifest path already populated env, so flash deploy behavior is unchanged.
Test plan
- test_do_deploy_lb_injects_api_key_when_makes_remote_calls (TDD: verified RED before fix, GREEN after).
- make quality-check passes: ruff format + lint clean, full suite green, coverage 85.99% (≥65%).
- flash run in 01_getting_started/03_mixed_workers + POST /pipeline/classify returning COMPLETED.
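The RED/GREEN behavior of the regression test can be sketched with two stand-in functions. This is an illustration only: `inject_before_fix`, `inject_after_fix`, and `lb_env_has_token` are hypothetical names, the branch layout is inferred from the root-cause description rather than copied from serverless.py, and "test-key" and "app.routes" are placeholder values.

```python
def inject_before_fix(env, endpoint_type, makes_remote_calls, api_key, module_path=None):
    """Assumed pre-fix shape: the token is only considered inside the QB branch."""
    env = dict(env)
    if endpoint_type == "QB":
        if makes_remote_calls and "RUNPOD_API_KEY" not in env:
            env["RUNPOD_API_KEY"] = api_key
    elif endpoint_type == "LB":
        # The LB branch injected only FLASH_MODULE_PATH.
        env["FLASH_MODULE_PATH"] = module_path
    return env


def inject_after_fix(env, endpoint_type, makes_remote_calls, api_key, module_path=None):
    """Assumed post-fix shape: token injection hoisted above the type branches."""
    env = dict(env)
    if makes_remote_calls and "RUNPOD_API_KEY" not in env:
        env["RUNPOD_API_KEY"] = api_key
    if endpoint_type == "LB":
        # LB endpoints still also get FLASH_MODULE_PATH.
        env["FLASH_MODULE_PATH"] = module_path
    return env


def lb_env_has_token(inject):
    """Mirror of the regression check: an LB endpoint with remote calls gets a token."""
    env = inject({}, "LB", True, "test-key", module_path="app.routes")
    return "RUNPOD_API_KEY" in env and env["FLASH_MODULE_PATH"] == "app.routes"


red = lb_env_has_token(inject_before_fix)   # fails against the pre-fix shape
green = lb_env_has_token(inject_after_fix)  # passes once the injection is hoisted
```

The same check run against both shapes is what makes the test a regression guard: it fails only when the token injection is again tied to the QB branch.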