fix(jitsu-cli): fetch function list once, not per file #1306
Previously deployFunction() refetched the entire workspace function list on every iteration to resolve slug → id. With N functions in the workspace that's N list calls, each returning N rows including the full code blob — O(N²) load on the console and Postgres, and the visible cause of console CPU spikes during fusionmedialimited/jitsu-functions deploys. Hoist the GET out of the loop into deployFunctions(), build a slug/id lookup once, and pass it down.
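A minimal sketch of the hoisted lookup described above, assuming the list response has already been fetched once by the caller. The type names, buildLookup, resolveExistingId, and planDeploy are hypothetical illustrations, not the actual jitsu-cli code; only the slug-first, then meta.id resolution order comes from this PR.

```typescript
// Hypothetical shapes; the real jitsu-cli types and list payload may differ.
type FunctionEntry = { id: string; slug?: string };
type FunctionLookup = {
  byId: Map<string, FunctionEntry>;
  bySlug: Map<string, FunctionEntry>;
};
type LocalFile = { slug?: string; metaId?: string };

// Build both indexes from a single list response.
// The same entry object is stored in both maps.
function buildLookup(rows: FunctionEntry[]): FunctionLookup {
  const byId = new Map<string, FunctionEntry>();
  const bySlug = new Map<string, FunctionEntry>();
  for (const row of rows) {
    const entry: FunctionEntry = { id: row.id, slug: row.slug };
    byId.set(entry.id, entry);
    if (entry.slug) bySlug.set(entry.slug, entry);
  }
  return { byId, bySlug };
}

// Resolve by slug first, then fall back to meta.id.
function resolveExistingId(lookup: FunctionLookup, slug?: string, metaId?: string): string | undefined {
  const hit = slug ? lookup.bySlug.get(slug) : undefined;
  if (hit) return hit.id;
  return metaId ? lookup.byId.get(metaId)?.id : undefined;
}

// Decide PUT vs POST for every file using the one prebuilt lookup.
// No network call happens inside this loop.
function planDeploy(rows: FunctionEntry[], files: LocalFile[]): { method: "PUT" | "POST"; id?: string }[] {
  const lookup = buildLookup(rows);
  return files.map(file => {
    const id = resolveExistingId(lookup, file.slug, file.metaId);
    return id ? { method: "PUT" as const, id } : { method: "POST" as const };
  });
}
```

With this shape, the per-file work is two map lookups, so the only list request is the one that produced rows.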
Found a few correctness regressions in the provided change range:
- High: Firestore pagination can skip/duplicate documents (bulker/connectors/firebase/firebase.go:284,350)
  - The first batch is fetched with collection.Limit(batchSize) (no deterministic ordering), but subsequent batches switch to OrderBy(firestore.DocumentID).StartAfter(lastDoc.Ref.ID). Using a cursor derived from an unordered page against an ordered query is not stable, so large collections can miss or re-read records.
  - Please keep the ordering consistent from the first page (same OrderBy for every batch) and use the same cursor form across pages.
- Medium: CRON_ENABLED is no longer respected (docker-start-console.sh:14-15)
  - init() now always starts cron, whereas previous behavior honored CRON_ENABLED=0/no/false.
  - This is a user-visible behavior change and can trigger background jobs in environments that explicitly disabled cron.
- Medium: API rate limiting protections were removed without replacement
  - webapps/console/lib/api.ts (rate limiter enforcement block removed in nextJsApiHandler), and webapps/console/pages/api/workspace/index.ts (POST route-specific cap removed).
  - This drops request throttling safeguards, including the explicit workspace-creation cap, and reopens abuse/DoS paths for authenticated traffic.
  - If this is intentional, we need an equivalent protection at another layer before merging.
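The pagination concern in the first finding can be illustrated with a small in-memory sketch. It is written in TypeScript rather than the connector's Go, and it does not use the Firestore client; queryPage is a hypothetical stand-in for an ordered query with a start-after cursor. The point it shows is that every page, including the first, uses the same ordering and the same cursor form.

```typescript
type Doc = { id: string };

// Hypothetical stand-in for an ordered query: returns up to `limit` docs
// whose id sorts strictly after `after`, ordered by id.
function queryPage(docs: Doc[], limit: number, after?: string): Doc[] {
  return [...docs]
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .filter(d => after === undefined || d.id > after)
    .slice(0, limit);
}

// Read the whole collection in batches. The first page goes through the
// same ordered query as later pages, so the cursor is always meaningful.
function readAll(docs: Doc[], batchSize: number): string[] {
  const seen: string[] = [];
  let cursor: string | undefined = undefined;
  for (;;) {
    const page = queryPage(docs, batchSize, cursor);
    if (page.length === 0) break;
    for (const d of page) seen.push(d.id);
    cursor = page[page.length - 1].id; // last id of this page
  }
  return seen;
}
```

Because the ordering never changes between pages, each document is visited exactly once regardless of the order in which the store holds them.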
// Fetch the existing function list once for slug/id resolution. Previously
// every deployFunction() call refetched the whole list (with code blobs) —
// O(N) requests per deploy, each pulling all N rows from the console DB.
const existingFunctions = await fetchExistingFunctions({ host, apikey, workspaceId: workspace.id! });
Caching this list once changes deploy semantics within a single run: after a successful create, the cache is not updated, so later files still see "not found" and can attempt a second POST for the same slug/id.
Before this change, each file re-fetched the list, so the second file switched to PUT. Please update the cache after a successful POST (and on relevant PUT slug changes) to preserve the previous behavior.
Good catch — fixed in d5659c9. Two helpers now mutate the cache on success: cacheAfterCreate (POST) inserts the new {id, slug} into both maps, cacheAfterUpdate (PUT) refreshes the slug index (dropping the old slug → id link when the slug was renamed). The entry is the same object reference in byId and bySlug, so one mutation covers both. A second file with the same slug now resolves to the freshly-created id and switches to PUT, matching the pre-hoist behavior.
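A rough sketch of how the two helpers could look. The names cacheAfterCreate and cacheAfterUpdate come from the reply above; the signatures, the Entry and Cache types, and the handling of an id that is missing from the cache are assumptions for illustration, not the code in d5659c9.

```typescript
// Assumed shapes; the actual types in jitsu-cli may differ.
type Entry = { id: string; slug?: string };
type Cache = { byId: Map<string, Entry>; bySlug: Map<string, Entry> };

// After a successful POST: register the new function in both indexes,
// sharing one entry object between them.
function cacheAfterCreate(cache: Cache, id: string, slug?: string): void {
  const entry: Entry = { id, slug };
  cache.byId.set(id, entry);
  if (slug) cache.bySlug.set(slug, entry);
}

// After a successful PUT: refresh the slug index, dropping the old
// slug -> id link if the slug was renamed.
function cacheAfterUpdate(cache: Cache, id: string, slug?: string): void {
  const entry = cache.byId.get(id);
  if (!entry) {
    // Assumption: treat an unknown id like a fresh insert.
    cacheAfterCreate(cache, id, slug);
    return;
  }
  const oldSlug = entry.slug;
  if (oldSlug && oldSlug !== slug && cache.bySlug.get(oldSlug) === entry) {
    cache.bySlug.delete(oldSlug);
  }
  entry.slug = slug; // visible through byId and bySlug (same object)
  if (slug) cache.bySlug.set(slug, entry);
}
```

In this sketch, a second file with the same slug finds the entry inserted by cacheAfterCreate and therefore resolves to a PUT instead of a duplicate POST.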
Without this, hoisting the GET out of the per-file loop changed
within-run semantics: after a successful POST the cache was stale, so a
later file with the same slug or meta.id would see "not found" and POST
a second time, producing a duplicate function. The pre-hoist code
side-stepped this because each file re-fetched the full list.
Restore the original behavior by mutating the cache on success:
- POST: insert the new {id, slug} into both byId and bySlug.
- PUT: update the entry's slug; if it differs from what we had, drop the
stale bySlug entry so a later file deploying under the old slug
creates a new function instead of clobbering this one.
Entry refs are shared between bySlug and byId (single object identity),
so one mutation covers both maps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
jitsu-cli deploy was putting O(N²) load on the console: for every function being deployed, deployFunction() re-fetched the entire workspace function list (including each function's code blob) just to resolve slug → id. On fusionmedialimited/jitsu-functions (≈63 functions) that's 63 list calls × 63 rows ≈ 4,000 row reads per deploy, on top of the 63 PUTs. It's the visible cause of console pod CPU spikes during deploys.

This PR hoists the GET /api/<ws>/config/function call out of the per-file loop into deployFunctions(), builds slug/id lookup maps once, and passes them down. Requests drop from 2N (N list calls plus N writes) to N + 1, and rows returned by list calls drop from N² to N.

Why this matters
- Deploys to the clpwf99ji0000jp0fi5hl01t8 workspace currently take ~4 min and pin one or two console pods to high CPU for that window (see run 26035361978). The hot path was the redundant list fetch, not the write.
- The list endpoint (pages/api/[workspaceId]/config/[type]/index.ts) returns every config object's full config JSON; for functions that includes the compiled code. Multiplying that by N is wasteful even ignoring DB load.

Behavior preserved
- Lookup order: slug first, fall back to meta.id.
- Failure handling (process.exit(1)).
- deployFunction() keeps an optional default-empty lookup so it stays callable standalone (e.g. from tests).
- No changes to POST/PUT payloads or to profile-builder handling.

Test plan
- pnpm typecheck in cli/jitsu-cli
- pnpm build in cli/jitsu-cli
- pnpm jitsu-cli deploy --workspace … --apikey … against the QA workspace and confirm all functions deploy + console CPU stays flat
- Release the jitsu-cli package so the fusionmedialimited/jitsu-functions CI picks it up

Out of scope (follow-ups worth doing)
- Profile deploys call deployFunction with kind === "profile", but the GET inside still hit /config/function. This looks like a long-standing bug (profiles never resolve their existingFunctionId, so they get POSTed-new every deploy). Preserving for now to keep this PR minimal.
- Add a ?fields=id,slug projection on the list endpoint so even the single hoisted call doesn't pull every function's code.
- Add requests/limits + HPA to jitsu-console (separate PR in jitsu-cloud-infra).
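The load figures quoted in the Summary can be checked with a small calculation. This is a back-of-the-envelope model, assuming the workspace holds N functions and the deploy touches all N of them; the Load type and both function names are hypothetical.

```typescript
type Load = { listCalls: number; writes: number; rowsRead: number };

// Before the change: one list call per deployed function,
// each list call returning all N rows.
function loadBefore(n: number): Load {
  return { listCalls: n, writes: n, rowsRead: n * n };
}

// After the change: a single list call up front, then N writes.
function loadAfter(n: number): Load {
  return { listCalls: 1, writes: n, rowsRead: n };
}

const before = loadBefore(63);
const after = loadAfter(63);
console.log(before.listCalls + before.writes, before.rowsRead); // total requests, rows read
console.log(after.listCalls + after.writes, after.rowsRead);
```

For N = 63 this gives 126 requests and 3,969 rows read before the change, against 64 requests and 63 rows read after it.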