feat(producer): add --mode=lambda-local to the regression harness #913
jrusso1020 wants to merge 12 commits into
Conversation
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
Force-pushed 69ee7ba to f5512ff
Force-pushed c8a3a10 to c0895ef
vanceingalls left a comment
One-line: clean Phase 6b idea but the PR is currently red across required checks — the new module is statically loaded by every harness run, and the tsconfig paths change breaks the CLI bundle.
Calibrated strengths
- `regression-harness-lambda-local.ts:32-37` — the choice to not go through Docker/RIE and instead drive the in-process handler with a filesystem-backed S3 is the right tradeoff for in-CI coverage. The doc-comment explaining why (Docker-in-Docker availability vs. RIE-only smoke) is exactly the kind of design rationale future maintainers need.
- `regression-harness-lambda-local.ts:90-102` — pre-staging the project as a `tar.gz` under the fake-S3 root mirrors prod's `deploySite` path, so the handler's tar extraction logic is actually being exercised. This is what makes the new mode catch event-shape / key-convention regressions that `distributed-simulated` can't.
- `regression-harness-distributed.test.ts:35-37` — the new "error message lists all three accepted modes" assertion is the right pin: it tests the user-visible contract, not just the parsing branch.
Blockers
- `regression-harness.ts:34` statically imports `regression-harness-lambda-local.ts`, which then statically imports `@hyperframes/aws-lambda` at module top-level. This breaks every regression-shard run in CI today, not just `--mode=lambda-local` ones. Evidence — shard-5 (`--mode=in-process`, fixtures `style-4-prod style-11-prod style-2-prod animejs-adapter typegpu-adapter`) fails on this PR with:
  `Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/app/packages/producer/node_modules/@hyperframes/aws-lambda/src/index.ts' imported from /app/packages/producer/src/regression-harness-lambda-local.ts`
  Root cause: `Dockerfile.test` only `COPY`s `packages/{core,engine,player,producer}/` into the image (lines after `COPY package.json bun.lock`). It does NOT copy `packages/aws-lambda/`. `bun install` creates the workspace symlink, but the target dir is absent inside the container, so the static import fails at module load and the harness can't even start.
  Fix shapes: (a) lazy-import `runLambdaLocalRender` and its sibling imports behind the `mode === "lambda-local"` branch (dynamic `await import(...)`), or (b) add `COPY packages/aws-lambda/ packages/aws-lambda/` to `Dockerfile.test`. (a) is cleaner — it keeps the producer test image lean and matches the PR's framing that the dep is harness-only (see the sketch after this list). The unit-test path also needs the lazy import or a vitest mock: the test file at `regression-harness-distributed.test.ts` does NOT import the new module, but the harness module itself does — anyone importing harness types in a test pulls aws-lambda into the graph.
- The new tsconfig `paths` aliases break the CLI bundle. `packages/producer/tsconfig.json:18-21` adds `"@hyperframes/producer": ["./src/index.ts"]` and `"@hyperframes/producer/distributed": ["./src/distributed.ts"]`. This is tsconfig-only — but `packages/cli/tsup.config.ts:60-62` separately sets `options.alias = { "@hyperframes/producer": resolve(__dirname, "../producer/src/index.ts") };`. esbuild then resolves `@hyperframes/producer/distributed` (imported from `packages/aws-lambda/src/sdk/deploySite.ts:20`, now reachable from the CLI graph via the new producer→aws-lambda dep) as `<...>/src/index.ts/distributed` — a path inside a file, which fails. The `Perf: scrub` job ran this on the head SHA and produced:
  `✘ [ERROR] Cannot read directory "../producer/src/index.ts": not a directory`
  `✘ [ERROR] Could not resolve "/home/runner/.../packages/producer/src/index.ts/distributed" (originally "@hyperframes/producer/distributed")`
  Fix: extend the esbuild `options.alias` block in `packages/cli/tsup.config.ts` to mirror the tsconfig pair, e.g. add `"@hyperframes/producer/distributed": resolve(__dirname, "../producer/src/distributed.ts")` before the bare `@hyperframes/producer` entry (esbuild matches longest-prefix; ordering matters for some bundler versions, so it's worth being explicit). The same fix is likely needed wherever else this alias is re-declared (preview server, studio bundler, etc.) — please grep for `"@hyperframes/producer":` across `*tsup*` / `*vite*` / `esbuildOptions` and audit each site. Rule 2: the contract is "the alias map needs both keys"; the trigger is the CLI build, but every bundler that aliased `@hyperframes/producer` has the same precondition now.
- PR test-plan checkbox 4 is unchecked ("End-to-end fixture pass via `docker:test:lambda-local`"), and the head SHA's CI never exercised it either — `regression-shards` doesn't pass `--mode=lambda-local`. The author flags this as "maintainer-run before merging the stack," but that means no in-CI evidence that the new code path renders a single fixture end-to-end when this PR (or the stack it sits on) merges. For a PR whose entire purpose is "catch regressions the existing in-CI suite doesn't," shipping it without one CI shard that actually runs `--mode=lambda-local` defeats the goal. Add at least one shard (or a smoke fixture) on the new mode to the regression matrix before merge, even if it's a fast subset. Otherwise the cost of carrying the new mode is paid (extra dep, bundle complexity, fragile path-alias setup) without the benefit accruing.
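A minimal sketch of fix shape (a), assuming the harness keeps its existing per-fixture dispatch; `dispatchFixture` and the input placeholder are illustrative, not the PR's code:

```ts
// Sketch: regression-harness.ts loads the lambda-local runner only when that mode
// is requested, so a Dockerfile.test image without packages/aws-lambda/ never has
// to resolve @hyperframes/aws-lambda at module load.
async function loadLambdaLocalRender() {
  // Dynamic import keeps the module (and its aws-lambda dep) out of the static graph.
  const mod = await import("./regression-harness-lambda-local.js");
  return mod.runLambdaLocalRender;
}

async function dispatchFixture(options: { mode: string }) {
  if (options.mode === "lambda-local") {
    const runLambdaLocalRender = await loadLambdaLocalRender();
    await runLambdaLocalRender(/* fixture-specific input */);
  }
  // ...existing in-process / distributed-simulated branches unchanged...
}
```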
Important
- `regression-harness-lambda-local.ts:118-119` casts `fakeS3 as unknown as HandlerDeps["s3"]`. That's a sharp edge — the test S3 only implements three commands (`Get`, `Put`, `Head`). When the handler evolves to call `DeleteObject`, `CopyObject`, `ListObjects`, etc., the cast suppresses the type error and the harness fails at runtime with `FakeS3: unexpected command <CmdName>` from the catchall at line 219. Two cheap improvements: (a) make `FilesystemBackedFakeS3` implement an actual interface (e.g. `Pick<S3Client, "send">` typed against the supported commands; see the sketch after this list), and (b) add a positive `parseHarnessModeFlag`-style unit test that round-trips each command against the fake. Otherwise the silent-runtime-only failure mode is exactly the regression class this PR is meant to surface.
- No timeout / no cleanup of `tempRoot` if a fixture throws. `runLambdaLocalRender` writes to `tempRoot/s3` and `tempRoot/lambda-tmp` and never cleans them up locally. The caller in `regression-harness.ts` presumably does, but please verify on the unhappy path (chunk render throws after step B/4 succeeds) — partial-state leftover dirs across fixtures in the same suite can produce a false pass on the next run if the fake S3's `<key>` collides. The same shape applies to `distributed-simulated`, so this isn't new; just worth confirming the cleanup wraps both paths the same way.
- `Config.width`/`height` hardcoded to 1920×1080 with a comment "any positive integer works." That comment claims the composition's `data-width`/`data-height` overrides — but this is a Lambda-event-shape test specifically meant to catch SFN-contract drift. If the handler ever starts honoring `Config.width`/`height` (e.g. for canvas sizing in a regression we haven't shipped yet), the harness will silently render 1920×1080 instead of the fixture's true resolution and pass against the golden — defeating the purpose. Either (a) pull the actual width/height from `suite.meta.renderConfig` and pass them through, or (b) add a static check that the fixture's resolution matches the hard-coded values. Mirroring `distributed-simulated` exactly here is what the comment says, but `distributed-simulated` doesn't go through the event-serialization boundary; this mode does, and that's the whole point.
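One way the suggested typing could look — a sketch rather than the exact `Pick<S3Client, "send">` formulation, assuming AWS SDK v3 command classes; the bodies are placeholders:

```ts
import type {
  GetObjectCommand,
  HeadObjectCommand,
  PutObjectCommand,
  GetObjectCommandOutput,
  HeadObjectCommandOutput,
  PutObjectCommandOutput,
} from "@aws-sdk/client-s3";

// Only the three commands the fake supports are accepted at the type level, so a
// handler change that sends, say, DeleteObjectCommand fails typecheck instead of at runtime.
type SupportedCommand = GetObjectCommand | PutObjectCommand | HeadObjectCommand;
type SupportedOutput = GetObjectCommandOutput | PutObjectCommandOutput | HeadObjectCommandOutput;

interface SupportedS3 {
  send(command: SupportedCommand): Promise<SupportedOutput>;
}

class FilesystemBackedFakeS3 implements SupportedS3 {
  constructor(private readonly rootDir: string) {}

  async send(command: SupportedCommand): Promise<SupportedOutput> {
    // ...dispatch on the concrete command class and read/write under rootDir...
    throw new Error(`FakeS3: unexpected command ${command.constructor.name}`);
  }
}
```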
Nits
- `regression-harness-lambda-local.ts:223` — `export type { Fps };` re-exports a type the file imports but never uses externally. Drop it or use it in the `RunLambdaLocalInput.fps` field type.
- `regression-harness-lambda-local.ts:74-77` — comment says "required-by-type but overridden by composition attrs"; if (a) above lands, this comment goes away.
- The PR title is `feat(producer)` but it also modifies `bun.lock` and adds a dep on `@hyperframes/aws-lambda`; consider `feat(producer,deps)` or noting the dep-graph change in the body's "How" more prominently. Future bisects will appreciate it.
CI status (pulled from check_runs on c0895ef):
- `regression` (required): FAIL — shard-5 and shard-6 both broken by the static-import issue above.
- `player-perf` (required): FAIL — `Perf: scrub` failed at `bun run build` with the tsconfig-paths/esbuild-alias mismatch above.
- `Preflight (lint + format)`: pass.
- `preview-regression`: pass.
Both required-check failures are caused by the changes in this PR, not pre-existing red. They're the blocking findings above.
Verdict: REQUEST CHANGES
Reasoning: Two required CI checks are red, root-caused to this PR (static-import of a workspace package that Dockerfile.test doesn't copy, and a tsconfig-only paths mapping that the CLI's esbuild alias doesn't mirror). Both have small, mechanical fixes; the design itself is sound. Also flagging that the new mode has no in-CI fixture coverage on this PR, which materially weakens the value proposition.
— Vai
Force-pushed c0895ef to 6faac80
Force-pushed fff432d to 8a006ca
Force-pushed 6faac80 to 15289f3
Force-pushed 8a006ca to 5c99590
vanceingalls left a comment
Thanks for the turnaround. Re-reviewing against the three blockers from my prior review (4304557853) on head 15289f37.
Blocker 1 — static import of regression-harness-lambda-local → @hyperframes/aws-lambda: RESOLVED.
regression-harness.ts:35-46 now wraps the load in a loadLambdaLocalRender() helper that does await import("./regression-harness-lambda-local.js") on demand, and the only call site at line 943 sits inside if (options.mode === "lambda-local"). The static-import edge from the harness module is gone, so a producer-only Dockerfile.test image no longer pulls aws-lambda into the load graph when running other modes. The doc-comment at lines 35-40 captures the why (Dockerfile.test doesn't copy packages/aws-lambda/), which is exactly the kind of rationale that prevents this regressing in a future cleanup.
Blocker 2 — CLI bundle broken by tsconfig-only paths alias: RESOLVED.
packages/cli/tsup.config.ts:81-89 now adds the explicit subpath alias:
"@hyperframes/producer/distributed": resolve(__dirname, "../producer/src/distributed.ts"),
ahead of the bare-package alias, with a comment explaining the esbuild prefix-substitution misfire — the exact fix I called out. The pattern is also extended to @hyperframes/aws-lambda/sdk (lines 90-92), which is the same shape and would have bitten on the next subpath import.
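For reference, a sketch of how the resulting alias block might read (the `@hyperframes/aws-lambda/sdk` target path and the entry point are assumptions; the producer aliases follow the review's line references):

```ts
// packages/cli/tsup.config.ts — sketch only; option names follow tsup's esbuildOptions hook.
import { resolve } from "node:path";
import { defineConfig } from "tsup";

export default defineConfig({
  entry: ["src/index.ts"],
  esbuildOptions(options) {
    options.alias = {
      // Subpath aliases must come before the bare package alias: esbuild substitutes
      // prefixes, and "@hyperframes/producer" alone rewrites ".../distributed" to a
      // path inside index.ts.
      "@hyperframes/producer/distributed": resolve(__dirname, "../producer/src/distributed.ts"),
      "@hyperframes/producer": resolve(__dirname, "../producer/src/index.ts"),
      // Same shape for the SDK subpath (target path assumed here).
      "@hyperframes/aws-lambda/sdk": resolve(__dirname, "../aws-lambda/src/sdk/index.ts"),
    };
  },
});
```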
Verified against CI on 15289f37: Perf: scrub, Perf: parity, Perf: drift, Perf: load, Perf: fps all PASS (where the prior head was red on Perf: scrub at bun run build). That's the head-on evidence that the alias chain bundles.
Audit of other alias sites I asked about: packages/cli/tsconfig.json:8-9 and packages/aws-lambda/tsconfig.json:13-14 both already declare both keys; no other tsup/vite bundler in the repo aliases @hyperframes/producer, so the surface is fully covered.
Blocker 3 — no in-CI fixture coverage of --mode=lambda-local: STILL OPEN.
grep -rn "lambda-local\|mode=lambda" .github/ returns nothing on this head. The regression workflow's eight shards still pass --mode=in-process (default) or --mode=distributed-simulated; none run --mode=lambda-local. The only coverage of the new mode in CI is the parser unit test at regression-harness-distributed.test.ts:21-23 ("parses --mode=lambda-local") — which proves the flag is recognized, not that a fixture renders end-to-end through the handler.
This is the same gap I flagged before: the PR's entire value proposition is "catch event-shape / SFN-contract regressions the existing suite can't," but no fixture exercises the new path in CI, so a regression in handler.ts, events.ts, deploySite.ts, or the new harness module itself ships green. The test-plan checkbox "End-to-end fixture pass via docker:test:lambda-local" remains unchecked, and there's no docker:test:lambda-local invocation in the workflow files. Severity: important, not blocking — reasonable people can disagree on whether this needs to land in the same PR or a fast follow-up.
If a fast-follow PR adding one lambda-local shard (even a single smoke fixture) is already planned and tracked, this is fine to merge as-is. If not, I'd push to add it before merge — even a one-fixture shard catches the regression class this whole effort exists to catch. Author's call; flagging so it doesn't get lost.
Important items from the prior review — status check:
- The `fakeS3 as unknown as HandlerDeps["s3"]` cast at `regression-harness-lambda-local.ts:97`: unchanged. The catchall at line 224 (`` throw new Error(`FakeS3: unexpected command ${cmdName}`) ``) still surfaces it as a runtime failure when the handler grows a new command. Not blocking, still worth tightening later — the cheap version is typing `FilesystemBackedFakeS3` against the actual `S3Client["send"]` signature.
- 1920×1080 hardcoding at line 951: unchanged. The doc-comment at `RunLambdaLocalInput.width/height` (lines 65-72) is honest about the limitation. Still flagged as the right kind of debt to log but not block on.
CI status on 15289f37:
- `Perf: scrub`, `Perf: drift`, `Perf: parity`, `Perf: load`, `Perf: fps`: PASS (all previously broken or affected by blocker 2).
- `Preflight (lint + format)`: PASS.
- `Preview parity`: PASS.
- `regression-shards (1–8)`: IN_PROGRESS at review time; the prior failure mode (shard-5 / shard-6 on the static import) is structurally fixed by blocker 1, so I expect green. If they go red on something else, that's a new finding.
- The `Perf: ${{ matrix.shard }}` "fail" entry is a 0s cancelled stub from the matrix-name-only job and isn't a real failure.
Verdict: APPROVE.
The two CI-blocking fixes are exactly what I asked for and verifiably resolved on this head. Blocker 3 (in-CI fixture coverage of the new mode) is the only outstanding item from the original review, and I'm downgrading it to important rather than blocking — but please don't let it drop, since shipping a "catch regressions" feature with no fixture coverage materially weakens the value. A follow-up PR adding one shard is fine.
— Vai
miguel-heygen left a comment
Re-approved. Vai verified both blockers fixed: dynamic import gated on mode, tsup alias for producer/distributed. Regression shards passing. Lambda-local CI coverage deferred as follow-up — acceptable.
The base branch was changed.
Wraps the @hyperframes/aws-lambda SDK + the Phase 6a SAM template behind
a single CLI surface so an end-to-end render is three commands instead
of the ~8 manual bun+sam+aws steps the smoke script does today:
hyperframes lambda deploy
hyperframes lambda render ./my-project --width 1920 --height 1080 --wait
hyperframes lambda destroy
Subcommands:
- deploy: build handler.zip + sam-deploy + persist stack outputs
to <cwd>/.hyperframes/lambda-stack-<name>.json
- sites create: pre-upload a project to S3 with a stable content hash
so re-renders skip the tar+PUT pass
- render: start a Step Functions execution; --wait blocks and
streams per-chunk progress + accrued cost
- progress: one-shot snapshot — status, frames, cost breakdown,
errors. Accepts renderId or executionArn
- destroy: sam-delete + drop the local state file (S3 bucket
is Retain'd by the template; documented in --help
and in docs/packages/cli.mdx)
To keep @sparticuz/chromium out of the CLI's transitive deps, this also
adds a dedicated ./sdk subpath export to @hyperframes/aws-lambda; the
CLI imports from @hyperframes/aws-lambda/sdk exclusively. The existing
. barrel still re-exports both handler + SDK for adopters who want one
entry point.
Defaults are deliberately cost-conservative for first-time users:
--concurrency=8 (low enough to never surprise) and --memory=10240 (the
common case; documented for adopters who want to tune down).
Tests: 5 unit tests on the state-file round-trip. CLI integration
against sam local invoke is part of the upcoming PR 6.6 (lambda-local
regression harness).
Two small cleanups on top of the lambda CLI:
- Replace parseFormat / parseCodec / parseQuality / parseChromeSource
(four near-identical helpers) with a single generic parseEnum() +
typed const-tuple lookups. The four callers now read as one-line
arrow functions that lift the allowed values out of the function
body so they're easy to extend (a minimal sketch follows this list).
- DEFAULT_STACK_NAME was const-declared then re-exported at the
bottom of state.ts; just mark the const export inline.
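A minimal sketch of the shape described in the first item; the flag name, allowed values, and error text are illustrative rather than copied from the CLI:

```ts
// Sketch of the generic helper: one validator, callers supply a typed const tuple.
function parseEnum<T extends string>(flag: string, allowed: readonly T[], value: string): T {
  if ((allowed as readonly string[]).includes(value)) {
    return value as T;
  }
  throw new Error(`Invalid value for ${flag}: "${value}". Allowed: ${allowed.join(", ")}`);
}

// Callers collapse to one-liners with the allowed values lifted out of the function body:
const FORMATS = ["mp4", "mov", "png-sequence"] as const;
const parseFormat = (value: string) => parseEnum("--format", FORMATS, value);
```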
No behavior changes. All CLI tests still pass.
esbuild can't bundle @hyperframes/aws-lambda's transitive AWS SDK
deps (@aws-sdk/* + @smithy/*) cleanly into a node binary — the
SDK's .browser.js conditional re-exports break the resolver:
ESM Build failed
No matching export in "splitStream.browser.js" for import
"splitStream" (and ~10 similar errors)
Mark aws-lambda as `external` so esbuild doesn't follow it, and
move it from devDependencies to dependencies so the published CLI
can resolve it from node_modules at runtime. The lambda subverb
files dynamic-import only on `hyperframes lambda *` invocation, so
the CLI cold-start cost is unchanged.
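A minimal sketch of that tsup change; the entry point and everything besides `external` are placeholders:

```ts
// Sketch only: tell esbuild (via tsup) not to follow @hyperframes/aws-lambda into
// the bundle; the published CLI resolves it from node_modules at runtime instead.
import { defineConfig } from "tsup";

export default defineConfig({
  entry: ["src/index.ts"],
  external: ["@hyperframes/aws-lambda"],
});
```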
The install-size hit (AWS SDK + @sparticuz/chromium ≈ 200 MiB) is
documented as a v1 tradeoff; a future split into a lambda-sdk-only
subpackage can pare this back.
Two blockers + four important items from Vai's review:
- `--memory` was parsed and recorded in the local state file but
never forwarded to `sam deploy` as a parameter override. Worse,
`progress.ts` then read the *recorded* value for cost math, so
`--memory 5120` produced wrong cost numbers downstream. Thread
`LambdaMemoryMb` through samDeploy's --parameter-overrides.
- `--profile` was only consumed by deploy / destroy. render and
progress fell back to the default credentials chain — a user
with `--profile prod` would silently render against their
default account (wrong-account billing footgun). Set
`process.env.AWS_PROFILE` (and `AWS_REGION`) in the dispatcher
before any subverb runs; the AWS SDK reads them natively, so
render / progress / sites all benefit without each subverb
threading the flag through the SDK call.
- `--profile` + destroy now also reads `process.env.AWS_PROFILE`
as a fallback (matching deploy's existing env fallback).
- `--wait --json` printed both the start handle AND the final
progress snapshot, producing two concatenated JSON blobs that
`jq` rejected. Now emits a single document: handle (without
--wait) OR final progress (with --wait).
- Negative integers on `--width` / `--height` / `--chunk-size` /
`--max-parallel-chunks` / `--memory` / `--concurrency` now fail
loudly via a new `parsePositiveInt` wrapper instead of flowing
into the SDK and producing opaque AWS validation errors mid-
render. (A minimal sketch of the wrapper follows this list.)
- `DEFAULT_STACK_NAME` is now centralized to the literal
`"hyperframes-default"` and consumed from one place. Previously
the value was assembled as `hyperframes-${"default"}` in three
sites and hardcoded as `"hyperframes-default"` in a fourth.
`requireStack`'s hint now matches the dispatcher's default.
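A minimal sketch of the wrapper named above; the exact error wording is an assumption:

```ts
// Sketch: fail loudly at parse time instead of surfacing an opaque AWS validation
// error mid-render.
function parsePositiveInt(flag: string, raw: string): number {
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${flag} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// e.g. const width = parsePositiveInt("--width", rawWidth);
```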
The faked `SiteHandle` for `--site-id` keeps the documented
placeholder fields but also surfaces `bucketName` (from PR 909's
extended SiteHandle interface), matching the SDK contract.
All CLI unit tests + the full bundler build still pass.
Force-pushed 15289f3 to 7c1e316
miguel-heygen left a comment
Re-approve after rebase. Diff verified unchanged.
vanceingalls left a comment
Re-approve after rebase onto main. Force-push dismissed my prior --approve (require_last_push_approval: true) — content unchanged, same commits replayed on the new base. All findings from the prior review's resolution still apply.
Re-review by Vai (post-rebase re-stamp)
The "Smoke: global install" CI step packs the CLI via `npm pack` and
installs it globally via `npm install -g <tgz>`. npm doesn't understand
the workspace: protocol, so a runtime `dependencies` entry of
`@hyperframes/aws-lambda: workspace:*` blows up with:
npm error code EUNSUPPORTEDPROTOCOL
npm error Unsupported URL Type "workspace:": workspace:*
(pnpm rewrites workspace:* on publish; npm pack doesn't.)
Three changes to unblock the smoke + keep the published CLI install
small for users who don't deploy to Lambda:
- Move `@hyperframes/aws-lambda` from CLI's `dependencies` back to
`devDependencies`. It's already external in tsup.config.ts; the
bundle references it via runtime resolution only.
- Convert the static `import { … } from "@hyperframes/aws-lambda/sdk"`
in sites.ts / render.ts / progress.ts to `await import()` inside
each function. tsup with `splitting: false` was inlining those
static imports at the top of the bundle, which made Node eagerly
resolve them at CLI startup (MODULE_NOT_FOUND before any lambda
subcommand even runs). Dynamic imports stay dynamic in the bundle.
- Add a friendly missing-module check in the lambda dispatcher.
When a user runs `hyperframes lambda deploy / render / sites /
progress / destroy` without aws-lambda installed, they now see:
@hyperframes/aws-lambda is not installed.
The `hyperframes lambda deploy` command needs it at runtime.
Install it alongside the CLI:
npm install -g @hyperframes/aws-lambda
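A minimal sketch of that dispatcher-side guard, assuming the `@hyperframes/aws-lambda/sdk` subpath from earlier in the stack; the helper name and structure are illustrative:

```ts
// Sketch only: the dynamic import keeps the SDK out of the CLI's startup module
// graph, and a failed resolution yields the friendly message instead of a raw
// MODULE_NOT_FOUND stack trace.
async function loadLambdaSdk(subcommand: string) {
  try {
    return await import("@hyperframes/aws-lambda/sdk");
  } catch {
    throw new Error(
      [
        "@hyperframes/aws-lambda is not installed.",
        `The \`hyperframes lambda ${subcommand}\` command needs it at runtime.`,
        "Install it alongside the CLI:",
        "  npm install -g @hyperframes/aws-lambda",
      ].join("\n"),
    );
  }
}
```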
Verified locally: pack + global install + `hyperframes init --example
blank` now succeeds end-to-end (was the same scenario the CI smoke job
runs).
IAM bootstrap subcommand for the lambda CLI. Closes the "first run hits
'User is not authorized to perform iam:CreateRole'" gap that adopters
otherwise have to figure out by hand.
hyperframes lambda policies user
→ prints an inline-policy doc to attach to the IAM user that runs
the CLI
hyperframes lambda policies role --principal=cloudformation
→ prints { TrustRelationship, InlinePolicy } for a service role
cloudformation can assume
hyperframes lambda policies validate ./infra/policy.json
→ diffs a checked-in policy against the CLI's required action set,
expanding s3:* / s3:Get* / * wildcards, exits non-zero on missing
actions (wire it into CI to catch drift before deploys fail)
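A minimal sketch of the wildcard coverage check, assuming only end-anchored `*` patterns are expanded and anything fancier is surfaced as a warning by the caller; function names are illustrative:

```ts
// Does a granted action pattern cover a required action?
// Only end-anchored wildcards ("*", "s3:*", "s3:Get*") are expanded here; mid-string
// "*" or "?" patterns fall through to exact comparison and are warned about elsewhere.
function actionCovers(granted: string, required: string): boolean {
  if (granted === "*") return true;
  if (granted.endsWith("*") && !granted.slice(0, -1).includes("*") && !granted.includes("?")) {
    return required.toLowerCase().startsWith(granted.slice(0, -1).toLowerCase());
  }
  return granted.toLowerCase() === required.toLowerCase();
}

// Missing actions are the required actions not covered by any granted pattern.
function missingActions(granted: string[], required: string[]): string[] {
  return required.filter((action) => !granted.some((g) => actionCovers(g, action)));
}
```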
The required-actions list is derived from what the SAM template at
examples/aws-lambda/template.yaml needs to create plus what
renderToLambda/getRenderProgress call against S3 + Step Functions at
runtime. Sorted alphabetically per-service so diffs stay readable.
Resource is "*" by design — CloudFormation creates new function /
state-machine / bucket ARNs on every adopter's first deploy. The
generated policy is documented as a starting point; adopters with
stricter postures narrow Resource to the deployed ARNs after the
first successful run.
Tests: 10 unit tests covering the action set, doc shape, trust policy
service principal, and validate() against valid / missing / wildcard /
single-Statement / Deny-statement inputs.
Adds a typed TrustPolicyDocument / TrustPolicyStatement pair so
buildRoleTrustPolicy can return a real type instead of unknown. The
trust-policy shape has a Principal field that the generic PolicyStatement
doesn't model; previously this was punted by returning `unknown` rather
than adding a parallel type.
Test cleanup: drop the `as {...}` casts that the previous return-
unknown signature forced.
One blocker + four important items from Vai's review:
- REQUIRED_ACTIONS was missing `s3:ListAllMyBuckets` (called by
`sam deploy --resolve-s3` on first run to discover/create the
`aws-sam-cli-managed-default-*` artifact bucket) and
`cloudformation:ValidateTemplate` (CFN template validation
during change-set creation). Without these, a first-deploy
adopter with the generated policy hits AccessDenied on the
very call the PR was meant to unblock. Added both.
- `policies role --principal=lambda` was a footgun — it produced
a `lambda.amazonaws.com` trust paired with the full deploy
superset, i.e. a confusingly-overscoped Lambda execution role
no human should attach (the SAM template creates its own
scoped execution role automatically). Dropped `lambda` as a
principal option; `policies role` now always emits a
CloudFormation service-role doc.
- `validatePolicy` silently misreported NotAction/NotResource
statements (treating them as zero grants), producing false
negatives. Detect both shapes and surface them via a new
`warnings: string[]` field; NotAction statements are skipped
(rather than producing a false negative), NotResource is
treated as full action grant + a warning.
- Mid-string wildcards (`s3:Get*Object`, `?`) silently failed
the matcher. End-anchored wildcards still work; mid-string
patterns now warn so users know the validator can't expand
them.
- Dropped the dead `samArtifactBucket` action group (fully
subsumed by `s3Bucket` + `s3Object`).
- `validate --json` now wraps errors in a friendly envelope
(`{ ok: false, error: "..." }`) so CI consumers have one
parse shape regardless of failure mode.
- lambda.ts subcommand description and examples updated to
include `policies`.
Tests: 5 new negative-path tests cover NotAction warning,
NotResource warning, mid-string wildcard warning, missing file
(ENOENT), malformed JSON (SyntaxError), and absent Statement
field. All 21 policies tests pass.
Third harness mode that drives the OSS @hyperframes/aws-lambda handler
through the exact event sequence Step Functions produces in
production:
handler({Action: "plan"}) → planDir tarball on fake S3
handler({Action: "renderChunk"}) × N → chunk artifacts on fake S3
handler({Action: "assemble"}) → final mp4/mov/png-sequence
The S3 client is a filesystem-backed fake (every s3://<bucket>/<key>
URI maps to <tempRoot>/s3/<key>), so the harness exercises the
handler's event-parsing + tar/S3 conventions + dispatch logic on top
of the underlying producer primitives. Regressions in event JSON
shape, S3 key layout, or plan-hash boundary checks now surface in
the same CI run as the in-process and distributed-simulated modes
without paying for a real AWS round-trip.
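A minimal sketch of the URI-to-path idea; the class name and Put/Get surface here are illustrative (the real harness fakes the S3 client's send() instead):

```ts
import { mkdir, readFile, writeFile } from "node:fs/promises";
import { dirname, join } from "node:path";

// Every s3://<bucket>/<key> maps onto <tempRoot>/s3/<key>, so "uploads" and
// "downloads" are plain filesystem writes/reads that still exercise the
// handler's key conventions.
class FakeS3Paths {
  constructor(private readonly tempRoot: string) {}

  private pathFor(key: string): string {
    return join(this.tempRoot, "s3", key);
  }

  async put(key: string, body: Buffer | string): Promise<void> {
    const target = this.pathFor(key);
    await mkdir(dirname(target), { recursive: true });
    await writeFile(target, body);
  }

  async get(key: string): Promise<Buffer> {
    return readFile(this.pathFor(key));
  }
}
```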
Deliberately NOT a Docker/RIE invocation — that would gate the
producer test suite on Docker-in-Docker support which most CI
runners lack. Real-ZIP-via-RIE tests live in
packages/aws-lambda/scripts/ (probe:beginframe) and the
maintainer-run smoke.sh.
Wired up via:
- HarnessMode union extended to include "lambda-local"
- parseHarnessModeFlag accepts --mode=lambda-local
- regression-harness.ts dispatches to runLambdaLocalRender for
the new mode, sharing the distributed-support gate +
pathology-floor threshold with distributed-simulated mode
- package.json scripts: test:lambda-local + docker:test:lambda-local
- producer.devDependencies += @hyperframes/aws-lambda (workspace)
- producer/tsconfig.json gains path mappings to self so the type
cycle through aws-lambda's source resolves at typecheck time
without needing producer to be pre-built
Tests: 3 new unit tests on parseHarnessModeFlag + resolveMinPsnrForMode
cover the new mode. End-to-end PSNR contract still runs through
Dockerfile.test (manual + CI).
Three small cleanups on top of the lambda-local harness:
- Drop the unused createReadStream import + its `void` workaround
comment. The aws-lambda handler's tar / S3 transport pulls
createReadStream from its own imports; this file never references
it directly.
- Hoist the dynamic `await import("node:fs")` calls for
writeFileSync out of FilesystemBackedFakeS3.send into the static
import block. Repeated PutObject calls don't need to repay the
dynamic-import cost.
- Hoist the dynamic `await import("@hyperframes/aws-lambda")` call
for untarDirectory similarly. Drops the now-redundant duplicate
aws-lambda import statement.
The PutObject body branch also collapses: `body instanceof Buffer`
and `typeof body === "string"` both call writeFileSync identically,
so they share one branch.
No behavior changes.
The static import of regression-harness-lambda-local.ts pulled @hyperframes/aws-lambda (and its @aws-sdk/* + @sparticuz/chromium transitive deps) at module-load time. Dockerfile.test only copies the producer's own files into the container, so aws-lambda's src isn't present at runtime — and even `--mode=in-process` failed:
Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/app/packages/producer/node_modules/@hyperframes/aws-lambda/src/index.ts' imported from /app/packages/producer/src/regression-harness-lambda-local.ts
Load the module on demand instead. `--mode=lambda-local` callers pay the import cost; the existing in-process and distributed-simulated modes don't.
Three review items from Vai:
- `Config.width`/`Config.height` are now plumbed through
RunLambdaLocalInput rather than hardcoded inside
runLambdaLocalRender. Lambda-local's whole point is to catch
event-shape drift; if the handler ever starts honouring
Config.width/height (e.g. for canvas sizing), having those
values flow from the caller means the harness sees what the
fixture authored. The interface change makes the eventual
upgrade-to-real-fixture-resolution a one-line dispatch swap.
- Drop the dead `export type { Fps }` and its unused import
from @hyperframes/core. The module never re-exports it.
- The dispatch site in regression-harness.ts now passes 1920×1080
explicitly with a comment marking it as a placeholder until
the harness compiles the composition HTML up-front to surface
the authored data-width/data-height. distributed-simulated
mode uses the same placeholder internally, kept for parity.
No behavior change in the existing modes; lambda-local now has a
clear extension point for honouring fixture dimensions.
Force-pushed 7c1e316 to e36eab9
What
Adds `--mode=lambda-local` to the producer regression harness. Drives the OSS `@hyperframes/aws-lambda` handler through the exact event sequence Step Functions produces in production, with a filesystem-backed fake S3 standing in for the real bucket.
The harness now has three modes:
- `in-process` (default) — `executeRenderJob`; the same path the `hyperframes render` CLI takes
- `distributed-simulated` — `plan` → `renderChunk` × N → `assemble` primitives directly
- `lambda-local` (new) — `@hyperframes/aws-lambda` handler dispatch through plan / renderChunk / assemble events, with filesystem-backed fake S3
Why
Per `DISTRIBUTED-RENDERING-PLAN.md` § 11 Phase 6b, this PR delivers PR 6.6: in-CI coverage that catches regressions in the event-JSON shape, S3 key conventions, and plan-hash boundary checks that `distributed-simulated` doesn't touch. Today those bugs only surface when the maintainer-run real-AWS smoke (`smoke.sh`) runs — and `smoke.sh` costs real money + AWS quota.
How
- `HarnessMode` union extended to include `"lambda-local"`.
- `parseHarnessModeFlag` accepts `--mode=lambda-local`; the error message lists all three modes.
- `regression-harness.ts` dispatches the new mode through `runLambdaLocalRender`. It shares the existing `checkDistributedSupport` gate (webm / NTSC / HDR rejections skip cleanly) and the `DISTRIBUTED_SIMULATED_MIN_PSNR_DB` pathology floor with `distributed-simulated`.
- `packages/producer/src/regression-harness-lambda-local.ts` — wires the `handler` import to a `FilesystemBackedFakeS3` and walks plan / renderChunk / assemble in sequence (a sketch of that event drive follows this list). It pre-stages the project as a `tar.gz` under the fake-S3 root (mirroring what `deploySite` does in prod) and copies the final output back out to the path the harness expects.
- producer `devDependencies` gains the `@hyperframes/aws-lambda` workspace package. The runtime producer code does not import aws-lambda — the dep is harness-only.
- producer `tsconfig.json` gains path mappings to itself (`@hyperframes/producer` → `./src/index.ts`, `@hyperframes/producer/distributed` → `./src/distributed.ts`) so the type cycle through aws-lambda's source resolves at typecheck time without producer being pre-built.
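A minimal sketch of that event drive, assuming the `{Action: ...}` event names from the commit messages above; the cast, the `chunkIndex` field, and the elided payload shapes are illustrative:

```ts
// Sketch only: walks the handler through the same three-step sequence Step Functions
// drives in production, against the filesystem-backed fake S3.
import { handler } from "@hyperframes/aws-lambda";

type LambdaLocalEvent = { Action: "plan" | "renderChunk" | "assemble"; [key: string]: unknown };

// Cast because this sketch doesn't model the handler's real event/deps types.
const invoke = handler as unknown as (event: LambdaLocalEvent) => Promise<unknown>;

async function driveLambdaLocal(chunkCount: number) {
  // 1. Plan: writes the planDir tarball to the fake S3.
  await invoke({ Action: "plan" });

  // 2. Render each chunk: every call writes a chunk artifact to the fake S3.
  for (let chunk = 0; chunk < chunkCount; chunk++) {
    await invoke({ Action: "renderChunk", chunkIndex: chunk });
  }

  // 3. Assemble: stitches the chunk artifacts into the final mp4 / mov / png-sequence.
  await invoke({ Action: "assemble" });
}
```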
Deliberate non-choice: `lambda-local` is NOT a Docker / RIE invocation. That would gate the producer test suite on Docker-in-Docker support, which most CI runners lack. Real-ZIP-via-RIE tests live in `packages/aws-lambda/scripts/probe-beginframe*` and the maintainer-run `smoke.sh`.
Test plan
- 3 new unit tests in `regression-harness-distributed.test.ts` covering the new mode in `parseHarnessModeFlag` + `resolveMinPsnrForMode`
- End-to-end fixture pass via `bun run --cwd packages/producer docker:test:lambda-local` — runs alongside the existing modes in Dockerfile.test (maintainer-run before merging the stack; the implementation matches `distributed-simulated`'s already-passing path)
Stacks on #909, #910, and #912.
🤖 Generated with Claude Code