
fix(cli): keep browser GPU opt-in for local renders #830

Closed
miguel-heygen wants to merge 1 commit into main from fix/m1-render-compositor-shift


@miguel-heygen
Collaborator

Problem

Fixes #828.

npx hyperframes render was routing plain local renders through host Chrome GPU capture by default. On Apple Silicon machines where the WebGL probe succeeds, that means the CLI can select ANGLE/Metal for screenshot capture, which matches the reporter's symptom: Studio preview is stable, Intel is stable, but local M1 CLI renders can vertically shift after the 12s transition and later transitions.

What this fixes

  • Keeps no-flag local CLI renders on software/SwiftShader browser capture by default, matching the engine's conservative default.
  • Leaves --browser-gpu as the explicit host-GPU opt-in.
  • Leaves PRODUCER_BROWSER_GPU_MODE=auto available for callers who intentionally want probe-and-fallback behavior.
  • Updates CLI help, CLI package docs, producer docs, and rendering guide docs so users no longer treat browser GPU capture as the default path.

Root cause

The engine default is already browserGpuMode: "software", but the CLI resolver changed no-flag local renders to auto. On Apple Silicon, auto can resolve to hardware when the WebGL probe succeeds, causing Chrome to use the host GPU compositor for frame screenshots. That makes deterministic DOM capture depend on local Chrome/ANGLE/driver behavior, and it explains why the CLI render path can diverge from Studio preview and non-Apple-Silicon machines.

This patch removes the CLI-level override and keeps hardware browser capture opt-in instead of platform-default.
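
As a rough illustration of the intended decision order, here is a minimal sketch. It is not the code in render.ts: the function name `resolveBrowserGpuSketch`, the input shape, and the assumption that Docker always resolves to software are hypothetical. Only the Docker → CLI flag → env → default ordering and the "software" default come from this PR.

```typescript
type BrowserGpuMode = "software" | "hardware" | "auto";

// Hypothetical input shape; the real resolver's signature may differ.
interface ResolveInput {
  inDocker: boolean;        // assumption: Docker renders always use software
  browserGpuFlag?: boolean; // true when --browser-gpu was passed
  envMode?: string;         // raw value of PRODUCER_BROWSER_GPU_MODE
}

function resolveBrowserGpuSketch(input: ResolveInput): BrowserGpuMode {
  // 1. Docker wins over everything else.
  if (input.inDocker) return "software";

  // 2. An explicit CLI flag is the host-GPU opt-in.
  if (input.browserGpuFlag === true) return "hardware";

  // 3. A recognised env value is honoured as-is, including "auto".
  const env = input.envMode;
  if (env === "software" || env === "hardware" || env === "auto") return env;

  // 4. No flag and no env: stay on the conservative default.
  return "software";
}
```

With this ordering, a no-flag local render resolves to "software", and "auto" is only reached when a caller sets it explicitly.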

Verification

Local

  • bunx oxfmt --check packages/cli/src/commands/render.ts packages/cli/src/commands/render.test.ts packages/cli/src/docs/rendering.md docs/packages/cli.mdx docs/packages/producer.mdx docs/guides/rendering.mdx
  • bunx oxlint packages/cli/src/commands/render.ts packages/cli/src/commands/render.test.ts
  • bun run --filter @hyperframes/cli test -- src/commands/render.test.ts
  • bun run --filter @hyperframes/cli typecheck
  • Pre-fix diagnosis on current main with the issue repro entered the hardware route: [hyperframes] browserGpuMode auto → hardware (WebGL probe succeeded).
  • Patched no-flag CLI render completed against the issue repro:
    • KEEP_TEMP=1 PRODUCER_ENABLE_STREAMING_ENCODE=false bun run --filter @hyperframes/cli dev -- render /tmp/hf-828-repro.6EpLFG/repo --workers 1 --quality draft --output /tmp/hf-828-repro.6EpLFG/local-default-software-after-fix.mp4
    • ffprobe verified 1920x1080, 30/1 fps, 30.000000 seconds, 900 frames.
    • Sampled extracted frames at 11.8s, 12.0s, 12.2s, 20.0s, and 29.0s; bright foreground minY stayed at 290/291 on this machine.
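
For readers who want to repeat the minY check, the sketch below shows one way to measure it on a decoded frame. It is not the script used for the numbers above: the function name `brightMinY`, the Rec. 709 luma weights, and the threshold of 200 are assumptions, and the frame is expected as a raw RGBA buffer that has already been decoded by some other tool.

```typescript
// Returns the first row (0-based, from the top) that contains a pixel whose
// luma is at or above the threshold, or -1 if no pixel qualifies.
function brightMinY(
  rgba: Uint8Array,
  width: number,
  height: number,
  threshold = 200, // assumption: "bright" means luma >= 200 on a 0-255 scale
): number {
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      const luma = 0.2126 * rgba[i] + 0.7152 * rgba[i + 1] + 0.0722 * rgba[i + 2];
      if (luma >= threshold) return y;
    }
  }
  return -1;
}
```

A vertical compositor shift would show up as this value changing between frames sampled before and after a transition, while a stable render keeps it within a pixel.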

Browser

  • Used agent-browser against a local proof page backed by frames extracted from the patched render.
  • agent-browser loaded the proof page, clicked frame controls for 12.2s, 20.0s, and 29.0s, and reported no browser page errors.
  • Local proof artifacts:
    • qa-artifacts/issue-828/browser-proof-frame-12-2s.png
    • qa-artifacts/issue-828/browser-proof-frame-20s.png
    • qa-artifacts/issue-828/browser-proof-frame-29s.png
    • qa-artifacts/issue-828/browser-proof-frames-recording.webm

Notes

  • I could not reproduce the exact reporter M1 vertical compositor shift on this local Apple Silicon machine; this host is a newer Apple Silicon model, not the reporter's M1. The root routing issue was still validated: no-flag CLI renders on current main select hardware when the probe succeeds, and this change prevents that route by default.
  • The reporter repro had a malformed GSAP script URL (jsdelivr.net without a scheme). I corrected only the temporary local repro copy to use the CDN URL so the composition would execute during diagnosis.
  • No tracked composition HTML changed, so repo composition npx hyperframes lint / npx hyperframes validate was not applicable.

@mintlify

mintlify Bot commented May 14, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| hyperframes | 🟢 Ready | View Preview | May 14, 2026, 5:18 AM |

Collaborator

@vanceingalls vanceingalls left a comment


Summary: Focused, well-scoped fix for #828. CLI was overriding the engine's conservative browserGpuMode: "software" default with "auto", so on hosts where the WebGL probe succeeded (Apple Silicon in particular) plain npx hyperframes render routed through host Chrome/ANGLE/Metal for screenshot capture — exactly the path the reporter hit. Pulling the CLI default back to "software", keeping --browser-gpu and PRODUCER_BROWSER_GPU_MODE=auto as explicit opt-ins, is the right shape: aligns CLI with engine default, restores cross-platform determinism, and the workaround "render on Intel Mac" reduces to "you no longer need to."

The parity framing in this PR isn't preview-vs-render (Studio preview runs in the user's interactive Chrome, not the engine resolver) — it's render-vs-render across hosts. Same machine, two days apart, two different drivers → two different MP4s. That non-determinism is what this fix actually kills.

Strengths:

  • packages/cli/src/commands/render.ts:745-755 resolver priority order (Docker → CLI flag → env → default) is clean and the explicit overrides are preserved end-to-end.
  • packages/cli/src/commands/render.test.ts:123-136 test matrix locks the new default (software) and verifies env + explicit flags still win — exactly the regression surface that needs pinning.
  • Engine consistency confirmed: packages/engine/src/config.ts:146 DEFAULT_CONFIG.browserGpuMode = "software", and the distributed render paths (packages/producer/src/services/distributed/renderChunk.ts:360, plan.ts:399/453/467) already hard-code "software". The CLI was the lone outlier; this fix removes the asymmetry.

Findings:

important: packages/cli/src/commands/validate.ts:157-158
The sibling CLI command hyperframes validate decides browser GPU mode independently and defaults to "hardware" when PRODUCER_BROWSER_GPU_MODE isn't set to "software":

const browserGpuMode =
  process.env.PRODUCER_BROWSER_GPU_MODE === "software" ? "software" : "hardware";

This is exactly the contract violation that bit render on Apple Silicon — a CLI surface picking hardware over the engine's conservative default. validate doesn't produce the user-visible artifact, so blast radius is smaller (false-negative QA checks if ANGLE renders text differently), but the failure mode is identical: contrast / console-error checks on M1 disagree with what the rendered MP4 actually looks like. Fix: route through resolveBrowserGpuForCli(false, undefined, process.env.PRODUCER_BROWSER_GPU_MODE) so both CLI commands share one decision. Worth a follow-up PR, not a blocker for this one.
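
To make the difference concrete, here is a small sketch. `validateModeAsQuoted` mirrors the ternary quoted above, with the env value passed in instead of read from process.env. `validateModeSoftwareFirst` is a hypothetical software-first variant for comparison only, not the real resolveBrowserGpuForCli.

```typescript
type ValidateGpuMode = "software" | "hardware" | "auto";

// Same logic as the quoted ternary: anything other than "software" is hardware.
function validateModeAsQuoted(envValue: string | undefined): ValidateGpuMode {
  return envValue === "software" ? "software" : "hardware";
}

// Hypothetical variant: hardware and auto only when explicitly requested.
function validateModeSoftwareFirst(envValue: string | undefined): ValidateGpuMode {
  if (envValue === "hardware" || envValue === "auto") return envValue;
  return "software";
}

for (const value of [undefined, "software", "auto", "hardware"]) {
  console.log(
    String(value),
    "quoted:", validateModeAsQuoted(value),
    "software-first:", validateModeSoftwareFirst(value),
  );
}
```

The rows that matter are the unset and "auto" cases: the quoted logic sends both to hardware, whereas a software-first rule keeps the unset case on software.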

important — silent perf regression for users on capable hardware
The PR body documents the routing fix but doesn't call out the user-visible trade-off: someone running npx hyperframes render on an NVIDIA / Apple M3 box used to get hardware-accelerated capture for free; they now get SwiftShader and a materially slower render unless they pass --browser-gpu. Worth a changelog/release-note line and ideally a one-time stderr hint when the user has no flag set ("Tip: pass --browser-gpu if you want host GPU capture; default is software for cross-platform determinism"). Without it, the next bug report is "render got 2x slower after I upgraded."
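
A minimal sketch of what such a hint could look like, assuming "one-time" means once per process. The names `createGpuHint` and `userChoseMode` are hypothetical, and nothing here exists in the CLI today.

```typescript
type Writer = (line: string) => void;

// Returns a function that emits the tip at most once per process.
// The writer defaults to console.error so the hint goes to stderr.
function createGpuHint(write: Writer = console.error) {
  let shown = false;
  return function maybeHint(mode: string, userChoseMode: boolean): boolean {
    // Stay quiet if the user picked a mode via flag or env, if the render is
    // not on the software default, or if the hint was already printed.
    if (shown || userChoseMode || mode !== "software") return false;
    shown = true;
    write(
      "Tip: pass --browser-gpu if you want host GPU capture; " +
        "default is software for cross-platform determinism",
    );
    return true;
  };
}
```

Injecting the writer keeps the hint easy to assert on in render.test.ts-style unit tests without capturing stderr.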

nit: packages/cli/src/commands/render.ts:407-417
The render-plan summary only prints the GPU line when browserGpuMode !== "software". With the new default, the user gets no indication of which capture mode they're on. Consider always printing browser GPU (software, default) / (hardware, forced) / (auto-detect) so the mode is visible without having to know what the default is. Trivial; not blocking.
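
Something along these lines would do it. This is a sketch with a hypothetical helper name, `describeBrowserGpu`, and it uses the label strings suggested above rather than whatever the summary prints today.

```typescript
type SummaryGpuMode = "software" | "hardware" | "auto";

// Maps every mode to a summary label so the line is always printed.
function describeBrowserGpu(mode: SummaryGpuMode): string {
  switch (mode) {
    case "software":
      return "browser GPU (software, default)";
    case "hardware":
      return "browser GPU (hardware, forced)";
    case "auto":
      return "browser GPU (auto-detect)";
  }
}

console.log(describeBrowserGpu("software"));
```

Because the switch covers every member of the union, adding a new mode later would fail to typecheck until a label is added for it.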

nit: packages/cli/src/docs/rendering.md and docs/guides/rendering.mdx
The --browser-gpu table row's Default column now reads software, but the column header in rendering.mdx:127 is Default; software is a value, not a default-state description like the "on locally, off in Docker" text it replaced. Fine as-is, but if you want consistency with the other rows, software (local), software (Docker) makes the asymmetry explicit.

Verdict: Approve. Zero blockers, CI green including Windows render verification, the fix is small and lands on the right seam. Ship it; file the validate.ts consistency cleanup as a follow-up.

— Vai

@miguel-heygen
Collaborator Author

Investigated the code path. The CLI's resolveBrowserGpuForCli() was the only layer overriding the engine's conservative software default to "auto", which resolved to "hardware" on M1 via Metal/ANGLE. SwiftShader is the only fully deterministic capture path — partial flags like --disable-gpu-compositing are unreliable across Chrome versions on M1. Fix looks correct.

+1 to Vai's follow-up note: validate.ts:157 has the same shape of bug.

Development

Successfully merging this pull request may close these issues.

Apple Silicon M1: CLI render has vertical compositor shift after ~12s; Studio preview and Intel render are stable
