feat(webgl): detect GPU out-of-memory and emit an outOfMemory event #54

Merged: chiefcll merged 1 commit into main from feat/gpu-out-of-memory-event on Jun 4, 2026

@chiefcll chiefcll commented Jun 4, 2026

Problem

On image-heavy pages, some devices log a flood of "Could not create WebGL Texture" messages (one device at a time). The root cause isn't createTexture running out of memory, because createTexture only allocates a handle. It's a lost context: a GPU out-of-memory in texImage2D escalates to context loss, after which every gl.createTexture() returns null, producing the flood.

The renderer had no way to detect the originating GPU out-of-memory. WebGL only exposes it via gl.getError() (GL_OUT_OF_MEMORY), which this engine deliberately checks only in dev builds because getError() forces a CPU/GPU sync. So in production the OOM was invisible until it had already become a dead context.

What this adds

A GPU out-of-memory probe surfaced as an application event, so the app can recover (e.g. reload with a lower criticalThreshold).

  • WebGlRenderer.checkForOutOfMemory() — drains the GL error queue (bounded to keep the sync cost fixed) and returns whether GL_OUT_OF_MEMORY was seen. checkForOutOfMemory is now a required CoreRenderer method; CanvasRenderer returns false (no GPU OOM signal on Canvas2D).
  • Runs at the idle transition, not every frame. GL errors accumulate and persist until drained, so a single check when the render burst settles still catches any OOM raised during the active frames — without paying the getError() sync per frame. (Tradeoff: an app that never goes idle won't fire the probe until it settles; the error stays queued until then.)
  • TextureMemoryManager.handleOutOfMemory() queues an outOfMemory frame event ({ memUsed, criticalThreshold }) and requests a best-effort cleanup of non-renderable textures.
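
As a rough illustration of the bounded drain described in the first bullet, the sketch below calls getError() at most a fixed number of times and reports whether OUT_OF_MEMORY appeared. It is not the code from this PR: GlErrorSource, MAX_ERRORS_TO_DRAIN (and its value of 8), and fakeGl are assumptions made for the example. Only getError(), NO_ERROR, and OUT_OF_MEMORY are real WebGL names.

```typescript
// Minimal subset of a WebGL context that the probe needs. A real
// WebGLRenderingContext satisfies this interface structurally.
interface GlErrorSource {
  readonly NO_ERROR: number;
  readonly OUT_OF_MEMORY: number;
  getError(): number;
}

// Hypothetical bound: the PR says the drain is bounded but does not state the limit.
const MAX_ERRORS_TO_DRAIN = 8;

// Drain queued GL errors (up to the bound) and report whether any was OUT_OF_MEMORY.
function checkForOutOfMemory(gl: GlErrorSource): boolean {
  let sawOutOfMemory = false;
  for (let i = 0; i < MAX_ERRORS_TO_DRAIN; i++) {
    const err = gl.getError();
    if (err === gl.NO_ERROR) break; // queue is empty
    if (err === gl.OUT_OF_MEMORY) sawOutOfMemory = true;
  }
  return sawOutOfMemory;
}

// Fake context for trying the probe without a GPU (assumption, for illustration only).
function fakeGl(queuedErrors: number[]): GlErrorSource {
  const pending = [...queuedErrors];
  return {
    NO_ERROR: 0,
    OUT_OF_MEMORY: 0x0505,
    getError: () => pending.shift() ?? 0,
  };
}

// Example: an INVALID_OPERATION (0x0502) followed by an OUT_OF_MEMORY.
const sawOom = checkForOutOfMemory(fakeGl([0x0502, 0x0505]));
console.log(sawOom);
```

Because the loop stops at the bound, errors beyond it stay queued and are picked up by the next probe, which keeps the worst-case sync cost of a single check fixed.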

Policy lives in the app, not the renderer

The renderer only detects and reports. Persisting/lowering the threshold and reloading is application policy. RendererMainOutOfMemoryEvent carries a full recommended-integration snippet in its docs:

  • Read a calibrated criticalThreshold from localStorage on startup, namespaced per app (important for file:// TV deployments where the origin is null/opaque and a bare key collides across apps).
  • On outOfMemory, lower to 90% of the measured ceiling (min(memUsed, criticalThreshold)) with a floor, persist, and location.reload().

memUsed at the time of the failure is a measured ceiling — the real GPU budget is at or below it — which is why it's a good basis for the next threshold.
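
The calibration policy above can be sketched as a few small functions. This is an app-side illustration under stated assumptions, not the snippet from the RendererMainOutOfMemoryEvent docs: the key format, the floor value, and the names KeyValueStore, storageKey, loadThreshold, nextCriticalThreshold, and onOutOfMemory are all hypothetical. In a browser, window.localStorage would be passed as the store and location.reload() as the reload callback.

```typescript
// Payload shape taken from the PR text: { memUsed, criticalThreshold }.
interface OutOfMemoryPayload {
  memUsed: number;
  criticalThreshold: number;
}

// The subset of the Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical floor so repeated OOMs cannot shrink the budget to nothing.
const FLOOR_BYTES = 50 * 1024 * 1024;

// Namespace the key per app, since file:// deployments can share an opaque origin.
function storageKey(appId: string): string {
  return `${appId}:criticalThreshold`;
}

// Read the calibrated threshold on startup, falling back when absent or invalid.
function loadThreshold(store: KeyValueStore, appId: string, fallback: number): number {
  const raw = store.getItem(storageKey(appId));
  const parsed = raw === null ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// 90% of the measured ceiling min(memUsed, criticalThreshold), clamped to the floor.
function nextCriticalThreshold(p: OutOfMemoryPayload, floor: number = FLOOR_BYTES): number {
  const ceiling = Math.min(p.memUsed, p.criticalThreshold);
  return Math.max(floor, Math.floor((ceiling * 9) / 10));
}

// Lower, persist, then reload. The reload callback is injected to keep this testable.
function onOutOfMemory(
  store: KeyValueStore,
  appId: string,
  payload: OutOfMemoryPayload,
  reload: () => void,
): void {
  store.setItem(storageKey(appId), String(nextCriticalThreshold(payload)));
  reload();
}
```

Taking the minimum of memUsed and criticalThreshold means the threshold can only move down on a failure, even if memUsed overshot the configured threshold before the OOM was detected.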

Reviewer notes

  • emit invokes listeners as (target, data); the new event docs and examples use that signature.
  • No persistence/localStorage code in the renderer itself — intentionally pushed to the app.
  • checkForOutOfMemory being abstract means a future backend that forgets to implement it is a build error, not a silent skip.
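
To make the first and third notes concrete, here is a reduced sketch of an abstract method acting as a compile-time guard, plus a listener called as (target, data). The class bodies are simplified stand-ins rather than the engine's real classes, and TinyEmitter is a hypothetical emitter written for this example.

```typescript
// Payload shape taken from the PR text.
interface OutOfMemoryData {
  memUsed: number;
  criticalThreshold: number;
}

// Listener shape from the reviewer notes: the emitter comes first, the payload second.
type Listener<T> = (target: unknown, data: T) => void;

// Hypothetical stand-in for the engine's event emitter.
class TinyEmitter {
  private listeners: Listener<OutOfMemoryData>[] = [];

  on(listener: Listener<OutOfMemoryData>): void {
    this.listeners.push(listener);
  }

  emit(data: OutOfMemoryData): void {
    for (const listener of this.listeners) listener(this, data);
  }
}

// Simplified base class: every backend must answer the OOM question.
abstract class CoreRenderer {
  abstract checkForOutOfMemory(): boolean;
}

// Canvas2D exposes no GPU OOM signal, so it always reports false.
class CanvasRenderer extends CoreRenderer {
  checkForOutOfMemory(): boolean {
    return false;
  }
}

// Uncommenting the next line fails to compile, because a non-abstract
// subclass must implement every inherited abstract member:
// class ForgetfulRenderer extends CoreRenderer {}

const emitter = new TinyEmitter();
emitter.on((_target, data) => {
  console.log(`OOM at ${data.memUsed} bytes (threshold ${data.criticalThreshold})`);
});
emitter.emit({ memUsed: 1000, criticalThreshold: 2000 });
```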

Tests

  • TextureMemoryManager.test.ts: outOfMemory event payload, cleanup request, threshold left unchanged, fresh estimate per fire.
  • WebPlatform.outOfMemory.test.ts: probe fires once at the idle transition, handles OOM when reported, and does not probe on an active frame.
  • Full suite: 262 passing. Build, prettier, eslint clean.
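
The idle-transition behavior these tests describe can be sketched as a small state machine. This is a simplified model rather than the WebPlatform loop itself: IdleTransitionProbe, onFrame, and the hasWork flag are hypothetical names chosen for the example.

```typescript
// The two collaborators named in the PR, reduced to the one method each that matters here.
interface OomProbe {
  checkForOutOfMemory(): boolean;
}

interface OomHandler {
  handleOutOfMemory(): void;
}

// Probes once when a render burst settles (active -> idle), never on active frames.
class IdleTransitionProbe {
  private active = false;

  constructor(
    private readonly renderer: OomProbe,
    private readonly handler: OomHandler,
  ) {}

  // Call once per loop tick; hasWork is true when the frame rendered something.
  onFrame(hasWork: boolean): void {
    if (hasWork) {
      this.active = true; // still in a burst: skip the getError() sync
      return;
    }
    if (!this.active) return; // already idle: nothing new to check
    this.active = false; // transition active -> idle: probe exactly once
    if (this.renderer.checkForOutOfMemory()) {
      this.handler.handleOutOfMemory();
    }
  }
}
```

Because GL errors stay queued until drained, the single probe at the end of a burst still sees an OOM raised on any of the preceding active frames.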

🤖 Generated with Claude Code

Some devices log a flood of "Could not create WebGL Texture" on
image-heavy pages: a texImage2D OOM escalates to a lost context, after
which every gl.createTexture() returns null. The renderer had no way to
detect the originating GPU out-of-memory — getError() is only checked in
dev builds because it forces a CPU/GPU sync.

Add an OOM probe that runs once per render burst and surface it as an
application event so the app can recover (e.g. reload with a lower
criticalThreshold):

- WebGlRenderer.checkForOutOfMemory() drains the GL error queue (bounded)
  and reports whether GL_OUT_OF_MEMORY was seen. checkForOutOfMemory is
  now a required CoreRenderer method; CanvasRenderer returns false.
- The probe runs at the idle transition (end of a render burst), not
  every frame. GL errors accumulate and persist until drained, so a
  single check still catches any OOM raised during the active frames
  without paying the getError() sync per frame.
- TextureMemoryManager.handleOutOfMemory() queues an `outOfMemory` frame
  event ({ memUsed, criticalThreshold }) and requests a best-effort
  cleanup. Persistence/threshold-lowering/reload is left to the app;
  RendererMainOutOfMemoryEvent documents the recommended integration
  (read calibrated threshold from localStorage, namespaced per app for
  file:// deployments; lower to 90% of the measured ceiling with a floor;
  reload).

Tests: TextureMemoryManager event behavior and the idle-path probe
(fires at idle, not on active frames).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chiefcll chiefcll merged commit fb7a447 into main Jun 4, 2026
1 check passed
@chiefcll chiefcll deleted the feat/gpu-out-of-memory-event branch June 4, 2026 17:13