feat(renderer): on-disk cache for per-mesh SDFs (#22) #50

Open
proggeramlug wants to merge 1 commit into main from feat/sdf-disk-cache-22

Conversation

@proggeramlug
Contributor

Summary

Cold launches re-baked every per-mesh 32³ R32Float SDF from scratch — Sponza's first 9 frames spent ~8 bakes/frame on this path, then the next launch spent another 9 on the same content. This PR content-hashes (positions + indices) at upload, checks a platform cache dir, and `queue.write_texture`s the cached voxels directly when the file exists. Misses fall through to the existing bake; the renderer encodes a `copy_texture_to_buffer` alongside each dispatch and persists the readback to disk after the frame's main submit. Next launch hits and skips the bake.

Closes #22.
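
A minimal sketch of how a content key over positions and indices could be computed. The hash function (64-bit FNV-1a) and the name `mesh_content_hash` are assumptions for illustration; the PR does not say which hash it uses. Each stream is prefixed with its element count so the boundary between positions and indices stays unambiguous.

```rust
/// 64-bit FNV-1a constants (illustrative choice; the PR does not name a hash).
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds `bytes` into a running FNV-1a state.
fn fnv1a(mut state: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        state ^= u64::from(b);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Hypothetical content key for a mesh: element counts are hashed before
/// the values of each stream, then the raw little-endian bytes follow.
pub fn mesh_content_hash(positions: &[[f32; 3]], indices: &[u32]) -> u64 {
    let mut h = FNV_OFFSET;
    h = fnv1a(h, &(positions.len() as u64).to_le_bytes());
    for p in positions {
        for c in p {
            // Hash the bit pattern so the key is stable across runs.
            h = fnv1a(h, &c.to_bits().to_le_bytes());
        }
    }
    h = fnv1a(h, &(indices.len() as u64).to_le_bytes());
    for i in indices {
        h = fnv1a(h, &i.to_le_bytes());
    }
    h
}
```

The same mesh always maps to the same key, while editing a single position or index yields a different key and therefore a cache miss.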

Cache layout

  • macOS / iOS / tvOS / watchOS: `~/Library/Caches/bloom/sdf`
  • Linux / Android: `$XDG_CACHE_HOME/bloom/sdf`
  • Windows: `%LOCALAPPDATA%\bloom\cache\sdf`
  • File: 16 B header (magic `BLSDF\0` + version + voxel_res) + 128 KB R32Float payload
  • Sponza on-disk footprint: 68 × 128 KiB = 8,704 KiB ≈ 8.5 MiB (matches issue budget)
  • wasm32: `cache_dir()` returns `None`; web builds bake every launch as before
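
The directory layout above could be resolved along these lines. This is a sketch, not the PR's code: the `Platform` enum and `cache_dir_for` are hypothetical names, the environment lookup is passed in so the logic can be tested, and the `$HOME/.cache` fallback follows the XDG default rather than anything the PR states.

```rust
use std::path::PathBuf;

/// Hypothetical platform selector; real code would likely use `cfg` attributes.
#[derive(Clone, Copy)]
pub enum Platform {
    Apple,
    Unix,
    Windows,
    Web,
}

/// Resolves the SDF cache directory for `platform`, reading environment
/// variables through `var`. Returns `None` when no location is available.
pub fn cache_dir_for(
    platform: Platform,
    var: impl Fn(&str) -> Option<String>,
) -> Option<PathBuf> {
    // Treat empty variables as unset.
    let non_empty = |name: &str| var(name).filter(|v| !v.is_empty());
    let base = match platform {
        // No filesystem path on the web: callers bake every launch.
        Platform::Web => return None,
        Platform::Apple => PathBuf::from(non_empty("HOME")?)
            .join("Library")
            .join("Caches")
            .join("bloom"),
        Platform::Unix => match non_empty("XDG_CACHE_HOME") {
            Some(dir) => PathBuf::from(dir).join("bloom"),
            // Assumed fallback (XDG default), not confirmed by the PR.
            None => PathBuf::from(non_empty("HOME")?).join(".cache").join("bloom"),
        },
        Platform::Windows => PathBuf::from(non_empty("LOCALAPPDATA")?)
            .join("bloom")
            .join("cache"),
    };
    Some(base.join("sdf"))
}
```

In real use the lookup would be something like `cache_dir_for(Platform::Unix, |k| std::env::var(k).ok())`.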

Behavior on the hot/cold paths

  • Cache hit (warm launches): zero GPU bake work for cached meshes; only a padded `write_texture` upload (sub-millisecond per mesh on SSD).
  • Cache miss (cold launch / asset edit): the bake runs as before, plus a `copy_texture_to_buffer` on the same encoder. After submit, a single `device.poll(Wait)` covers all of that frame's readbacks; the readback path then strips the row-alignment padding from each buffer to recover the on-disk layout and writes the file. The stall occurs only on the ~9 cold-launch bake frames Sponza already spends on this path.
  • Best-effort throughout: corrupt entry, missing dir, write failure → silent re-bake.
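
The unpadding step on the miss path can be sketched as follows. wgpu requires `bytes_per_row` in texture-to-buffer copies to be a multiple of 256 (`wgpu::COPY_BYTES_PER_ROW_ALIGNMENT`), so a 32-texel R32Float row of 128 bytes comes back with a 256-byte stride. The constant is mirrored locally to keep the example dependency-free, and the function names are hypothetical.

```rust
/// Mirrors `wgpu::COPY_BYTES_PER_ROW_ALIGNMENT` (256) without the dependency.
const ROW_ALIGN: usize = 256;

/// Rounds a tight row size up to the copy alignment.
pub fn padded_bytes_per_row(unpadded: usize) -> usize {
    unpadded.div_ceil(ROW_ALIGN) * ROW_ALIGN
}

/// Copies `rows` rows of `unpadded_row` bytes out of a padded readback
/// buffer, dropping the alignment padding at the end of each row.
/// Returns `None` if the buffer is too short.
pub fn unpad_rows(padded: &[u8], unpadded_row: usize, rows: usize) -> Option<Vec<u8>> {
    if unpadded_row == 0 {
        return Some(Vec::new());
    }
    let stride = padded_bytes_per_row(unpadded_row);
    let mut out = Vec::with_capacity(unpadded_row * rows);
    let mut chunks = padded.chunks(stride);
    for _ in 0..rows {
        let row = chunks.next()?;
        out.extend_from_slice(row.get(..unpadded_row)?);
    }
    Some(out)
}
```

For a 32³ volume this would be called with `unpadded_row = 128` and `rows = 32 * 32`, turning a 256 KiB readback into the 128 KiB payload.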

Test plan

  • 8 new unit tests (hash stability, position/index change detection, count-vs-value distinguishability, store/load round-trip, miss handling, size validation, bad-magic rejection)
  • `cargo test --release --lib` 74/0 on macOS (was 66/0)
  • `cargo check --target wasm32-unknown-unknown --no-default-features --features web` clean
  • Manual cold-launch run on `examples/intel-sponza`: first launch creates ~68 cache files; second launch starts faster + cache files unchanged
  • CI Linux + Windows pass

Cold launches re-baked every per-mesh 32³ R32Float SDF from scratch
even though the same content always produces the same voxel data.
Sponza's first 9 frames spent ~8 bakes/frame on this path; the second
launch spent another 9 frames doing the same.

This change content-hashes (positions + indices) at GPU upload time,
checks a platform-appropriate cache directory, and `queue.write_texture`s
the cached voxel bytes directly when the file exists — bypassing the
GPU dispatch entirely. Misses fall through to the existing bake; the
renderer encodes a copy_texture_to_buffer alongside each dispatch
and persists the readback to disk after the frame's main submit.
The next launch hits and skips the bake.

The cache is best-effort throughout: a corrupt entry, missing dir,
or write failure silently re-bakes. wasm32 has no filesystem path so
load returns None and store is gated out — web builds bake every
launch as before.
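
The best-effort contract could look roughly like this: every failure collapses to `None` on load and to a no-op on store, so the caller falls back to baking. The function names and the `<hash>.sdf` file naming are assumptions for illustration, not taken from the PR.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical file naming: the content hash as 16 hex digits.
fn entry_path(dir: &Path, key: u64) -> PathBuf {
    dir.join(format!("{key:016x}.sdf"))
}

/// Returns the cached bytes, or `None` on any failure (no cache dir,
/// missing file, I/O error). The caller then re-bakes.
pub fn load(dir: Option<&Path>, key: u64) -> Option<Vec<u8>> {
    fs::read(entry_path(dir?, key)).ok()
}

/// Writes the entry if possible; errors are deliberately ignored so a
/// failed write only costs a re-bake on the next launch.
pub fn store(dir: Option<&Path>, key: u64, bytes: &[u8]) {
    let Some(dir) = dir else { return };
    if fs::create_dir_all(dir).is_err() {
        return;
    }
    let _ = fs::write(entry_path(dir, key), bytes);
}
```

Passing `None` for the directory models the wasm32 case, where both calls do nothing.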

Cache layout:
- macOS / iOS / tvOS / watchOS: ~/Library/Caches/bloom/sdf
- Linux / Android:              $XDG_CACHE_HOME/bloom/sdf
- Windows:                      %LOCALAPPDATA%\bloom\cache\sdf
- 16 B header (magic + version + voxel_res) + 128 KB R32Float payload
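
A sketch of how such an entry could be encoded and validated. Only the magic `BLSDF\0`, the 16-byte header size, and the presence of version and voxel_res fields come from the description; the field widths, the two reserved bytes, little-endian order, and `VERSION = 1` are assumptions made to fill the 16 bytes.

```rust
const MAGIC: &[u8; 6] = b"BLSDF\0";
const VERSION: u32 = 1; // assumed value
const HEADER_LEN: usize = 16;

/// Serializes a voxel grid as header + R32Float payload.
/// Returns `None` if `voxels` does not hold exactly `voxel_res`³ values.
pub fn encode_entry(voxel_res: u32, voxels: &[f32]) -> Option<Vec<u8>> {
    let expected = (voxel_res as usize).checked_pow(3)?;
    if voxels.len() != expected {
        return None;
    }
    let mut out = Vec::with_capacity(HEADER_LEN + expected * 4);
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&[0u8; 2]); // reserved padding (assumed layout)
    out.extend_from_slice(&VERSION.to_le_bytes());
    out.extend_from_slice(&voxel_res.to_le_bytes());
    for v in voxels {
        out.extend_from_slice(&v.to_le_bytes());
    }
    Some(out)
}

/// Validates magic, version, resolution, and payload size; returns the
/// payload bytes on success and `None` for anything that looks corrupt.
pub fn decode_entry(bytes: &[u8], voxel_res: u32) -> Option<&[u8]> {
    if bytes.len() < HEADER_LEN {
        return None;
    }
    let (header, payload) = bytes.split_at(HEADER_LEN);
    if header[..6] != MAGIC[..] {
        return None;
    }
    let version = u32::from_le_bytes([header[8], header[9], header[10], header[11]]);
    let res = u32::from_le_bytes([header[12], header[13], header[14], header[15]]);
    if version != VERSION || res != voxel_res {
        return None;
    }
    let expected = (voxel_res as usize).checked_pow(3)?.checked_mul(4)?;
    (payload.len() == expected).then_some(payload)
}
```

A rejected entry is treated like a miss, which matches the silent re-bake behavior described above.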

Sponza disk footprint: 68 × 128 KiB = 8,704 KiB ≈ 8.5 MiB (matches the
issue's budget). Disk reads happen synchronously at upload; a 128 KiB
read from local cache is sub-millisecond.
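
The footprint arithmetic can be checked with a few constants. The mesh count, resolution, and header size are the figures quoted in this description; the function names are illustrative.

```rust
pub const VOXEL_RES: u64 = 32;
pub const BYTES_PER_VOXEL: u64 = 4; // R32Float
pub const HEADER_BYTES: u64 = 16;
pub const MESH_COUNT: u64 = 68; // Sponza, per the description

/// Payload size of one cache entry: 32³ voxels × 4 bytes = 131,072 bytes (128 KiB).
pub const fn payload_bytes() -> u64 {
    VOXEL_RES * VOXEL_RES * VOXEL_RES * BYTES_PER_VOXEL
}

/// Total on-disk size including headers: 68 × (16 + 131,072) = 8,913,984 bytes.
pub const fn total_bytes() -> u64 {
    MESH_COUNT * (HEADER_BYTES + payload_bytes())
}
```

That total is about 8.5 MiB, or about 8.9 MB in decimal units; the headers contribute only 1,088 bytes.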

The synchronous device.poll(Wait) on flush blocks until the bake
submission finishes before persisting; this is a cold-launch-only
stall (~9 frames), and the bake itself is the bottleneck on those
frames anyway. Async pipelining is a follow-up if the cold-launch
stall ever shows up in profiles.

8 new unit tests cover hash stability, change-detection on positions
and indices, count-vs-value distinguishability, store/load round-trip,
miss handling, size validation, and bad-magic rejection. cargo test
74/0 (was 66/0) on macOS, wasm32 cargo check clean.

Development

Successfully merging this pull request may close these issues.

Lumen follow-up: disk cache for per-mesh SDFs
