fix(usn): coalesce a backlog of apply ticks into one body rebuild #481

Merged
githubrobbi merged 1 commit into main from fix/usn-apply-coalesce on Jun 26, 2026

@githubrobbi

Collaborator

Problem

Each USN apply does a full O(n) index rebuild (children CSR + trigram + ext index, ~600ms on a ~4M-record drive). When apply ticks fire faster than a rebuild completes (for example, a small UFFS_USN_APPLY_INTERVAL_MS on a huge, busy volume), the applier channel backs up and pipeline latency creeps up to seconds, so a freshly renamed or deleted file can read stale until the backlog clears.

This surfaced when the verify harness pinned a 500ms apply interval (below the ~600ms rebuild cost) on a 3.76M-record C: drive: Round 2 (rename/delete) intermittently read pre-change state. The delete/rename logic was correct (the daemon logged deleted=true mapped=true / renamed=true mapped=true); the body just hadn't swapped before the search ran.
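
To make the backlog arithmetic concrete, here is a minimal sketch of a serial applier with a fixed rebuild cost and no coalescing. The function `backlog_after` and the constant-cost model are assumptions for illustration, not code from this PR; only the 500ms, 1500ms, and ~600ms figures come from the description.

```rust
/// Simplified model (an assumption, not the daemon's code): one apply tick
/// arrives every `interval_ms`, each tick triggers one serial rebuild that
/// costs `rebuild_ms`, and nothing is coalesced.
///
/// Returns how many ticks have arrived but not finished applying after
/// `duration_ms`. `interval_ms` must be greater than zero.
pub fn backlog_after(interval_ms: u64, rebuild_ms: u64, duration_ms: u64) -> u64 {
    let produced = duration_ms / interval_ms; // ticks that have arrived
    let mut finished_at = 0u64; // time at which the applier becomes free
    let mut done = 0u64; // ticks whose rebuild has completed

    for k in 1..=produced {
        let arrival = k * interval_ms;
        // A rebuild starts when the tick has arrived and the applier is idle.
        let start = arrival.max(finished_at);
        let end = start + rebuild_ms;
        if end > duration_ms {
            break; // rebuilds are serial, so every later tick finishes later too
        }
        finished_at = end;
        done += 1;
    }
    produced - done
}
```

Under this model, `backlog_after(500, 600, 60_000)` leaves 21 ticks unapplied after one minute, while `backlog_after(1500, 600, 60_000)` leaves only the single tick that is still in flight.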

Fix (cheap robustness guard)

The applier coalesces a run of consecutive same-letter Apply messages into a single rebuild (coalesce_apply_run):

  • greedily drains queued same-letter applies via try_recv, appending changes in FIFO order (create → delete → reuse-FRS sequences still apply correctly);
  • stops at the first different-letter Apply / Save / Wrap and carries it forward intact — a Save is never merged (it keeps its cursor + persistence semantics).

This bounds the apply rate to the rebuild rate regardless of the interval — the system self-throttles instead of piling up rebuilds.
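
The drain loop described above could be sketched roughly as follows. This is not the merged implementation: the `Msg` enum, its fields, the `String` change type, and the function signature are hypothetical stand-ins, and a `std::sync::mpsc` channel is assumed. Only the name `coalesce_apply_run`, the use of `try_recv`, and the stop-and-carry-forward rule come from the description.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Hypothetical message type; the daemon's real variants and fields may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Msg {
    Apply { letter: char, changes: Vec<String> },
    Save { letter: char },
    Wrap,
}

/// Starting from one Apply for `letter`, greedily absorb queued Apply
/// messages for the same letter, appending their changes in FIFO order.
/// Stops at the first message that is anything else and returns it
/// untouched so the caller can process it next.
pub fn coalesce_apply_run(
    letter: char,
    mut changes: Vec<String>,
    rx: &Receiver<Msg>,
) -> (Vec<String>, Option<Msg>) {
    loop {
        match rx.try_recv() {
            // Same drive letter: merge into the current run.
            Ok(Msg::Apply { letter: l, changes: more }) if l == letter => {
                changes.extend(more);
            }
            // Different-letter Apply, Save, or Wrap: carry it forward intact.
            Ok(other) => return (changes, Some(other)),
            // Nothing queued (or all senders gone): the run ends here.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => {
                return (changes, None);
            }
        }
    }
}
```

Because `try_recv` never blocks, the loop only absorbs work that is already queued; with an empty channel it returns immediately and the applier rebuilds once for the single Apply it started with.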

This is the cheap guard. The real fix for the per-apply O(n) cost is incremental base + delta + tombstone index maintenance (the CSR indexes are immutable/read-optimized, so it's an LSM-style two-tier redesign) — tracked as a separate, scoped project.
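
For orientation only, the base + delta + tombstone idea can be illustrated with a toy two-tier lookup. Everything here (the `TwoTier` type, its fields, the `u64` record ids) is hypothetical and much simpler than the CSR, trigram, and ext indexes; the separate project may look quite different.

```rust
use std::collections::{BTreeMap, HashSet};

/// Toy two-tier index: an immutable sorted base plus small mutable overlays.
pub struct TwoTier {
    base: Vec<(u64, String)>,     // sorted by id, only rebuilt on a merge
    delta: BTreeMap<u64, String>, // records created or renamed since the base
    tombstones: HashSet<u64>,     // ids deleted since the base
}

impl TwoTier {
    pub fn new(mut base: Vec<(u64, String)>) -> Self {
        base.sort_by_key(|(id, _)| *id);
        TwoTier { base, delta: BTreeMap::new(), tombstones: HashSet::new() }
    }

    /// Create or rename: O(log d) in the delta size, no base rebuild.
    pub fn upsert(&mut self, id: u64, name: &str) {
        self.tombstones.remove(&id); // a reused id is live again
        self.delta.insert(id, name.to_string());
    }

    /// Delete: hide the id without touching the base.
    pub fn delete(&mut self, id: u64) {
        self.delta.remove(&id);
        self.tombstones.insert(id);
    }

    /// Lookup order: tombstones first, then delta, then base.
    pub fn get(&self, id: u64) -> Option<&str> {
        if self.tombstones.contains(&id) {
            return None;
        }
        if let Some(name) = self.delta.get(&id) {
            return Some(name.as_str());
        }
        self.base
            .binary_search_by_key(&id, |(k, _)| *k)
            .ok()
            .map(|i| self.base[i].1.as_str())
    }
}
```

The point of the sketch is that each change touches only the small overlays, so the O(n) cost would be paid on an occasional merge of delta and tombstones into a new base rather than on every apply.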

Tests

3 unit tests for coalesce_apply_run (merge same-letter run, stop at other letter, never swallow a Save). Full daemon suite + Windows-target clippy green.

Harness

scripts/windows/usn-verify.rs: POLL_SETTLE is raised from 3s to 6s (the full pipeline takes a couple of seconds on a busy drive) and the apply pin from 500ms to 1500ms (above the rebuild cost), so the verification is deterministic.
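
The PR does not show how the harness uses POLL_SETTLE. As a sketch under the assumption that it is an upper bound on how long the harness polls for the expected search result, a bounded polling helper might look like this; `poll_until` and its parameters are hypothetical.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Mirrors the new value from this PR; how the real harness uses it is assumed.
pub const POLL_SETTLE: Duration = Duration::from_secs(6);

/// Calls `check` repeatedly, sleeping `step` between attempts, until it
/// returns true or `timeout` has elapsed. Returns whether it ever succeeded.
pub fn poll_until<F: FnMut() -> bool>(timeout: Duration, step: Duration, mut check: F) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(step);
    }
}
```

A harness built this way would only wait the full 6s when the expected state never appears, so a larger bound costs little on a fast run while tolerating a pipeline that takes a couple of seconds.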

🤖 Generated with Claude Code

Each USN apply does a full O(n) index rebuild (children CSR + trigram +
ext index, ~600ms on a ~4M-record drive). When apply ticks fire faster
than a rebuild completes (e.g. a small UFFS_USN_APPLY_INTERVAL_MS on a
huge, busy volume), the applier channel backs up and the pipeline latency
creeps up to seconds, so a freshly renamed/deleted file can read stale
until the backlog clears. (Surfaced by the verify harness pinning a 500ms
apply interval, below the rebuild cost, on a 3.76M-record C: drive.)

The applier now coalesces a run of consecutive same-letter Apply messages
into a single rebuild (coalesce_apply_run): it greedily drains queued
same-letter applies via try_recv, appends their changes in FIFO order
(create -> delete -> reuse sequences still apply correctly), and stops at
the first different-letter Apply / Save / Wrap, carrying it forward intact
(Save keeps its cursor + persistence semantics, never merged). This bounds
the apply rate to the rebuild rate regardless of the interval — the system
self-throttles instead of piling up rebuilds.

This is the cheap robustness guard; the real fix for the per-apply O(n)
cost is incremental (base + delta + tombstone) index maintenance, tracked
separately.

scripts/windows/usn-verify.rs: raise POLL_SETTLE 3s -> 6s and the apply
pin 500 -> 1500ms (above the rebuild cost) so the test is deterministic on
a busy multi-million-record drive.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@githubrobbi githubrobbi enabled auto-merge June 26, 2026 17:11
@githubrobbi githubrobbi added this pull request to the merge queue Jun 26, 2026
Merged via the queue into main with commit 0438dab Jun 26, 2026
21 checks passed
@githubrobbi githubrobbi deleted the fix/usn-apply-coalesce branch June 26, 2026 17:27