fix(usn): coalesce a backlog of apply ticks into one body rebuild #481
Merged
Each USN apply does a full O(n) index rebuild (children CSR + trigram + ext index, ~600ms on a ~4M-record drive). When apply ticks fire faster than that drains (e.g. a small UFFS_USN_APPLY_INTERVAL_MS on a huge, busy volume), the applier channel backs up and the pipeline latency creeps to seconds, so a freshly renamed/deleted file can read stale until the backlog clears. (Surfaced by the verify harness pinning a 500ms apply interval, below the rebuild cost, on a 3.76M-record C:.)

The applier now coalesces a run of consecutive same-letter Apply messages into a single rebuild (coalesce_apply_run): it greedily drains queued same-letter applies via try_recv, appends their changes in FIFO order (create -> delete -> reuse sequences still apply correctly), and stops at the first different-letter Apply / Save / Wrap, carrying it forward intact (Save keeps its cursor + persistence semantics, never merged).

This bounds the apply rate to the rebuild rate regardless of the interval: the system self-throttles instead of piling up rebuilds.

This is the cheap robustness guard; the real fix for the per-apply O(n) cost is incremental (base + delta + tombstone) index maintenance, tracked separately.

scripts/windows/usn-verify.rs: raise POLL_SETTLE 3s -> 6s and the apply pin 500 -> 1500ms (above the rebuild cost) so the test is deterministic on a busy multi-million-record drive.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
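For intuition on the backlog described above, here is a rough, idealized model (not code from the daemon). It assumes ticks arrive exactly every `interval` ms and every rebuild costs a fixed `rebuild` ms; the function name `worst_latency_ms` and its parameters are hypothetical. It compares the worst tick-to-visible latency with one rebuild per tick against one rebuild per drained run.

```rust
/// Worst-case latency (ms) from a tick's arrival to the end of the rebuild
/// that includes it, in an idealized single-applier model.
fn worst_latency_ms(interval: u64, rebuild: u64, ticks: u64, coalesce: bool) -> u64 {
    let mut free_at = 0u64; // time at which the applier next becomes idle
    let mut worst = 0u64;
    let mut next = 0u64; // index of the oldest unprocessed tick (tick k arrives at k * interval)
    while next < ticks {
        let arrived = next * interval;
        let start = free_at.max(arrived);
        // Without coalescing a rebuild consumes exactly one tick; with it,
        // the rebuild consumes every tick already queued at `start`.
        let last = if coalesce {
            (start / interval).min(ticks - 1)
        } else {
            next
        };
        free_at = start + rebuild;
        // The oldest tick in the batch waited the longest.
        worst = worst.max(free_at - arrived);
        next = last + 1;
    }
    worst
}

fn main() {
    // 500ms pin vs ~600ms rebuild, 60 ticks (30 seconds of ticks).
    println!("one rebuild per tick: {} ms", worst_latency_ms(500, 600, 60, false));
    println!("coalesced runs:       {} ms", worst_latency_ms(500, 600, 60, true));
}
```

In this model the uncoalesced latency grows by 100ms per tick (6500ms after 60 ticks), while the coalesced latency stays below two rebuild times (1100ms). Real rebuild cost is not constant, so treat the numbers as illustrative only.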
Problem
Each USN apply does a full O(n) index rebuild (children CSR + trigram + ext index, ~600ms on a ~4M-record drive). When apply ticks fire faster than that drains (a small UFFS_USN_APPLY_INTERVAL_MS on a huge, busy volume), the applier channel backs up and the pipeline latency creeps to seconds, so a freshly renamed/deleted file can read stale until the backlog clears.
Surfaced by the verify harness pinning a 500ms apply interval (below the ~600ms rebuild cost) on a 3.76M-record C:, where Round 2 (rename/delete) intermittently read pre-change state. The delete/rename logic was correct (the daemon logged deleted=true mapped=true / renamed=true mapped=true); the body just hadn't swapped before the search ran.
Fix (cheap robustness guard)
The applier coalesces a run of consecutive same-letter Apply messages into a single rebuild (coalesce_apply_run):
- it greedily drains queued same-letter applies via try_recv, appending changes in FIFO order (create → delete → reuse-FRS sequences still apply correctly);
- it stops at the first different-letter Apply / Save / Wrap and carries it forward intact; a Save is never merged (it keeps its cursor + persistence semantics).
This bounds the apply rate to the rebuild rate regardless of the interval: the system self-throttles instead of piling up rebuilds.
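The coalescing step can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Msg` enum, its fields, the `u64` change records, and the use of `std::sync::mpsc` are all assumptions made so the example is self-contained. Only the name `coalesce_apply_run`, the use of `try_recv`, and the stop conditions come from the description above.

```rust
use std::sync::mpsc::{self, Receiver};

// Hypothetical message type; the real applier's messages will differ.
#[derive(Debug, PartialEq)]
enum Msg {
    Apply { letter: char, changes: Vec<u64> },
    Save { letter: char },
    Wrap,
}

/// Drain queued `Apply` messages for `letter` into `changes`, in FIFO order.
/// Returns the first message that ends the run (different-letter Apply,
/// Save, or Wrap) so the caller can handle it next, unmodified.
fn coalesce_apply_run(rx: &Receiver<Msg>, letter: char, changes: &mut Vec<u64>) -> Option<Msg> {
    // try_recv never blocks: an empty or disconnected channel ends the run.
    while let Ok(msg) = rx.try_recv() {
        match msg {
            Msg::Apply { letter: l, changes: more } if l == letter => changes.extend(more),
            other => return Some(other),
        }
    }
    None
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(Msg::Apply { letter: 'C', changes: vec![2] }).unwrap();
    tx.send(Msg::Apply { letter: 'C', changes: vec![3, 4] }).unwrap();
    tx.send(Msg::Save { letter: 'C' }).unwrap();
    tx.send(Msg::Apply { letter: 'C', changes: vec![5] }).unwrap();
    tx.send(Msg::Wrap).unwrap();

    // Pretend the applier has just received Apply { 'C', [1] } and is about to rebuild.
    let mut changes = vec![1];
    let carried = coalesce_apply_run(&rx, 'C', &mut changes);

    assert_eq!(changes, vec![1, 2, 3, 4]); // one rebuild covers the whole run
    assert_eq!(carried, Some(Msg::Save { letter: 'C' })); // Save is carried forward, not merged
    // The Apply queued behind the Save is left for a later run.
    assert_eq!(rx.try_recv().ok(), Some(Msg::Apply { letter: 'C', changes: vec![5] }));
}
```

Returning the terminating message instead of pushing it back keeps channel ordering intact, which is what lets a `Save` keep its position relative to the applies around it.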
Tests
3 unit tests for coalesce_apply_run (merge same-letter run, stop at other letter, never swallow a Save). Full daemon suite + Windows-target clippy green.
Harness
scripts/windows/usn-verify.rs: POLL_SETTLE 3s→6s (the full pipeline takes a couple of seconds on a busy drive) and the apply pin 500→1500ms (above the rebuild cost), so the verification is deterministic.
🤖 Generated with Claude Code