feat(idxdelta): incremental index maintenance — O(changed) USN apply (base + delta overlay)#483

Merged
githubrobbi merged 24 commits into main from feat/incremental-index-maintenance on Jun 28, 2026

@githubrobbi (Collaborator)

Incremental index maintenance — base + delta overlay (LSM-style)

Makes the per-poll USN journal apply scale with the number of changed
records instead of the total drive size. On a live 4.3M-record drive the
per-apply cost drops from 1367 ms → ~200 ms (−85%), and the derived
indexes (trigram / extension / children / path-lengths) update in O(changed)
with the O(total) work confined to a rare compaction.

How it works

Immutable base CSR indexes are Arc-shared; a small mutable IndexDelta
overlay (plus a tombstone set) absorbs each batch. Search reads base ∪ delta
minus tombstones; once the delta crosses a 50k threshold a compaction folds
it back into fresh bases.
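
As a rough illustration of the read and apply paths described above, here is a minimal sketch on a single toy posting list. `OverlayIndex`, its fields, and counting tombstones toward the threshold are assumptions made for this example; the real index keeps separate trigram / extension / children CSR bases and resolves tombstones per index.

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

/// Toy index: an immutable, shared base plus a small mutable overlay.
/// All names here are hypothetical, chosen for the illustration only.
#[derive(Clone)]
pub struct OverlayIndex {
    base: Arc<Vec<u32>>,       // sorted, never mutated in place
    delta: BTreeSet<u32>,      // ids added since the last compaction
    tombstones: BTreeSet<u32>, // ids removed since the last compaction
    threshold: usize,          // overlay size that triggers compaction
}

impl OverlayIndex {
    pub fn new(mut base: Vec<u32>, threshold: usize) -> Self {
        base.sort_unstable();
        base.dedup();
        Self {
            base: Arc::new(base),
            delta: BTreeSet::new(),
            tombstones: BTreeSet::new(),
            threshold,
        }
    }

    /// Read path: base ∪ delta minus tombstones, ascending.
    pub fn live_ids(&self) -> Vec<u32> {
        let mut all: BTreeSet<u32> = self.base.iter().copied().collect();
        all.extend(self.delta.iter().copied());
        all.into_iter()
            .filter(|id| !self.tombstones.contains(id))
            .collect()
    }

    /// Apply path: touches only the changed ids. Returns true when the
    /// overlay grew past the threshold and was folded into a fresh base.
    pub fn apply(&mut self, added: &[u32], removed: &[u32]) -> bool {
        for &id in removed {
            self.delta.remove(&id);
            self.tombstones.insert(id);
        }
        for &id in added {
            self.tombstones.remove(&id);
            self.delta.insert(id);
        }
        if self.delta.len() + self.tombstones.len() > self.threshold {
            // Compaction: the only O(total) step.
            self.base = Arc::new(self.live_ids());
            self.delta.clear();
            self.tombstones.clear();
            true
        } else {
            false
        }
    }
}
```

With a threshold of 3, applying one add and one remove stays on the overlay, and a following batch of two adds tips it over and folds everything into a new base.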

Phase progression (all WIN-validated)

| Phase | What | Result |
|---|---|---|
| 1 | Incremental compute_path_lengths | 623 ms → ~O(changed) |
| 2 | Trigram base+delta overlay | 338 ms → ~0 |
| 3 | Arc-share base CSR (cheaper clone) | 166 ms → 78 ms |
| 4a / 4b | Extension + children delta overlays | 58/60 ms → 0 |
| 5 | Apply cadence: debounce + max-wait | snappy + CPU-bounded |
| 6 | Graduate dev instrumentation → perf guard | |

Latest WIN validation (under a recycle-bin purge storm)

4.3M-record drive, 260k changes coalesced into 22 applies:

  • children / ext overlay timings stayed at 0 on every apply; path-length
    updates took < 0.3 ms.
  • Full O(total) trigram refold fired only on the 3 applies that crossed
    the 50k threshold.
  • Freshness: 1k/10k files searchable in 0.1 s, 100k in 0.5 s;
    a rename was visible in 0.0 s, and a deleted file dropped out of results cleanly.

Phase 6 + doc hygiene (this PR's tail)

  • Stripped the IDXDELTA dev scaffolding; graduated the git build-stamp into
    the uffsd starting banner and the per-apply timing into a single
    usn apply: batch applied DEBUG summary.
  • Added crates/uffs-core/benches/apply_cost.rs — the committed cross-platform
    per-apply perf guard.
  • Fixed 142 broken intra-doc links across all 7 crates and closed the root
    cause: the rustdoc gate ran without --document-private-items, so it never
    validated the private surface. The gate now carries the flag at every layer
    (justfile, gates.toml + regenerated hook, pr-fast.yml, go/ship pipeline).

Design + tracking: docs/architecture/incremental-index-maintenance.md.

🤖 Generated with Claude Code

githubrobbi enabled auto-merge June 27, 2026 23:43
githubrobbi and others added 24 commits June 27, 2026 16:46
…lta)

Phase 0 of the two-tier index project. The CSR indexes (trigram /
children / ext) are immutable read-optimized layouts, so "incremental
maintenance" is an LSM/Lucene-segment redesign — immutable base CSR +
mutable delta overlay + tombstones, queried as base ∪ delta minus
tombstones, with the existing full rebuild demoted to an occasional
compaction step. Turns apply from O(total records) into O(changed).

The doc specifies: architecture + per-op semantics, the search-path
integration choke points (trigram_search / children_of / records_with_ext),
phased delivery (trigram-first for the ~80% win), the mandatory oracle test
(base+delta must be observationally identical to a full rebuild, and
byte-identical after compaction), a baseline + timing-regression gate, the
removable IDXDELTA dev-instrumentation convention (build-id, per-apply /
per-search timing) mirroring USNFIX, the WIN dev test-script
(idx-delta-verify.rs), and a tracking table. Junior-dev-executable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y timing

Scaffolding for the incremental-index-maintenance work (design:
docs/architecture/incremental-index-maintenance.md) — measure first, build
later. All dev-only, marked IDXDELTA, removable in Phase 5.

Which-build stamp: a uffs-daemon build.rs emits UFFS_GIT_SHA (short commit +
-dirty); startup logs `IDXDELTA build active version=… git=…`. The WIN
test-script fails fast if the running daemon lacks it — closing the
stale-binary trap we hit during USN testing.

Fine-grained per-apply timing (each meaningful step, not just the rebuild):
whole-body CLONE (shard.rs — the Arc-swap copies the entire index, the big
cost the rebuild timing alone misses and the one base+delta shrinks most),
per-change LOOP (the O(changed) mutation, timed apart), and REBUILD
(children / paths / trigram / ext, each separately). Logged in whole
microseconds (`*_us`, integers) — uffs-core denies float arithmetic, so this
respects that policy (and keeps sub-ms loop precision) rather than allow-ing
around it; the WIN script renders ms.

Refactor: the rebuild + IDXDELTA timing + batch-summary log move to a new
compact_loader/rebuild.rs submodule (cohesive O(n) step; keeps
compact_loader.rs under the 800-LOC policy; houses the temp timing for
Phase-5 removal). No behaviour change.

scripts/windows/idx-delta-verify.rs: the WIN rig (mirrors usn-verify.rs).
Confirms the build, drives escalating create bursts + a rename/delete smoke,
extracts the IDXDELTA-TIMING lines, writes _run/baseline.txt for regression
detection in later phases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase-0 baseline (build 629966b, live C: = 3.89M records) overturned the
doc's original assumption that trigram was the ~80% win. Measured per-apply:

  compute_path_lengths 623ms   <- #1, bigger than trigram
  trigram rebuild      378ms
  whole-body clone     166ms   <- hidden by rebuild-only timing
  ext / loop / children 84/62/54ms
  FULL APPLY        ~1367ms   (not the ~600ms guessed)

Re-sequenced §4 phases by measured cost (biggest lever first):
  1. incremental compute_path_lengths (per-record + renamed-subtree Δ; NOT a
     base+delta overlay) — full §5.5 junior-dev guide added
  2. trigram delta   3. Arc-share the clone   4. ext+children delta
  5. unify + re-tune interval   6. remove IDXDELTA dev helpers

Adds the captured numbers as
docs/architecture/baselines/incremental-index-2026-06-26.json (the §8
regression reference) and marks the done Phase-0 items (build stamp, timing,
WIN rig) in §11.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the per-apply O(total-records) compute_path_lengths BFS (623ms,
the #1 cost in the measured baseline) with an O(changed) per-record
update for normal USN poll batches.

- compact.rs: PathChange{idx, subtree} + update_path_lengths_incremental
  + path_len_from_parent + shift_subtree_path_len (iterative DFS over the
  children CSR, propagating a directory-rename's length delta to the whole
  subtree, clamped to u16).
- apply_create / apply_rename thread &mut Vec<PathChange>; create/file-
  rename push a single O(1) record, directory-rename pushes subtree:true.
- rebuild.rs: rebuild children CSR FIRST (so the subtree walk sees current
  adjacency), then gate incremental-vs-full path update on a 50k batch
  threshold; cold loads (empty change set) still take the full BFS.
- Oracle gate (compact_loader_path_oracle_tests.rs): the incremental
  path_len must be byte-identical to a from-scratch compute_path_lengths
  rebuild across a batch of dir-rename + create + file-rename. Passes.

IDXDELTA-TIMING now reports paths_us for the incremental path so the WIN
rig can confirm the 623ms -> ~0 win against the committed baseline.
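
The subtree shift named in the first bullet can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: `ChildrenCsr` is a hypothetical two-array CSR layout, the function shifts the renamed directory together with its descendants, and the adjacency is assumed to be a tree (no cycles).

```rust
/// Hypothetical CSR adjacency: the children of `p` are
/// `targets[offsets[p] .. offsets[p + 1]]`.
pub struct ChildrenCsr {
    pub offsets: Vec<u32>,
    pub targets: Vec<u32>,
}

impl ChildrenCsr {
    pub fn children_of(&self, parent: u32) -> &[u32] {
        let p = parent as usize;
        let start = self.offsets[p] as usize;
        let end = self.offsets[p + 1] as usize;
        &self.targets[start..end]
    }
}

/// Adds `delta` (the rename's length difference) to `root` and to every
/// descendant of `root`, clamping each result into the u16 range.
/// Iterative DFS with an explicit stack, so deep trees cannot overflow
/// the call stack. Cost is proportional to the subtree size only.
pub fn shift_subtree_path_len(
    path_len: &mut [u16],
    children: &ChildrenCsr,
    root: u32,
    delta: i32,
) {
    let mut stack = vec![root];
    while let Some(idx) = stack.pop() {
        let shifted = i32::from(path_len[idx as usize]).saturating_add(delta);
        path_len[idx as usize] = shifted.clamp(0, i32::from(u16::MAX)) as u16;
        stack.extend_from_slice(children.children_of(idx));
    }
}
```

Renaming a directory from a 1-character to a 5-character name would call this with `delta = 4`; records outside the subtree are never visited.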

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Incremental compute_path_lengths landed (9806bc3); path-len oracle
gate is green. Phase 2 (trigram delta) is next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the mutable delta-overlay type that the trigram / ext / children
base+delta search will read through (incremental-index-maintenance
§5.1). compact/delta.rs:

- IndexDelta { trigram, ext, children: FxHashMap<_, Vec<u32>>,
  tombstones: FxHashSet<u32>, touched_records }.
- add_record (sorted+deduped binary-search insert across all three
  posting maps; root u32::MAX parent adds no child posting),
  tombstone (idempotent), is_tombstoned, len/is_empty (compaction
  trigger), and the per-key postings accessors.
- The sorted/deduped posting invariant is what makes the eventual
  base∪delta merge a linear pass.

Unit-tested (sorted/dedup insert, root sentinel, idempotent tombstone,
rename-as-two-touches). The base∪delta sorted-merge primitive itself
lands in the Phase-2 commit wired directly into trigram_search, so it
is never dead scaffolding. No DriveCompactIndex field yet — that is
added in Phase 2 where each of the ~20 construction sites is touched
once, with the change that gives the field meaning.
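
A minimal sketch of the sorted, deduplicated insert and the idempotent tombstone described above. Assumptions for the example: std `HashMap` / `HashSet` stand in for the Fx variants, the key types (`u32` trigram keys, `u16` extension ids) are guesses, trigrams are passed in pre-packed, and `touched_records` simply counts `add_record` calls.

```rust
use std::collections::{HashMap, HashSet};

/// Parent sentinel for the root record (no child posting is added for it).
const ROOT_PARENT: u32 = u32::MAX;

#[derive(Default)]
pub struct IndexDelta {
    pub trigram: HashMap<u32, Vec<u32>>,
    pub ext: HashMap<u16, Vec<u32>>,
    pub children: HashMap<u32, Vec<u32>>,
    pub tombstones: HashSet<u32>,
    pub touched_records: usize,
}

/// Binary-search insert that keeps the posting sorted and free of duplicates.
fn insert_sorted(posting: &mut Vec<u32>, record: u32) {
    if let Err(pos) = posting.binary_search(&record) {
        posting.insert(pos, record);
    }
}

impl IndexDelta {
    /// Adds `record` to all three posting maps.
    pub fn add_record(&mut self, record: u32, trigrams: &[u32], ext_id: u16, parent: u32) {
        for &t in trigrams {
            insert_sorted(self.trigram.entry(t).or_default(), record);
        }
        insert_sorted(self.ext.entry(ext_id).or_default(), record);
        if parent != ROOT_PARENT {
            insert_sorted(self.children.entry(parent).or_default(), record);
        }
        self.touched_records += 1;
    }

    /// Idempotent: tombstoning the same record twice has no further effect.
    pub fn tombstone(&mut self, record: u32) {
        self.tombstones.insert(record);
    }

    pub fn is_tombstoned(&self, record: u32) -> bool {
        self.tombstones.contains(&record)
    }
}
```

Because every posting stays sorted and deduplicated at insert time, a later base ∪ delta merge can be a single linear pass over two sorted slices.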

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…se 2

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The first WIN run exposed a 0.5 s-per-apply regression on small batches:
two applies (8 and 1 changes, loop_us=0) hit paths_us≈507 ms while every
create/rename batch took 1-18 µs. Those two were delete-only batches: a
delete tombstones its record and pushes no PathChange, so path_changes is
empty, and the gate wrongly fell back to the full O(total)
compute_path_lengths BFS.

apply_usn_patch is never the cold-load path (build_compact_index does the
cold BFS directly), so an empty path_changes during apply means "no
record's path_len changed" → the correct work is none. A delete never
shifts any surviving record's path_len. Drop the is_empty() arm; the only
apply-time full-recompute fallback is now a >50k pathological batch.
update_path_lengths_incremental is already a no-op over an empty slice.

Oracle: add
delete_only_batch_leaves_path_lengths_correct_without_full_recompute.
The shared assert now compares LIVE records only: a tombstoned record's
path_len is meaningless (incremental leaves it stale, a full BFS recomputes
it as a root); that divergence is correct and excluded.

Expected effect: mean paths drops from 145 ms to sub-ms; full_apply
~800 ms -> ~640 ms (trigram ~390 ms now dominant -> Phase 2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s HEAD

Kills the stale-binary trap for good. The rig now, before anything else:

- BIN SYNC: resolves the release dir cargo *actually* uses via
  `cargo metadata` target_directory (honours CARGO_TARGET_DIR /
  .cargo/*.toml build.target-dir; override with UFFS_RELEASE_DIR), then
  copies uffs/uffsd (+ uffs-broker/uffsmcp if built) into ~/bin, printing
  each binary's build mtime. Required bin missing → bail "build first".
- BUILD-ID MATCH GUARD: build-confirmation now extracts git="<sha>" from
  the IDXDELTA marker and asserts it equals `git rev-parse --short HEAD`;
  a resident daemon from an older build → hard fail with the fix.

So the WIN loop is just: build → run. No manual `copy C:\rust-target\...
\release\* ~/bin`. The target_directory JSON parse is a focused hand-scan
(no serde) that unescapes Windows `C:\\..` paths; unit-checked locally.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mbing)

Routes every trigram caller through one DriveCompactIndex::trigram_search
that reads the base ∪ delta overlay. No behavior change yet: the delta is
always None until apply populates it (Phase 2b), so trigram_search is a
zero-overhead delegate to the base TrigramIndex::search.

- DriveCompactIndex gains `delta: Option<IndexDelta>` (None on fresh /
  compacted / cache-loaded; never serialized). All ~20 construction sites
  updated to `delta: None`.
- trigram_search: when a delta is present, merge per needle-trigram the
  base posting with the delta posting (delta::merge_postings), intersect
  (trigram::intersect_in_place, now pub(crate)), then resolve tombstones
  on the FINAL candidate set — keeping a tombstoned record only if it was
  re-added under a name covering every needle trigram. This is what lets a
  renamed file appear under its new name yet vanish from its old one;
  filtering per posting list would wrongly hide the re-added record.
- trigram.rs: extract the shared needle->trigram packing into
  needle_trigrams(); expose get_posting + intersect_in_place as pub(crate).
- delta.rs: merge_postings sorted-union (no tombstone — see above).
- Migrate the 3 trigram callers (tree, prefix_search, query) to
  trigram_search; each previously passed drive.fold, so behavior-identical.

Tests (compact_trigram_delta_tests.rs) pin the overlay semantics with a
manually-populated delta: create-visible, rename-visible-under-new-name +
gone-from-old, delete-invisible, short-needle None. 867/867 green.
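
To make the "resolve tombstones on the final candidate set" rule concrete, here is a hedged sketch. Trigrams are plain `u32` keys, the base is a `HashMap` rather than a CSR, `overlay_search` is a hypothetical name, and an empty needle returns an empty list (the real function returns None for short needles).

```rust
use std::cmp::Ordering;
use std::collections::{HashMap, HashSet};

/// Sorted union of two sorted, deduplicated posting lists (linear pass).
pub fn merge_postings(base: &[u32], delta: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(base.len() + delta.len());
    let (mut i, mut j) = (0, 0);
    while i < base.len() && j < delta.len() {
        match base[i].cmp(&delta[j]) {
            Ordering::Less => { out.push(base[i]); i += 1; }
            Ordering::Greater => { out.push(delta[j]); j += 1; }
            Ordering::Equal => { out.push(base[i]); i += 1; j += 1; }
        }
    }
    out.extend_from_slice(&base[i..]);
    out.extend_from_slice(&delta[j..]);
    out
}

/// Keeps only the ids of `acc` that also occur in the sorted slice `other`.
pub fn intersect_in_place(acc: &mut Vec<u32>, other: &[u32]) {
    acc.retain(|id| other.binary_search(id).is_ok());
}

fn posting(map: &HashMap<u32, Vec<u32>>, trigram: u32) -> &[u32] {
    match map.get(&trigram) {
        Some(list) => list.as_slice(),
        None => &[],
    }
}

/// Candidates = intersection over the needle of (base ∪ delta) postings.
/// A tombstoned candidate survives only if the delta alone re-added it
/// under every needle trigram, i.e. under its new name.
pub fn overlay_search(
    base: &HashMap<u32, Vec<u32>>,
    delta: &HashMap<u32, Vec<u32>>,
    tombstones: &HashSet<u32>,
    needle: &[u32],
) -> Vec<u32> {
    let mut rest = needle.iter();
    let Some(&first) = rest.next() else { return Vec::new(); };
    let mut candidates = merge_postings(posting(base, first), posting(delta, first));
    for &t in rest {
        let merged = merge_postings(posting(base, t), posting(delta, t));
        intersect_in_place(&mut candidates, &merged);
    }
    candidates.retain(|id| {
        !tombstones.contains(id)
            || needle.iter().all(|&t| posting(delta, t).binary_search(id).is_ok())
    });
    candidates
}
```

Filtering tombstones per posting list would drop a renamed record from its new-name postings too. Filtering once at the end also rejects a needle that mixes trigrams of the old and new names, since the old ones exist only in the base.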

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dules

Pure, behavior-preserving split of the oversized compact.rs into cohesive
compact/ submodules. Every public item is re-exported so the canonical
`crate::compact::X` paths used across the workspace are unchanged — no call
site outside the module moved.

  compact/record.rs     CompactRecord + NTFS metafile-name allowlist   (189)
  compact/children.rs   ChildrenIndex (CSR parent→children)            (111)
  compact/extension.rs  ExtensionIndex (CSR ext_id→records)            (102)
  compact/path_len.rs   compute_path_lengths + Phase-1 incremental fns (214)
  compact/builder.rs    build_compact_index + ADS/links/shrink/upcase  (422)
  compact.rs            DriveCompactIndex + HeapReport + impl + re-exports (385)

compact.rs drops off the file-size exception list (was "13 over"; now 385,
well under 800). 867/867 uffs-core tests pass unchanged (identical count
pre/post — proves a pure move); clippy -D warnings, rustdoc -D warnings,
lint-ci-windows all clean.

This also tidies the tree for Phase 2b: the compact() method + apply
delta-population drop cleanly into builder.rs / a slim compact.rs rather
than a 1363-line file.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… rebuild

The 338 ms per-apply trigram rebuild is gone. apply_usn_patch now overlays
each batch onto the IndexDelta instead of rebuilding the base:

- DriveCompactIndex::apply_trigram_delta(adds, tombstones): adds each
  created/renamed record's new-name trigrams to the delta and masks the
  deleted/renamed-away/reused-slot records via tombstones. Folds back to a
  fresh base (compact_base) only when the delta crosses
  TRIGRAM_COMPACT_THRESHOLD (50k touched records) — so trigram_us is ~0 on
  normal applies, a one-off full build on the occasional compaction tick.
- compact_loader/apply.rs: the per-change mutation cluster (StagedCreate,
  stage_create, overwrite_slot, apply_{delete,create,rename}) extracted to
  a submodule; each apply fn now also collects the trigram tombstone set
  (deletes, renames, FRS-reuse overwrites). path_changes doubles as the
  trigram-ADD set. compact_loader.rs 826 -> 592 LOC.
- rebuild.rs: replace the TrigramIndex::build call with apply_trigram_delta;
  IDXDELTA-TIMING gains a `compacted` flag.

End-to-end oracle (compact_loader_trigram_oracle_tests.rs): a real
apply_usn_patch batch (create + rename + delete), then assert trigram_search
through base + delta returns IDENTICAL candidates to a compacted rebuild —
across created, renamed (new + old name), deleted, and untouched files.
868/868 green; clippy/rustdoc/file-size all clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per the request to stress beyond 1k files:
- BURSTS 10/100/1000 → 1000/10000/100000. The 100k burst crosses
  TRIGRAM_COMPACT_THRESHOLD (50k) so it also exercises a delta compaction
  (full trigram refold) under load; the smaller bursts measure steady-state
  delta-overlay apply (trigram_us ≈ 0).
- Replace the fixed-sleep freshness probe with poll_until_visible: polls a
  per-round filename prefix until that burst's `count` is search-visible (or
  a size-scaled budget elapses), so the report shows true creation
  throughput (files/s) AND apply-to-searchable latency, and flags an apply
  backlog instead of silently measuring the settle constant.
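
A poll-until-visible probe of this kind could look roughly like the sketch below. The signature is an assumption for illustration: the search itself is abstracted as a closure returning the currently visible count, and the budget and poll interval are passed in by the caller.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `visible_count` until it reports at least `expected` hits.
/// Returns the elapsed time on success, or None once `budget` is spent
/// (which the caller can report as an apply backlog).
pub fn poll_until_visible<F>(
    mut visible_count: F,
    expected: usize,
    budget: Duration,
    interval: Duration,
) -> Option<Duration>
where
    F: FnMut() -> usize,
{
    let start = Instant::now();
    loop {
        if visible_count() >= expected {
            return Some(start.elapsed());
        }
        if start.elapsed() >= budget {
            return None;
        }
        thread::sleep(interval);
    }
}
```

Unlike a fixed sleep, the returned duration reflects when the burst actually became searchable, and a timeout is distinguishable from a slow success.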

Also marks Phases 1/2a/2b + the compact.rs decomposition done in the
design-doc tracking table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n threshold

A burst larger than TRIGRAM_COMPACT_THRESHOLD (e.g. the verify rig's 100k
create) would populate the delta with 100k postings only to discard them at
the post-population compaction check — pure wasted work. apply_trigram_delta
now checks `pending_delta + batch_size > threshold` up front and, if so,
refolds the base directly via compact_base (the records already reflect every
change in the batch). This also catches the accumulation case where a small
batch tips an already-large delta over the line.

Reduces to compact_base (oracle-proven equivalent to a full rebuild), so the
end-of-fn compaction branch is now unreachable and removed. Trigram + path
oracles green.
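
The up-front check can be pictured as a small decision function. `ApplyPlan` and `plan_apply` are hypothetical names for this sketch; only the comparison `pending_delta + batch_size > threshold` comes from the commit message.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ApplyPlan {
    /// Absorb the batch into the delta overlay (O(changed)).
    Overlay,
    /// Skip the overlay and refold the base directly (O(total)).
    RefoldBase,
}

/// Decides before any delta population, so an oversized batch never
/// builds postings that would be thrown away immediately afterwards.
pub fn plan_apply(pending_delta: usize, batch_size: usize, threshold: usize) -> ApplyPlan {
    if pending_delta.saturating_add(batch_size) > threshold {
        ApplyPlan::RefoldBase
    } else {
        ApplyPlan::Overlay
    }
}
```

With a 50k threshold, a 100k burst refolds immediately, and so does a 20-record batch arriving on top of a 49,990-record delta.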

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ounter

- Bin sync: a running uffs-broker (LocalSystem service) holds its exe open,
  so the best-effort optional copy hit os error 32 and aborted the whole run
  before any measurement. Optional-bin copy failures now warn ("skip …") and
  continue; only uffs + uffsd (the rig's actual dependencies) hard-fail.
- Remove the now-unused `total_created` accumulator (each burst polls its own
  per-round count) that tripped unused_variables/unused_assignments.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stale-daemon guard compared the daemon's git SHA to HEAD verbatim, so a
HEAD that advanced purely through scripts/ or docs/ (e.g. a verify-rig tweak)
falsely flagged a current binary as stale and aborted the run. The guard now
diffs the daemon SHA against HEAD and only bails when a build-affecting path
changed (crates/**, Cargo.toml, Cargo.lock, rust-toolchain*); a non-source
advance prints "binary is current" and proceeds. Fail-safe: assumes stale if
git can't answer.
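
The path filter behind this guard might be sketched as below. The function names are hypothetical, paths are assumed to be repo-root-relative with forward slashes (as git prints them), and `None` models the case where git could not produce a diff.

```rust
/// True when a changed path can alter the built binaries:
/// crates/**, Cargo.toml, Cargo.lock, rust-toolchain*.
pub fn is_build_affecting(path: &str) -> bool {
    path.starts_with("crates/")
        || path == "Cargo.toml"
        || path == "Cargo.lock"
        || path.starts_with("rust-toolchain")
}

/// `changed_paths` is the diff between the daemon's SHA and HEAD.
/// Fail-safe: if git could not answer (None), assume the binary is stale.
pub fn binary_is_stale(changed_paths: Option<&[&str]>) -> bool {
    match changed_paths {
        None => true,
        Some(paths) => paths.iter().any(|p| is_build_affecting(p)),
    }
}
```

A HEAD that advanced only through scripts/ or docs/ yields `false`, so the rig proceeds with the resident daemon.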

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The old mutate round searched `idx_0_1` to check a deleted file — but that
substring matches 111 bulk `idx_0_1*` files, so "expect 0" was a false
signal (the live run showed 111, which was correct-by-accident). Replace it
with `idxmutate_*` sentinels that share no trigram with the bulk files, and
poll-until-applied (visible / absent) instead of a fixed sleep:

- rename idxmutate_src → idxmutate_renamed: expect 'idxmutate_renamed' >= 1
- delete idxmutate_del: expect 'idxmutate_del' → 0
- old name idxmutate_src → 0 (renamed away)

Each now gives a clean pass/fail with the real apply latency. Drops the
now-unused SETTLE constant (every probe polls).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The daemon clones the whole DriveCompactIndex before each apply (lock-free
COW snapshot for readers). That deep-copied the immutable inverted indexes
— the trigram CSR alone is ~hundreds of MB on a multi-million-record drive.

Make trigram / children / ext_index `Arc<…>`. The apply path never mutates
them in place (it overlays on the delta and only ever *replaces* the whole
index at compaction/rebuild), so Arc + replace-the-pointer is a perfect fit:
the per-apply clone now pointer-clones these bases (a refcount bump) and only
deep-copies records + names + the small delta. Read sites are unchanged —
Arc derefs transparently through `.search()` / `.get_posting()` / `&drive.children`.

- compact.rs: field types → Arc; compact_base wraps the refold in Arc::new.
- rebuild.rs: the per-apply children/ext rebuilds wrap in Arc::new.
- builder.rs / compact_cache.rs / fixtures: construction sites wrap in Arc::new.
- New code uses alloc::sync::Arc (workspace lint convention); each touched
  file keeps its existing Arc import.

Expect `clone_us` to drop materially on the WIN baseline (the CSR portion of
the ~135ms clone). 868/868 uffs-core + 333/333 daemon green; clippy -D
warnings, rustdoc, lint-prod, lint-ci-windows, file-size all clean.
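
The effect of Arc-sharing on the per-apply clone can be shown with a toy snapshot. `Snapshot` and its fields are stand-ins invented for this sketch, and it imports `std::sync::Arc` for a self-contained example where the workspace convention is `alloc::sync::Arc`.

```rust
use std::sync::Arc;

/// Toy stand-in for the index: one immutable base behind an Arc plus
/// mutable per-record state that is still deep-copied on clone.
#[derive(Clone)]
pub struct Snapshot {
    pub trigram_base: Arc<Vec<u32>>,
    pub records: Vec<u64>,
}

impl Snapshot {
    /// Compaction never mutates the shared base in place; it swaps in a
    /// freshly built one, so older snapshots keep reading the old base.
    pub fn compact(&mut self, rebuilt: Vec<u32>) {
        self.trigram_base = Arc::new(rebuilt);
    }
}
```

Cloning a `Snapshot` bumps a refcount for `trigram_base` instead of copying it, while `records` is copied and can be mutated without affecting readers of the previous snapshot.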

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rebuild)

Drop the ~58 ms per-apply ExtensionIndex rebuild. `--ext` queries now read
through DriveCompactIndex::records_with_ext (base ∪ delta):

- records_with_ext(ext_id) -> Cow<[u32]>: zero-alloc borrow of the base CSR
  slice when delta is None; otherwise merges base + delta postings and
  validates each candidate against the live records (keep iff
  records[idx].extension_id == ext_id && name_len != 0). That records check
  is what makes a renamed extension (foo.log -> foo.pdf) and a delete correct
  WITHOUT an ext tombstone — a stale base posting just fails the check.
- apply_trigram_delta renamed apply_index_delta; it now always adds the
  ext + children postings (only the trigram postings stay gated on name >= 3
  chars), so a short-named create/rename is never missed by --ext.
- compact_base refolds the ext base too; rebuild.rs drops the ext rebuild
  (ext_us now ~0 in IDXDELTA-TIMING).
- Migrate the 3 ext readers (path_sorted / numeric / path_only top-N) to
  records_with_ext; the 3 post-apply ext unit tests assert through it.

Oracle extended: records_with_ext through the overlay equals the compacted
rebuild for every ext id, across create / rename / delete. 868/868 core +
333/333 daemon; clippy -D warnings, rustdoc, lint-prod, file-size all clean.
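
The "validate against the live records instead of keeping an ext tombstone" idea can be sketched as follows. `ExtOverlay` and `Record` are simplified, hypothetical types: the base is a `HashMap` rather than a CSR, and the merge uses sort + dedup instead of a linear sorted merge.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

pub struct Record {
    pub extension_id: u16,
    pub name_len: u16, // 0 marks a deleted (dead) slot in this sketch
}

pub struct ExtOverlay {
    pub records: Vec<Record>,
    pub base: HashMap<u16, Vec<u32>>,
    pub delta: Option<HashMap<u16, Vec<u32>>>,
}

impl ExtOverlay {
    pub fn records_with_ext(&self, ext_id: u16) -> Cow<'_, [u32]> {
        let base: &[u32] = match self.base.get(&ext_id) {
            Some(list) => list.as_slice(),
            None => &[],
        };
        // No delta: hand out the base slice without allocating.
        let Some(delta) = &self.delta else {
            return Cow::Borrowed(base);
        };
        let extra: &[u32] = match delta.get(&ext_id) {
            Some(list) => list.as_slice(),
            None => &[],
        };
        let mut merged: Vec<u32> = base.iter().chain(extra).copied().collect();
        merged.sort_unstable();
        merged.dedup();
        // The live record is the source of truth: a stale base posting for
        // a renamed or deleted record fails this check and drops out.
        merged.retain(|&idx| {
            let r = &self.records[idx as usize];
            r.extension_id == ext_id && r.name_len != 0
        });
        Cow::Owned(merged)
    }
}
```

After renaming record 0 from extension 1 to extension 2, its stale posting under extension 1 is filtered out by the record check, and the delta posting makes it visible under extension 2.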

Children stays full-rebuilt — Phase 4b moves it onto the overlay (higher
care: it feeds FastPathResolver + the Phase-1 subtree walk).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ren rebuild)

Drop the ~60 ms per-apply ChildrenIndex rebuild — the last of the four
per-apply CSR rebuilds. Tree search + path resolution + the Phase-1 subtree
walk now read child adjacency through the base ∪ delta overlay:

- for_each_child(parent, visit): zero-alloc primitive — iterates the base CSR
  directly when delta is None; else sorted-merges base ∪ delta children
  (delta::merge_filter) and validates each against the live records (keep iff
  records[c].parent_idx == parent && live). The records check makes a moved-
  away/deleted child correct WITHOUT a children tombstone.
- children_of(parent) -> Cow: the slice form (Cow::Borrowed, zero-alloc, when
  delta is None) for callers needing a list.
- compact_base refolds the children base too; rebuild.rs drops the children
  rebuild (children_us now ~0).

Apply path REORDERED: apply_index_delta runs BEFORE the path walk, so the
Phase-1 directory-rename subtree walk sees this batch's new children (a child
created inside a renamed dir in the same batch). shift_subtree_path_len is now
two-pass (collect descendants over base ∪ delta, then shift path_len) to keep
the records read (parent_idx filter) off the path_len write.

Migrate the ~13 children readers (tree, query top-N, daemon info) to
for_each_child / children_of; the post-apply children unit tests read through
children_of.

Oracles: children_of overlay == compacted rebuild for every parent across a
file MOVE between directories + create + delete; and a directory rename WITH a
same-batch child create yields byte-identical path_len to a full rebuild (the
ordering guard). 870/870 core + 333/333 daemon; clippy/rustdoc/lint-prod/
lint-ci-windows/file-size all clean.

This completes Phase 4: all four per-apply CSR rebuilds (paths, trigram, ext,
children) are now incremental/overlay-served.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Measures the base ∪ delta overlay read cost directly — the Phase-4 concern:
does children_of / records_with_ext regress tree search / --ext under churn?
Each subject benched delta=None (compacted, Cow::Borrowed) vs churned
(delta=Some populated by a real ~40k-create apply), plus for_each_child
(zero-alloc) vs children_of (Cow).

Baseline (~500k records, ~40k-change delta):
  children_of (one dir)        2.1 ns  -> 631 ns   churned
  for_each_child (churned)               354 ns
  records_with_ext (hot ~100k) 1.8 ns  -> 295 us   churned
  tree walk (2000 dirs/~500k)  667 us  -> 2.09 ms  churned

Verdict: overhead is real but small in absolute terms — a whole-tree walk
stays ~2 ms under peak churn, and the records_with_ext tax on a hot extension
is dwarfed by downstream path-resolution of its ~100k results. So Phase 4's
overlay does not need the zero-alloc fix now; for_each_child (~1.8x faster than
children_of churned) is the ready lever if a future workload makes it bite.
Committed as a durable regression guard for the overlay read cost.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… CPU-bounded)

Now that an apply costs ~200 ms (Phases 1-4 made paths/trigram/ext/children
incremental), retire the 30 s fixed rate-limit on the per-shard apply trigger
and replace it with the file-watcher pattern: debounce + coalesce + max-wait.

- ApplyTrigger: was a single rate-limit (apply at most once per 30 s). Now two
  knobs — DEFAULT_APPLY_DEBOUNCE_MS (250 ms settle) and DEFAULT_APPLY_INTERVAL_MS
  (2 s max-wait). It tracks first_change_at (max-wait clock) + last_change_at
  (settle clock); evaluate fires when the burst SETTLED (quiet >= debounce, the
  snappy idle->active path) OR the run aged past max_wait (the CPU cap under
  sustained churn). record() drops its event-count arg (timing only).
- process_tick: evaluate the cadences EVERY poll, not only change-ticks — the
  settle apply and the age-based save must be able to fire on the first QUIET
  tick after a burst ends. Both evaluations are cheap no-ops when idle.
- JournalLoopConfig gains apply_debounce + UFFS_USN_APPLY_DEBOUNCE_MS override.

Regimes: idle -> 0 work; one saved file -> searchable in <1 s (debounce); a
finished unzip -> one coalesced apply; the live C: (constant churn) -> capped
at one ~200 ms apply / 2 s (~10% of a core) instead of thrashing.

Decomposition: the Phase-5 additions tipped journal_loop.rs to 810 LOC, so the
poll scheduling + backoff cluster (wait_for_next_tick, PollBackoff,
poll_blocking, log_poll_failure, MAX_POLL_BACKOFF) is extracted to a cohesive
journal_loop/poll.rs (full docs preserved); journal_loop.rs 810 -> 651, poll.rs
178, both under the 800 policy with no exception.

idx-delta-verify rig pins the new knobs (max-wait 2 s, debounce 250 ms). The 5
ApplyTrigger unit tests rewritten for the settle/cap semantics (+ a max-wait-cap
test); all journal_loop config fixtures carry the new field. 334/334 daemon;
clippy/rustdoc/file-size clean. Design-doc tracking marks Phases 1-5 done.
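
A minimal sketch of the debounce + max-wait trigger described in this commit. The field and method names follow the message; the exact signatures, passing `now` explicitly, and resetting both clocks when the trigger fires are assumptions of the sketch.

```rust
use std::time::{Duration, Instant};

pub struct ApplyTrigger {
    debounce: Duration,
    max_wait: Duration,
    first_change_at: Option<Instant>, // max-wait clock
    last_change_at: Option<Instant>,  // settle clock
}

impl ApplyTrigger {
    pub fn new(debounce: Duration, max_wait: Duration) -> Self {
        Self { debounce, max_wait, first_change_at: None, last_change_at: None }
    }

    /// Notes that changes arrived at `now` (timing only, no event count).
    pub fn record(&mut self, now: Instant) {
        if self.first_change_at.is_none() {
            self.first_change_at = Some(now);
        }
        self.last_change_at = Some(now);
    }

    /// Returns true when an apply should run: the burst settled
    /// (quiet >= debounce) or the pending run aged past max_wait.
    /// Idle (nothing recorded) is a cheap no-op returning false.
    pub fn evaluate(&mut self, now: Instant) -> bool {
        let (Some(first), Some(last)) = (self.first_change_at, self.last_change_at) else {
            return false;
        };
        let settled = now.saturating_duration_since(last) >= self.debounce;
        let aged = now.saturating_duration_since(first) >= self.max_wait;
        if settled || aged {
            self.first_change_at = None;
            self.last_change_at = None;
            true
        } else {
            false
        }
    }
}
```

With a 250 ms debounce and a 2 s max-wait, a single change fires on the first evaluation at least 250 ms later, while a stream of changes every 100 ms never settles and fires once the run is 2 s old.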

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Strip the temporary IDXDELTA dev scaffolding now that the incremental-
index-maintenance algorithm (Phases 1-5) is WIN-validated, keeping the two
pieces worth keeping as permanent facilities.

* Build stamp: fold the git SHA into the existing `uffsd starting` banner
  (build.rs de-branded, kept) instead of a separate IDXDELTA marker.
* Per-apply timing: replace the per-step IDXDELTA-TIMING lines (and the
  whole-body clone marker) with a single `usn apply: batch applied` DEBUG
  summary in compact_loader/rebuild.rs (changes/created/deleted/renamed/
  skipped/records/ext_index_entries/compacted/apply_us).
* Perf guard: add crates/uffs-core/benches/apply_cost.rs, a cross-platform
  Criterion bench timing the apply alone (creates/256, creates/4000,
  mixed/4000, deletes/4000) against a ~500k-record fixture. Pairs with the
  existing overlay_read.rs to lock in O(changed) per-apply cost.
* Retarget scripts/windows/idx-delta-verify.rs onto the graduated logs.
* Mark the design doc Phases 1-6 complete.

WIN-validated on a live 4.3M-record drive under heavy churn (recycle-bin
purge + bursts, 260k changes coalesced into 22 applies): children/ext
overlays stayed 0 and paths under 0.3ms on every apply; the full O(total)
trigram refold fired only on the 3 applies that crossed the 50k threshold.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve every broken intra-doc link across the 7 crates (142 in total),
and fix the gate that let them rot.

Root cause: the `rustdoc` gate ran `cargo doc` WITHOUT
`--document-private-items`, so it only validated links reachable from the
public API surface. A broken `[`crate::path::pub_crate_item`]` (or a `//!`
shortcut to a private sibling) silently rendered as dead text instead of
failing the build.

Link fixes are real targets where one exists (fully-qualified
`crate::path::Item` / cross-crate paths, type-canonical method paths,
struct re-export paths). Where the target genuinely is not a doc-build
link, the reference becomes an honest code span instead:
  * `#[cfg(windows)]` items absent on the macOS/Linux doc host
  * `#[cfg(test)]` test-only fakes and modules
  * dev-dependency types (e.g. `tempfile::TempDir`) and env-var strings
  * struct fields, lint names, and fully-private cross-module fns

Gate, fixed at every layer so it cannot regress:
  * `just rustdoc` recipe gains `--document-private-items`, and is wired
    into the phase1 validation workflow (it was defined but never run).
  * `scripts/ci/gates.toml` manifest updated; pre-push hook regenerated.
  * `.github/workflows/pr-fast.yml` docs job updated.
  * `go` / `ship` ci-pipeline (`phases.rs` + `ship.rs`) updated.
All three drift detectors (gates / hooks / workflow) pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
githubrobbi force-pushed the feat/incremental-index-maintenance branch from 3fad5c4 to d982737 June 27, 2026 23:49
githubrobbi added this pull request to the merge queue Jun 27, 2026
Merged via the queue into main with commit f829e4b Jun 28, 2026
29 checks passed
githubrobbi deleted the feat/incremental-index-maintenance branch June 28, 2026 00:04