feat(idxdelta): incremental index maintenance — O(changed) USN apply (base + delta overlay)#483
Merged
…lta)
Phase 0 of the two-tier index project. The CSR indexes (trigram / children / ext) are immutable read-optimized layouts, so "incremental maintenance" is an LSM/Lucene-segment redesign: immutable base CSR + mutable delta overlay + tombstones, queried as base ∪ delta minus tombstones, with the existing full rebuild demoted to an occasional compaction step. This turns apply from O(total records) into O(changed).
The doc specifies:
- architecture + per-op semantics
- the search-path integration choke points (trigram_search / children_of / records_with_ext)
- phased delivery (trigram-first for the ~80% win)
- the mandatory oracle test (base+delta must be observationally identical to a full rebuild, and byte-identical after compaction)
- a baseline + timing-regression gate
- the removable IDXDELTA dev-instrumentation convention (build-id, per-apply / per-search timing) mirroring USNFIX
- the WIN dev test-script (idx-delta-verify.rs)
- a tracking table
Junior-dev-executable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y timing
Scaffolding for the incremental-index-maintenance work (design: docs/architecture/incremental-index-maintenance.md): measure first, build later. All dev-only, marked IDXDELTA, removable in Phase 5.
Which-build stamp: a uffs-daemon build.rs emits UFFS_GIT_SHA (short commit + -dirty); startup logs `IDXDELTA build active version=… git=…`. The WIN test-script fails fast if the running daemon lacks it, closing the stale-binary trap we hit during USN testing.
Fine-grained per-apply timing (each meaningful step, not just the rebuild):
- whole-body CLONE (shard.rs: the Arc-swap copies the entire index, the big cost the rebuild timing alone misses and the one base+delta shrinks most)
- per-change LOOP (the O(changed) mutation, timed apart)
- REBUILD (children / paths / trigram / ext, each separately)
Logged in whole microseconds (`*_us`, integers): uffs-core denies float arithmetic, so this respects that policy (and keeps sub-ms loop precision) rather than allow-ing around it; the WIN script renders ms.
Refactor: the rebuild + IDXDELTA timing + batch-summary log move to a new compact_loader/rebuild.rs submodule (cohesive O(n) step; keeps compact_loader.rs under the 800-LOC policy; houses the temp timing for Phase-5 removal). No behaviour change.
scripts/windows/idx-delta-verify.rs: the WIN rig (mirrors usn-verify.rs). Confirms the build, drives escalating create bursts + a rename/delete smoke, extracts the IDXDELTA-TIMING lines, writes _run/baseline.txt for regression detection in later phases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase-0 baseline (build 629966b, live C: = 3.89M records) overturned the doc's original assumption that trigram was the ~80% win. Measured per-apply:
  compute_path_lengths     623ms      <- #1, bigger than trigram
  trigram rebuild          378ms
  whole-body clone         166ms      <- hidden by rebuild-only timing
  ext / loop / children    84/62/54ms
  FULL APPLY               ~1367ms    (not the ~600ms guessed)
Re-sequenced §4 phases by measured cost (biggest lever first):
1. incremental compute_path_lengths (per-record + renamed-subtree Δ; NOT a base+delta overlay), full §5.5 junior-dev guide added
2. trigram delta
3. Arc-share the clone
4. ext+children delta
5. unify + re-tune interval
6. remove IDXDELTA dev helpers
Adds the captured numbers as docs/architecture/baselines/incremental-index-2026-06-26.json (the §8 regression reference) and marks the done Phase-0 items (build stamp, timing, WIN rig) in §11.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the per-apply O(total-records) compute_path_lengths BFS (623ms, the #1 cost in the measured baseline) with an O(changed) per-record update for normal USN poll batches.
- compact.rs: PathChange{idx, subtree} + update_path_lengths_incremental + path_len_from_parent + shift_subtree_path_len (iterative DFS over the children CSR, propagating a directory-rename's length delta to the whole subtree, clamped to u16).
- apply_create / apply_rename thread &mut Vec<PathChange>; create/file-rename push a single O(1) record, directory-rename pushes subtree:true.
- rebuild.rs: rebuild children CSR FIRST (so the subtree walk sees current adjacency), then gate incremental-vs-full path update on a 50k batch threshold; cold loads (empty change set) still take the full BFS.
- Oracle gate (compact_loader_path_oracle_tests.rs): the incremental path_len must be byte-identical to a from-scratch compute_path_lengths rebuild across a batch of dir-rename + create + file-rename. Passes.
IDXDELTA-TIMING now reports paths_us for the incremental path so the WIN rig can confirm the 623ms -> ~0 win against the committed baseline.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
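The subtree shift described in this commit can be pictured with a minimal sketch. It is not the code from compact.rs: `ChildrenCsr`, `from_parents`, and `shift_subtree` are hypothetical names, and the sketch assumes the renamed directory's own length is shifted together with its descendants.

```rust
/// Hypothetical CSR adjacency: the children of `p` are
/// `targets[offsets[p]..offsets[p + 1]]`.
pub struct ChildrenCsr {
    offsets: Vec<usize>,
    targets: Vec<u32>,
}

impl ChildrenCsr {
    /// Build the CSR from a parent pointer per record (`None` = root).
    pub fn from_parents(parents: &[Option<u32>]) -> Self {
        let n = parents.len();
        let mut offsets = vec![0usize; n + 1];
        // Count children per parent, then prefix-sum into offsets.
        for p in parents.iter().flatten() {
            offsets[*p as usize + 1] += 1;
        }
        for i in 0..n {
            offsets[i + 1] += offsets[i];
        }
        // Scatter each child into its parent's slot range.
        let mut cursor = offsets.clone();
        let mut targets = vec![0u32; offsets[n]];
        for (child, p) in parents.iter().enumerate() {
            if let Some(p) = p {
                let slot = &mut cursor[*p as usize];
                targets[*slot] = child as u32;
                *slot += 1;
            }
        }
        Self { offsets, targets }
    }

    pub fn children(&self, parent: u32) -> &[u32] {
        let p = parent as usize;
        &self.targets[self.offsets[p]..self.offsets[p + 1]]
    }
}

/// Iterative DFS: add `delta` to the path length of `root` and every
/// descendant, clamping the result into the u16 range.
pub fn shift_subtree(path_len: &mut [u16], csr: &ChildrenCsr, root: u32, delta: i32) {
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        let shifted = i32::from(path_len[node as usize]) + delta;
        path_len[node as usize] = shifted.clamp(0, i32::from(u16::MAX)) as u16;
        stack.extend_from_slice(csr.children(node));
    }
}
```

The explicit stack keeps the walk iterative, so a deep directory tree cannot overflow the call stack; the cost is proportional to the renamed subtree, not to the whole drive.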
Incremental compute_path_lengths landed (9806bc3); path-len oracle gate is green. Phase 2 (trigram delta) is next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the mutable delta-overlay type that the trigram / ext / children
base+delta search will read through (incremental-index-maintenance
§5.1). compact/delta.rs:
- IndexDelta { trigram, ext, children: FxHashMap<_, Vec<u32>>,
tombstones: FxHashSet<u32>, touched_records }.
- add_record (sorted+deduped binary-search insert across all three
posting maps; root u32::MAX parent adds no child posting),
tombstone (idempotent), is_tombstoned, len/is_empty (compaction
trigger), and the per-key postings accessors.
- The sorted/deduped posting invariant is what makes the eventual
base∪delta merge a linear pass.
Unit-tested (sorted/dedup insert, root sentinel, idempotent tombstone,
rename-as-two-touches). The base∪delta sorted-merge primitive itself
lands in the Phase-2 commit wired directly into trigram_search, so it
is never dead scaffolding. No DriveCompactIndex field yet — that is
added in Phase 2 where each of the ~20 construction sites is touched
once, with the change that gives the field meaning.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
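As a rough illustration of the sorted/deduped insert and the idempotent tombstone, here is a simplified sketch. `DeltaSketch` and its methods are hypothetical stand-ins, not the real `IndexDelta`: it keeps a single posting map and uses std `HashMap` / `HashSet` rather than the Fx variants so it has no dependencies.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, single-map stand-in for the overlay type.
#[derive(Default)]
pub struct DeltaSketch {
    /// key (for example a packed trigram) -> sorted, deduplicated record indexes
    postings: HashMap<u32, Vec<u32>>,
    tombstones: HashSet<u32>,
}

impl DeltaSketch {
    /// Insert `record` under `key`, keeping the list sorted and unique.
    pub fn add_posting(&mut self, key: u32, record: u32) {
        let list = self.postings.entry(key).or_default();
        // `binary_search` yields Err(insertion_point) when the value is absent.
        if let Err(pos) = list.binary_search(&record) {
            list.insert(pos, record);
        }
    }

    /// Mask a record. Returns true only the first time; repeats are harmless.
    pub fn tombstone(&mut self, record: u32) -> bool {
        self.tombstones.insert(record)
    }

    pub fn is_tombstoned(&self, record: u32) -> bool {
        self.tombstones.contains(&record)
    }

    /// Posting list for `key`, or an empty slice when the key is unknown.
    pub fn postings(&self, key: u32) -> &[u32] {
        self.postings.get(&key).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Keeping every list sorted at insert time is what allows a later union with the base posting to run as a single linear pass.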
…se 2
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The first WIN run exposed a 0.5 s-per-apply regression on small batches: two applies (8 and 1 changes, loop_us=0) hit paths_us≈507 ms while every create/rename batch was 1-18 µs. Those were delete-only batches: a delete tombstones its record and pushes no PathChange, so path_changes is empty, and the gate wrongly fell back to the full O(total) compute_path_lengths BFS.
apply_usn_patch is never the cold-load path (build_compact_index does the cold BFS directly), so an empty path_changes during apply means "no record's path_len changed" → the correct work is none. A delete never shifts any surviving record's path_len. Drop the is_empty() arm; the only apply-time full-recompute fallback is now a >50k pathological batch. update_path_lengths_incremental is already a no-op over an empty slice.
Oracle: add delete_only_batch_leaves_path_lengths_correct_without_full_recompute. The shared assert now compares LIVE records only: a tombstoned record's path_len is meaningless (incremental leaves it stale, a full BFS recomputes it as a root); that divergence is correct and excluded.
Expected effect: mean paths drops from 145 ms to sub-ms; full_apply ~800 ms -> ~640 ms (trigram ~390 ms now dominant -> Phase 2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s HEAD
Kills the stale-binary trap for good. The rig now, before anything else:
- BIN SYNC: resolves the release dir cargo *actually* uses via `cargo metadata` target_directory (honours CARGO_TARGET_DIR / .cargo/*.toml build.target-dir; override with UFFS_RELEASE_DIR), then copies uffs/uffsd (+ uffs-broker/uffsmcp if built) into ~/bin, printing each binary's build mtime. Required bin missing → bail "build first".
- BUILD-ID MATCH GUARD: build-confirmation now extracts git="<sha>" from the IDXDELTA marker and asserts it equals `git rev-parse --short HEAD`; a resident daemon from an older build → hard fail with the fix.
So the WIN loop is just: build → run. No manual `copy C:\rust-target\... \release\* ~/bin`.
The target_directory JSON parse is a focused hand-scan (no serde) that unescapes Windows `C:\\..` paths; unit-checked locally.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
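A serde-free hand-scan of this kind might look like the sketch below. The function name `target_directory` and the set of escapes it handles are assumptions for illustration; the rig's actual parser is not shown in this PR text. The `target_directory` key itself is part of the `cargo metadata` JSON output.

```rust
/// Extract and unescape the `target_directory` string from `cargo metadata`
/// JSON without a JSON library. Handles only the escapes that matter for
/// paths (`\\`, `\"`, `\/`); any other escape is passed through untouched.
pub fn target_directory(json: &str) -> Option<String> {
    let key = "\"target_directory\"";
    let after_key = &json[json.find(key)? + key.len()..];
    // Tolerate optional whitespace around the colon.
    let after_colon = after_key.trim_start().strip_prefix(':')?.trim_start();
    let mut chars = after_colon.strip_prefix('"')?.chars();

    let mut out = String::new();
    while let Some(c) = chars.next() {
        match c {
            // Unescaped quote ends the string value.
            '"' => return Some(out),
            '\\' => match chars.next()? {
                '\\' => out.push('\\'),
                '"' => out.push('"'),
                '/' => out.push('/'),
                other => {
                    out.push('\\');
                    out.push(other);
                }
            },
            _ => out.push(c),
        }
    }
    None // unterminated string
}
```

Scanning for a single key keeps the script dependency-free, at the cost of not validating the rest of the document.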
…mbing)
Routes every trigram caller through one DriveCompactIndex::trigram_search that reads the base ∪ delta overlay. No behavior change yet: the delta is always None until apply populates it (Phase 2b), so trigram_search is a zero-overhead delegate to the base TrigramIndex::search.
- DriveCompactIndex gains `delta: Option<IndexDelta>` (None on fresh / compacted / cache-loaded; never serialized). All ~20 construction sites updated to `delta: None`.
- trigram_search: when a delta is present, merge per needle-trigram the base posting with the delta posting (delta::merge_postings), intersect (trigram::intersect_in_place, now pub(crate)), then resolve tombstones on the FINAL candidate set, keeping a tombstoned record only if it was re-added under a name covering every needle trigram. This is what lets a renamed file appear under its new name yet vanish from its old one; filtering per posting list would wrongly hide the re-added record.
- trigram.rs: extract the shared needle->trigram packing into needle_trigrams(); expose get_posting + intersect_in_place as pub(crate).
- delta.rs: merge_postings sorted-union (no tombstone, see above).
- Migrate the 3 trigram callers (tree, prefix_search, query) to trigram_search; each previously passed drive.fold, so behavior-identical.
Tests (compact_trigram_delta_tests.rs) pin the overlay semantics with a manually-populated delta: create-visible, rename-visible-under-new-name + gone-from-old, delete-invisible, short-needle None. 867/867 green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
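The merge, intersect, then resolve-tombstones-last order can be sketched as follows. The function names mirror the ones mentioned in this commit, but the bodies and the `search_overlay` signature are assumptions written for illustration, not the crate's code.

```rust
use std::cmp::Ordering;
use std::collections::HashSet;

/// Sorted union of two sorted, deduplicated posting lists (one linear pass).
pub fn merge_postings(base: &[u32], delta: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(base.len() + delta.len());
    while i < base.len() && j < delta.len() {
        match base[i].cmp(&delta[j]) {
            Ordering::Less => { out.push(base[i]); i += 1; }
            Ordering::Greater => { out.push(delta[j]); j += 1; }
            Ordering::Equal => { out.push(base[i]); i += 1; j += 1; }
        }
    }
    out.extend_from_slice(&base[i..]);
    out.extend_from_slice(&delta[j..]);
    out
}

/// Keep only the entries of `acc` that also occur in `other` (both sorted).
pub fn intersect_in_place(acc: &mut Vec<u32>, other: &[u32]) {
    let mut j = 0;
    acc.retain(|&x| {
        while j < other.len() && other[j] < x {
            j += 1;
        }
        j < other.len() && other[j] == x
    });
}

/// `per_trigram` holds, for each needle trigram, (base posting, delta posting).
pub fn search_overlay(
    per_trigram: &[(&[u32], &[u32])],
    tombstones: &HashSet<u32>,
) -> Vec<u32> {
    let mut iter = per_trigram.iter();
    let Some(&(b, d)) = iter.next() else { return Vec::new() };
    let mut candidates = merge_postings(b, d);
    for &(b, d) in iter {
        intersect_in_place(&mut candidates, &merge_postings(b, d));
    }
    // Tombstones are resolved on the FINAL set: a masked record survives only
    // if the delta re-added it under every needle trigram.
    candidates.retain(|c| {
        !tombstones.contains(c)
            || per_trigram.iter().all(|(_, d)| d.binary_search(c).is_ok())
    });
    candidates
}
```

In this sketch a renamed record is both tombstoned and present in the delta postings of its new name, so a search for the new name keeps it while a search for the old name, where it only appears in the base, drops it.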
…dules
Pure, behavior-preserving split of the oversized compact.rs into cohesive compact/ submodules. Every public item is re-exported so the canonical `crate::compact::X` paths used across the workspace are unchanged; no call site outside the module moved.
  compact/record.rs      CompactRecord + NTFS metafile-name allowlist         (189)
  compact/children.rs    ChildrenIndex (CSR parent→children)                  (111)
  compact/extension.rs   ExtensionIndex (CSR ext_id→records)                  (102)
  compact/path_len.rs    compute_path_lengths + Phase-1 incremental fns       (214)
  compact/builder.rs     build_compact_index + ADS/links/shrink/upcase        (422)
  compact.rs             DriveCompactIndex + HeapReport + impl + re-exports   (385)
compact.rs drops off the file-size exception list (was "13 over"; now 385, well under 800). 867/867 uffs-core tests pass unchanged (identical count pre/post, which proves a pure move); clippy -D warnings, rustdoc -D warnings, lint-ci-windows all clean.
This also tidies the tree for Phase 2b: the compact() method + apply delta-population drop cleanly into builder.rs / a slim compact.rs rather than a 1363-line file.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… rebuild
The 338 ms per-apply trigram rebuild is gone. apply_usn_patch now overlays
each batch onto the IndexDelta instead of rebuilding the base:
- DriveCompactIndex::apply_trigram_delta(adds, tombstones): adds each
created/renamed record's new-name trigrams to the delta and masks the
deleted/renamed-away/reused-slot records via tombstones. Folds back to a
fresh base (compact_base) only when the delta crosses
TRIGRAM_COMPACT_THRESHOLD (50k touched records) — so trigram_us is ~0 on
normal applies, a one-off full build on the occasional compaction tick.
- compact_loader/apply.rs: the per-change mutation cluster (StagedCreate,
stage_create, overwrite_slot, apply_{delete,create,rename}) extracted to
a submodule; each apply fn now also collects the trigram tombstone set
(deletes, renames, FRS-reuse overwrites). path_changes doubles as the
trigram-ADD set. compact_loader.rs 826 -> 592 LOC.
- rebuild.rs: replace the TrigramIndex::build call with apply_trigram_delta;
IDXDELTA-TIMING gains a `compacted` flag.
End-to-end oracle (compact_loader_trigram_oracle_tests.rs): a real
apply_usn_patch batch (create + rename + delete), then assert trigram_search
through base + delta returns IDENTICAL candidates to a compacted rebuild —
across created, renamed (new + old name), deleted, and untouched files.
868/868 green; clippy/rustdoc/file-size all clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per the request to stress beyond 1k files:
- BURSTS 10/100/1000 → 1000/10000/100000. The 100k burst crosses TRIGRAM_COMPACT_THRESHOLD (50k) so it also exercises a delta compaction (full trigram refold) under load; the smaller bursts measure steady-state delta-overlay apply (trigram_us ≈ 0).
- Replace the fixed-sleep freshness probe with poll_until_visible: polls a per-round filename prefix until that burst's `count` is search-visible (or a size-scaled budget elapses), so the report shows true creation throughput (files/s) AND apply-to-searchable latency, and flags an apply backlog instead of silently measuring the settle constant.
Also marks Phases 1/2a/2b + the compact.rs decomposition done in the design-doc tracking table.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n threshold
A burst larger than TRIGRAM_COMPACT_THRESHOLD (e.g. the verify rig's 100k create) would populate the delta with 100k postings only to discard them at the post-population compaction check, which is pure wasted work. apply_trigram_delta now checks `pending_delta + batch_size > threshold` up front and, if so, refolds the base directly via compact_base (the records already reflect every change in the batch). This also catches the accumulation case where a small batch tips an already-large delta over the line.
Reduces to compact_base (oracle-proven equivalent to a full rebuild), so the end-of-fn compaction branch is now unreachable and removed. Trigram + path oracles green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
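The up-front check reduces to a single comparison. A minimal sketch, assuming the counts are plain `usize` values; `ApplyPlan` and `plan_apply` are hypothetical names, and only the 50k threshold and the `pending_delta + batch_size > threshold` condition come from the commit text.

```rust
/// Threshold from the commit text (50k touched records).
pub const COMPACT_THRESHOLD: usize = 50_000;

#[derive(Debug, PartialEq, Eq)]
pub enum ApplyPlan {
    /// Add the batch to the delta overlay (the cheap, common case).
    Overlay,
    /// Skip the overlay and refold the base directly.
    CompactBase,
}

/// Decide before touching the delta, so a huge batch never populates
/// postings that would be thrown away by an immediate compaction.
pub fn plan_apply(pending_delta: usize, batch_size: usize) -> ApplyPlan {
    if pending_delta.saturating_add(batch_size) > COMPACT_THRESHOLD {
        ApplyPlan::CompactBase
    } else {
        ApplyPlan::Overlay
    }
}
```

With these numbers, `plan_apply(0, 100_000)` picks the refold for the 100k burst, and `plan_apply(49_990, 20)` covers the accumulation case where a small batch tips an already-large delta over the line.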
…ounter
- Bin sync: a running uffs-broker (LocalSystem service) holds its exe open,
so the best-effort optional copy hit os error 32 and aborted the whole run
before any measurement. Optional-bin copy failures now warn ("skip …") and
continue; only uffs + uffsd (the rig's actual dependencies) hard-fail.
- Remove the now-unused `total_created` accumulator (each burst polls its own
per-round count) that tripped unused_variables/unused_assignments.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stale-daemon guard compared the daemon's git SHA to HEAD verbatim, so a HEAD that advanced purely through scripts/ or docs/ (e.g. a verify-rig tweak) falsely flagged a current binary as stale and aborted the run. The guard now diffs the daemon SHA against HEAD and only bails when a build-affecting path changed (crates/**, Cargo.toml, Cargo.lock, rust-toolchain*); a non-source advance prints "binary is current" and proceeds. Fail-safe: assumes stale if git can't answer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
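The path filter behind this guard could look like the sketch below. Both function names are hypothetical, and it assumes the changed paths arrive relative to the repository root with forward slashes (the form `git diff --name-only` prints), with `None` standing for "git could not answer".

```rust
/// True when a changed path can alter the built binaries, following the
/// list in the commit text: crates/**, Cargo.toml, Cargo.lock, rust-toolchain*.
pub fn is_build_affecting(path: &str) -> bool {
    path.starts_with("crates/")
        || path == "Cargo.toml"
        || path == "Cargo.lock"
        || path.starts_with("rust-toolchain")
}

/// `changed` is the list of paths that differ between the daemon's SHA and
/// HEAD, or `None` when the diff could not be computed.
pub fn daemon_is_stale(changed: Option<&[&str]>) -> bool {
    match changed {
        // Fail-safe: without an answer from git, assume the binary is stale.
        None => true,
        Some(paths) => paths.iter().any(|p| is_build_affecting(p)),
    }
}
```

A HEAD that moved only through scripts/ or docs/ yields `false` here, which corresponds to the "binary is current" branch.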
The old mutate round searched `idx_0_1` to check a deleted file, but that substring matches 111 bulk `idx_0_1*` files, so "expect 0" was a false signal (the live run showed 111, which was correct-by-accident). Replace it with `idxmutate_*` sentinels that share no trigram with the bulk files, and poll-until-applied (visible / absent) instead of a fixed sleep:
- rename idxmutate_src → idxmutate_renamed: expect 'idxmutate_renamed' >= 1
- delete idxmutate_del: expect 'idxmutate_del' → 0
- old name idxmutate_src → 0 (renamed away)
Each now gives a clean pass/fail with the real apply latency. Drops the now-unused SETTLE constant (every probe polls).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
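A poll-until-applied probe of this shape is sketched below. `poll_until` is a hypothetical helper, not the rig's function: the probe closure stands in for "run the search and count hits", and the `done` predicate expresses either the visible case (`n >= 1`) or the absent case (`n == 0`).

```rust
use std::time::{Duration, Instant};

/// Call `probe` until `done(count)` holds or `budget` elapses.
/// Returns the observed latency on success, `None` on timeout.
pub fn poll_until(
    mut probe: impl FnMut() -> usize,
    done: impl Fn(usize) -> bool,
    budget: Duration,
    interval: Duration,
) -> Option<Duration> {
    let start = Instant::now();
    loop {
        if done(probe()) {
            return Some(start.elapsed());
        }
        if start.elapsed() >= budget {
            // Budget exhausted: report a failure instead of a guessed settle time.
            return None;
        }
        std::thread::sleep(interval);
    }
}
```

Returning the elapsed time is what lets a report show real apply-to-searchable latency rather than a fixed sleep constant.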
The daemon clones the whole DriveCompactIndex before each apply (lock-free COW snapshot for readers). That deep-copied the immutable inverted indexes; the trigram CSR alone is ~hundreds of MB on a multi-million-record drive.
Make trigram / children / ext_index `Arc<…>`. The apply path never mutates them in place (it overlays on the delta and only ever *replaces* the whole index at compaction/rebuild), so Arc + replace-the-pointer is a perfect fit: the per-apply clone now pointer-clones these bases (a refcount bump) and only deep-copies records + names + the small delta. Read sites are unchanged: Arc derefs transparently through `.search()` / `.get_posting()` / `&drive.children`.
- compact.rs: field types → Arc; compact_base wraps the refold in Arc::new.
- rebuild.rs: the per-apply children/ext rebuilds wrap in Arc::new.
- builder.rs / compact_cache.rs / fixtures: construction sites wrap in Arc::new.
- New code uses alloc::sync::Arc (workspace lint convention); each touched file keeps its existing Arc import.
Expect `clone_us` to drop materially on the WIN baseline (the CSR portion of the ~135ms clone). 868/868 uffs-core + 333/333 daemon green; clippy -D warnings, rustdoc, lint-prod, lint-ci-windows, file-size all clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rebuild)
Drop the ~58 ms per-apply ExtensionIndex rebuild. `--ext` queries now read through DriveCompactIndex::records_with_ext (base ∪ delta):
- records_with_ext(ext_id) -> Cow<[u32]>: zero-alloc borrow of the base CSR slice when delta is None; otherwise merges base + delta postings and validates each candidate against the live records (keep iff records[idx].extension_id == ext_id && name_len != 0). That records check is what makes a renamed extension (foo.log -> foo.pdf) and a delete correct WITHOUT an ext tombstone: a stale base posting just fails the check.
- apply_trigram_delta renamed apply_index_delta; it now always adds the ext + children postings (only the trigram postings stay gated on name >= 3 chars), so a short-named create/rename is never missed by --ext.
- compact_base refolds the ext base too; rebuild.rs drops the ext rebuild (ext_us now ~0 in IDXDELTA-TIMING).
- Migrate the 3 ext readers (path_sorted / numeric / path_only top-N) to records_with_ext; the 3 post-apply ext unit tests assert through it.
Oracle extended: records_with_ext through the overlay equals the compacted rebuild for every ext id, across create / rename / delete. 868/868 core + 333/333 daemon; clippy -D warnings, rustdoc, lint-prod, file-size all clean.
Children stays full-rebuilt; Phase 4b moves it onto the overlay (higher care: it feeds FastPathResolver + the Phase-1 subtree walk).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
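The borrow-or-merge behaviour and the records check can be illustrated with a free-function sketch. The `Rec` struct and this signature are assumptions (the real method lives on DriveCompactIndex), and the sketch uses sort + dedup where the commit describes a merge, purely for brevity.

```rust
use std::borrow::Cow;

/// Hypothetical slice of a record: only the two fields the check needs.
pub struct Rec {
    pub extension_id: u16,
    pub name_len: u16, // 0 marks a deleted slot in this sketch
}

/// Base posting for `ext_id`, overlaid with the delta posting when present.
pub fn records_with_ext<'a>(
    base: &'a [u32],
    delta: Option<&[u32]>,
    records: &[Rec],
    ext_id: u16,
) -> Cow<'a, [u32]> {
    // No delta: hand back the base slice without allocating.
    let Some(delta) = delta else { return Cow::Borrowed(base) };

    let mut merged: Vec<u32> = base.iter().chain(delta).copied().collect();
    merged.sort_unstable();
    merged.dedup();
    // Validate against the live records: a stale base posting (renamed to a
    // different extension, or deleted) simply fails this check.
    merged.retain(|&i| {
        let r = &records[i as usize];
        r.extension_id == ext_id && r.name_len != 0
    });
    Cow::Owned(merged)
}
```

Because every candidate is re-checked against the records, no extension tombstone set is needed in this scheme.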
…ren rebuild)
Drop the ~60 ms per-apply ChildrenIndex rebuild, the last of the four per-apply CSR rebuilds. Tree search + path resolution + the Phase-1 subtree walk now read child adjacency through the base ∪ delta overlay:
- for_each_child(parent, visit): zero-alloc primitive. Iterates the base CSR directly when delta is None; else sorted-merges base ∪ delta children (delta::merge_filter) and validates each against the live records (keep iff records[c].parent_idx == parent && live). The records check makes a moved-away/deleted child correct WITHOUT a children tombstone.
- children_of(parent) -> Cow: the slice form (Cow::Borrowed, zero-alloc, when delta is None) for callers needing a list.
- compact_base refolds the children base too; rebuild.rs drops the children rebuild (children_us now ~0).
Apply path REORDERED: apply_index_delta runs BEFORE the path walk, so the Phase-1 directory-rename subtree walk sees this batch's new children (a child created inside a renamed dir in the same batch). shift_subtree_path_len is now two-pass (collect descendants over base ∪ delta, then shift path_len) to keep the records read (parent_idx filter) off the path_len write.
Migrate the ~13 children readers (tree, query top-N, daemon info) to for_each_child / children_of; the post-apply children unit tests read through children_of.
Oracles: children_of overlay == compacted rebuild for every parent across a file MOVE between directories + create + delete; and a directory rename WITH a same-batch child create yields byte-identical path_len to a full rebuild (the ordering guard). 870/870 core + 333/333 daemon; clippy/rustdoc/lint-prod/lint-ci-windows/file-size all clean.
This completes Phase 4: all four per-apply CSR rebuilds (paths, trigram, ext, children) are now incremental/overlay-served.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Measures the base ∪ delta overlay read cost directly, the Phase-4 concern: does children_of / records_with_ext regress tree search / --ext under churn? Each subject benched delta=None (compacted, Cow::Borrowed) vs churned (delta=Some populated by a real ~40k-create apply), plus for_each_child (zero-alloc) vs children_of (Cow).
Baseline (~500k records, ~40k-change delta):
  children_of (one dir)           2.1 ns -> 631 ns churned
  for_each_child (churned)        354 ns
  records_with_ext (hot ~100k)    1.8 ns -> 295 us churned
  tree walk (2000 dirs/~500k)     667 us -> 2.09 ms churned
Verdict: overhead is real but small in absolute terms. A whole-tree walk stays ~2 ms under peak churn, and the records_with_ext tax on a hot extension is dwarfed by downstream path-resolution of its ~100k results. So Phase 4's overlay does not need the zero-alloc fix now; for_each_child (~1.8x faster than children_of churned) is the ready lever if a future workload makes it bite. Committed as a durable regression guard for the overlay read cost.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… CPU-bounded)
Now that an apply costs ~200 ms (Phases 1-4 made paths/trigram/ext/children incremental), retire the 30 s fixed rate-limit on the per-shard apply trigger and replace it with the file-watcher pattern: debounce + coalesce + max-wait.
- ApplyTrigger: was a single rate-limit (apply at most once per 30 s). Now two knobs: DEFAULT_APPLY_DEBOUNCE_MS (250 ms settle) and DEFAULT_APPLY_INTERVAL_MS (2 s max-wait). It tracks first_change_at (max-wait clock) + last_change_at (settle clock); evaluate fires when the burst SETTLED (quiet >= debounce, the snappy idle->active path) OR the run aged past max_wait (the CPU cap under sustained churn). record() drops its event-count arg (timing only).
- process_tick: evaluate the cadences EVERY poll, not only change-ticks. The settle apply and the age-based save must be able to fire on the first QUIET tick after a burst ends. Both evaluations are cheap no-ops when idle.
- JournalLoopConfig gains apply_debounce + UFFS_USN_APPLY_DEBOUNCE_MS override.
Regimes: idle -> 0 work; one saved file -> searchable in <1 s (debounce); a finished unzip -> one coalesced apply; the live C: (constant churn) -> capped at one ~200 ms apply / 2 s (~10% of a core) instead of thrashing.
Decomposition: the Phase-5 additions tipped journal_loop.rs to 810 LOC, so the poll scheduling + backoff cluster (wait_for_next_tick, PollBackoff, poll_blocking, log_poll_failure, MAX_POLL_BACKOFF) is extracted to a cohesive journal_loop/poll.rs (full docs preserved); journal_loop.rs 810 -> 651, poll.rs 178, both under the 800 policy with no exception.
idx-delta-verify rig pins the new knobs (max-wait 2 s, debounce 250 ms). The 5 ApplyTrigger unit tests rewritten for the settle/cap semantics (+ a max-wait-cap test); all journal_loop config fixtures carry the new field. 334/334 daemon; clippy/rustdoc/file-size clean. Design-doc tracking marks Phases 1-5 done.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
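The settle-or-cap rule can be sketched as a small state machine. `ApplyTriggerSketch` is a hypothetical reduction of the real ApplyTrigger: it takes the current time as a parameter so it can be driven with synthetic instants, and it assumes the trigger resets both clocks whenever it fires.

```rust
use std::time::{Duration, Instant};

pub struct ApplyTriggerSketch {
    debounce: Duration,
    max_wait: Duration,
    first_change_at: Option<Instant>, // max-wait clock
    last_change_at: Option<Instant>,  // settle clock
}

impl ApplyTriggerSketch {
    pub fn new(debounce: Duration, max_wait: Duration) -> Self {
        Self { debounce, max_wait, first_change_at: None, last_change_at: None }
    }

    /// Note that a change arrived at `now` (timing only, no event count).
    pub fn record(&mut self, now: Instant) {
        if self.first_change_at.is_none() {
            self.first_change_at = Some(now);
        }
        self.last_change_at = Some(now);
    }

    /// True when an apply should run: the burst settled, or it has been
    /// pending longer than `max_wait`. A cheap no-op while idle.
    pub fn evaluate(&mut self, now: Instant) -> bool {
        let (Some(first), Some(last)) = (self.first_change_at, self.last_change_at) else {
            return false;
        };
        let settled = now.duration_since(last) >= self.debounce;
        let aged = now.duration_since(first) >= self.max_wait;
        if settled || aged {
            self.first_change_at = None;
            self.last_change_at = None;
            true
        } else {
            false
        }
    }
}
```

With a 250 ms debounce and a 2 s max-wait, a single change fires on the first evaluation at least 250 ms later, while a stream of changes every 100 ms never settles and fires once the oldest pending change is 2 s old.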
Strip the temporary IDXDELTA dev scaffolding now that the incremental-index-maintenance algorithm (Phases 1-5) is WIN-validated, keeping the two pieces worth keeping as permanent facilities.
* Build stamp: fold the git SHA into the existing `uffsd starting` banner (build.rs de-branded, kept) instead of a separate IDXDELTA marker.
* Per-apply timing: replace the per-step IDXDELTA-TIMING lines (and the whole-body clone marker) with a single `usn apply: batch applied` DEBUG summary in compact_loader/rebuild.rs (changes/created/deleted/renamed/skipped/records/ext_index_entries/compacted/apply_us).
* Perf guard: add crates/uffs-core/benches/apply_cost.rs, a cross-platform Criterion bench timing the apply alone (creates/256, creates/4000, mixed/4000, deletes/4000) against a ~500k-record fixture. Pairs with the existing overlay_read.rs to lock in O(changed) per-apply cost.
* Retarget scripts/windows/idx-delta-verify.rs onto the graduated logs.
* Mark the design doc Phases 1-6 complete.
WIN-validated on a live 4.3M-record drive under heavy churn (recycle-bin purge + bursts, 260k changes coalesced into 22 applies): children/ext overlays stayed 0 and paths under 0.3ms on every apply; the full O(total) trigram refold fired only on the 3 applies that crossed the 50k threshold.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve every broken intra-doc link across the 7 crates (142 in total),
and fix the gate that let them rot.
Root cause: the `rustdoc` gate ran `cargo doc` WITHOUT
`--document-private-items`, so it only validated links reachable from the
public API surface. A broken `[`crate::path::pub_crate_item`]` (or a `//!`
shortcut to a private sibling) silently rendered as dead text instead of
failing the build.
Link fixes are real targets where one exists (fully-qualified
`crate::path::Item` / cross-crate paths, type-canonical method paths,
struct re-export paths). Where the target genuinely is not a doc-build
link, the reference becomes an honest code span instead:
* `#[cfg(windows)]` items absent on the macOS/Linux doc host
* `#[cfg(test)]` test-only fakes and modules
* dev-dependency types (e.g. `tempfile::TempDir`) and env-var strings
* struct fields, lint names, and fully-private cross-module fns
Gate, fixed at every layer so it cannot regress:
* `just rustdoc` recipe gains `--document-private-items`, and is wired
into the phase1 validation workflow (it was defined but never run).
* `scripts/ci/gates.toml` manifest updated; pre-push hook regenerated.
* `.github/workflows/pr-fast.yml` docs job updated.
* `go` / `ship` ci-pipeline (`phases.rs` + `ship.rs`) updated.
All three drift detectors (gates / hooks / workflow) pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
3fad5c4 to d982737
This was referenced Jun 28, 2026
Incremental index maintenance — base + delta overlay (LSM-style)
Makes the per-poll USN journal apply scale with the number of changed
records instead of the total drive size. On a live 4.3M-record drive the
per-apply cost drops from 1367 ms → ~200 ms (−85%), and the derived
indexes (trigram / extension / children / path-lengths) update in O(changed)
with the O(total) work confined to a rare compaction.
How it works
Immutable base CSR indexes are Arc-shared; a small mutable `IndexDelta` overlay (plus a tombstone set) absorbs each batch. Search reads base ∪ delta minus tombstones; once the delta crosses a 50k threshold a compaction folds it back into fresh bases.
Phase progression (all WIN-validated)
`compute_path_lengths`
Latest WIN validation (under a recycle-bin purge storm)
4.3M-record drive, 260k changes coalesced into 22 applies:
- children/ext overlays stayed 0 on every apply; paths < 0.3 ms.
- The full O(total) trigram refold fired only on the 3 applies that crossed the 50k threshold.
- rename visible in 0.0 s, delete leaves cleanly.
Phase 6 + doc hygiene (this PR's tail)
- Stripped the `IDXDELTA` dev scaffolding; graduated the git build-stamp into the `uffsd starting` banner and the per-apply timing into a single `usn apply: batch applied` DEBUG summary.
- `crates/uffs-core/benches/apply_cost.rs`: the committed cross-platform per-apply perf guard.
- Resolved the 142 broken intra-doc links across the 7 crates. Root cause: the rustdoc gate ran without `--document-private-items`, so it never validated the private surface. The gate now carries the flag at every layer (justfile, gates.toml + regenerated hook, pr-fast.yml, go/ship pipeline).
Design + tracking: `docs/architecture/incremental-index-maintenance.md`.
🤖 Generated with Claude Code