perf(rvf,rvm): HNSW query path, RaBitQ, contiguous slab, witness v2, mincut wiring + security hardening #555
Merged
Equal-distance vectors were selected and ordered by HashMap iteration order, which changes across process restarts and made query results non-reproducible (flaky smoke_rvlite_adapter_persistence). Break ties by vector id in both the top-k heap eviction and the final sort, in query() and query_with_envelope(). https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
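The tie-breaking rule described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the `Candidate` type and `top_k` function are hypothetical names, and the only assumption is that results are ordered by the pair (distance, id) in both heap eviction and the final sort.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical candidate type: ordered by (distance, id), so the id breaks ties.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    distance: f32,
    id: u64,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp gives a total order over f32, so no unwrap on partial_cmp.
        self.distance
            .total_cmp(&other.distance)
            .then_with(|| self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Return the k nearest candidates, sorted ascending by (distance, id).
/// The result does not depend on the order in which `scored` is iterated.
fn top_k(scored: impl IntoIterator<Item = (u64, f32)>, k: usize) -> Vec<(u64, f32)> {
    // Max-heap: the root is the current worst, i.e. the largest (distance, id).
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(k + 1);
    for (id, distance) in scored {
        heap.push(Candidate { distance, id });
        if heap.len() > k {
            heap.pop(); // evict the worst; ties are evicted by larger id first
        }
    }
    let mut out = heap.into_vec();
    out.sort(); // final sort uses the same (distance, id) order
    out.into_iter().map(|c| (c.id, c.distance)).collect()
}
```

Because the comparison is a total order over (distance, id), feeding the same set of scored vectors in any iteration order yields the same result, which is the property the flaky test needed.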
…fest discovery

rvf-index:
- Cache SIMD distance-kernel dispatch in a OnceLock function-pointer table instead of re-running is_x86_feature_detected! on every call
- Rewrite HNSW search_layer with BinaryHeap min/max-heaps (was sorted Vec + O(n) mid-inserts) and a dense Vec<bool> visited bitmap (was a per-call SipHash HashSet); deterministic (distance, id) tie-breaking

rvf-runtime:
- Replace per-bit CRC32 loops with crc32fast (same IEEE polynomial, byte-identical hashes, ~100x faster) on segment write and verify
- Hoist cosine query-norm computation out of the per-vector scan loop
- Safety-net scan: single pass with HashSet membership (was O(k*N*neighbors) with Vec::contains)
- Bulk little-endian f32 serialization in write_vec_seg (one memcpy per vector instead of per-element appends)
- Progressively widen the manifest tail scan (64KB -> 1MB -> 16MB -> whole file): stores with large segment directories were becoming unreadable once the latest manifest fell outside the fixed 64KB window; with regression test

rvf-quant:
- encode_quant_seg now emits fully decodable payloads (delegates to the real scalar/product encoders; placeholders removed)
- decode_quant_seg returns Result instead of panicking on malformed or unknown-type payloads; round-trip and malformed-input tests

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
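The cached-dispatch idea in the rvf-index item above might look roughly like this. It is a sketch under assumptions: `DistanceFn`, `select_kernel`, and `l2_squared` are hypothetical names, and a scalar kernel stands in for the real SIMD kernels, which are not shown in this PR text.

```rust
use std::sync::OnceLock;

/// Hypothetical kernel signature; the real table in rvf-index may differ.
type DistanceFn = fn(&[f32], &[f32]) -> f32;

/// Portable fallback kernel: squared Euclidean distance.
fn l2_squared_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Runs CPU feature detection. Intended to execute once per process.
fn select_kernel() -> DistanceFn {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Assumption: a real build would return an AVX2 kernel here.
            // The scalar kernel is a stand-in so the sketch stays safe code.
            return l2_squared_scalar;
        }
    }
    l2_squared_scalar
}

/// The selected function pointer, initialised on first use.
static KERNEL: OnceLock<DistanceFn> = OnceLock::new();

/// Hot-path entry point: one atomic load plus an indirect call,
/// instead of feature detection on every distance computation.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    let kernel = *KERNEL.get_or_init(select_kernel);
    kernel(a, b)
}
```

`OnceLock::get_or_init` guarantees the selector runs at most once even under concurrent first calls, so later calls only pay for reading the stored pointer.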
…ap, sched hot paths

rvm-witness (security-critical):
- The chain hash covered only (prev_hash, sequence); record content (action, actor, target, payload, timestamp) was never hashed, so verify_chain accepted arbitrarily rewritten history. record_hash is now computed over the 44 content bytes (as its doc always claimed) and the chain binds it: H(prev || seq || record_hash). verify_chain recomputes content hashes; tamper-regression tests added.
- HMAC signer keys the Mac template once at construction instead of re-running the key schedule per record (fixed-vector test pins signature bytes)
- Witness ring overflow is now observable: total_overwritten counter and needs_drain() accessor

rvm-cap:
- Nonce replay window: colliding nonces (A + k*4096) could evict and re-admit nonce A. Replaced the two 32KB arrays with one 32KB open-addressed table (8-probe bounded); eviction raises the watermark so it fails closed. Regression test included.

rvm-coherence:
- internal_weight: O(MAX_EDGES) self-loop scan replaced with O(1) adj_matrix[i][i] read (invariant verified across all mutation paths)
- Skip ticks return a cached CoherenceDecision instead of re-running the O(n^2) merge-pair pass over stale data; zero-weight pairs skipped
- Mincut: scratch buffers moved into the long-lived bridge (~17KB less stack per call), in-place Stoer-Wagner (no working copy), bitmask membership, column-scan in-neighbors
- Compile-time guard: CoherenceGraph MAX_NODES > ADJ_DIM now fails to compile instead of panicking at the 33rd node; u64 weight deltas clamped at the engine boundary

rvm-coherence/rvm-partition:
- Single-slot hash indexes (id_to_node, edge_index) degraded to permanent O(N) scans after any collision; both now use bounded linear probing with tombstones and probe-proven absence

rvm-sched:
- enqueue() rejects the HYPERVISOR sentinel id, which previously wedged a run-queue slot permanently; defensive cleanup in switch_next

Tests: 733 workspace + 67 rvm-kernel lib pass (baseline 712); 23 new tests including tamper-evidence and collision regressions.

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
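To illustrate why binding record_hash into the chain matters, here is a small sketch of H(prev || seq || record_hash). It assumes a hypothetical `Record` layout and uses 64-bit FNV-1a purely as a stand-in hash; FNV-1a is not cryptographic and is not what rvm-witness uses.

```rust
/// Stand-in hash (64-bit FNV-1a). NOT cryptographic; illustration only.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h = (h ^ b as u64).wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hypothetical record: 44 content bytes plus the two stored hashes.
struct Record {
    sequence: u64,
    content: [u8; 44],
    record_hash: u64,
    chain_hash: u64,
}

/// H(prev || seq || record_hash): the chain link binds the content hash.
fn chain_link(prev: u64, sequence: u64, record_hash: u64) -> u64 {
    let mut buf = [0u8; 24];
    buf[..8].copy_from_slice(&prev.to_le_bytes());
    buf[8..16].copy_from_slice(&sequence.to_le_bytes());
    buf[16..].copy_from_slice(&record_hash.to_le_bytes());
    fnv1a(&buf)
}

fn append(log: &mut Vec<Record>, content: [u8; 44]) {
    let prev = log.last().map_or(0, |r| r.chain_hash);
    let sequence = log.len() as u64;
    let record_hash = fnv1a(&content);
    let chain_hash = chain_link(prev, sequence, record_hash);
    log.push(Record { sequence, content, record_hash, chain_hash });
}

/// Recomputes the content hash of every record, then checks each chain link.
fn verify_chain(log: &[Record]) -> bool {
    let mut prev = 0u64;
    for (i, r) in log.iter().enumerate() {
        let record_hash = fnv1a(&r.content);
        if r.sequence != i as u64
            || r.record_hash != record_hash
            || r.chain_hash != chain_link(prev, r.sequence, record_hash)
        {
            return false;
        }
        prev = r.chain_hash;
    }
    true
}
```

If the link were computed over (prev, sequence) alone, flipping a byte of `content` would leave every stored hash consistent and verification would still pass; recomputing the content hash inside `verify_chain` is what makes the edit detectable.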
RvfStore::query was a brute-force O(N*dim) scan; the rvf-index crate was unused by production queries and QualityEnvelope.evidence fabricated layer_a=true. The index is now built lazily on first eligible query, maintained incrementally on ingest, persisted on close() via the existing INDEX_SEG codec (with a versioned, backward-readable trailer for the sparse-id mapping), and validated-or-rebuilt on open. Exact scan remains for small stores (<1024), filtered/COW/membership queries, >25% deleted, and force_exact; deterministic (distance, id) tie-breaking preserved on both paths. evidence.layer_a is now set only when the index served the query. Measured: 21.7ms -> 1.51ms per query at 100k x 64-dim (criterion, release), recall@10 = 0.968 at the ef_search=256 floor (>=0.95 gated by test). +15 tests (recall, index persistence round-trip, evidence honesty, fallback routing, compaction/overwrite invalidation). Co-Authored-By: claude-flow <ruv@ruv.net>
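The fallback routing listed above can be sketched as a single decision function. The `QueryContext` and `Route` types are hypothetical, and two details are assumptions not stated in the commit: that the 1024 threshold counts live vectors, and that the deleted fraction is measured against live plus deleted.

```rust
/// Stores smaller than this are served by exact scan (threshold from the commit text).
const EXACT_SCAN_BELOW: usize = 1024;

/// Hypothetical summary of the inputs the router looks at.
struct QueryContext {
    live_vectors: usize,
    deleted_vectors: usize,
    has_filter: bool,
    cow_or_membership: bool,
    force_exact: bool,
}

#[derive(Debug, PartialEq)]
enum Route {
    ExactScan,
    Index,
}

fn choose_route(ctx: &QueryContext) -> Route {
    if ctx.force_exact || ctx.has_filter || ctx.cow_or_membership {
        return Route::ExactScan;
    }
    if ctx.live_vectors < EXACT_SCAN_BELOW {
        return Route::ExactScan;
    }
    // More than 25% deleted: deleted / total > 1/4, kept in integer arithmetic.
    let total = ctx.live_vectors + ctx.deleted_vectors;
    if ctx.deleted_vectors * 4 > total {
        return Route::ExactScan;
    }
    Route::Index
}

/// evidence.layer_a should be true only when the index actually served the query.
fn layer_a_evidence(route: &Route) -> bool {
    matches!(route, Route::Index)
}
```

Deriving the evidence flag from the route actually taken, rather than setting it unconditionally, is the "evidence honesty" property the new tests check.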
…e sealing

v1 records folded chain links to 32 bits and left the head unanchorable. The 96-byte v2 record embeds the predecessor MAC full-width and chains via one keyed-BLAKE3 compression per append (~112ns measured, 9x under the 1us target); keyed MACs detect last-record tampering and unkeyed forgery, which v1 could not.

Segment sealing accumulates record MACs into a domain-separated Merkle tree (256/segment) sealed with one signature via the existing signer infra (HMAC/dual-HMAC/Ed25519/TEE), with inclusion proofs; expensive crypto moves off the per-record path and roots are externally anchorable.

v1 logs still verify (version-byte dispatch; v1 only as prefix, head anchored into the first v2 record); v1 writing is frozen. blake3 added as pure-Rust no_std. +46 tests covering content/reorder/truncation/wrong-key/forgery tamper modes, v1 compat, proofs, seals, and mixed logs.

Co-Authored-By: claude-flow <ruv@ruv.net>
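A domain-separated Merkle tree with inclusion proofs, as described for segment sealing, can be sketched like this. Everything here is illustrative: MACs are modelled as `u64`, the hash is a non-cryptographic FNV-1a stand-in (the commit uses keyed BLAKE3), the domain byte values are made up, and promoting an unpaired node unchanged is one possible convention that the source does not specify.

```rust
const LEAF_DOMAIN: u8 = 0x00; // hypothetical domain tags
const NODE_DOMAIN: u8 = 0x01;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Stand-in hash: FNV-1a over a domain byte followed by the words. NOT cryptographic.
fn toy_hash(domain: u8, words: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    h = (h ^ domain as u64).wrapping_mul(FNV_PRIME);
    for w in words {
        for b in w.to_le_bytes() {
            h = (h ^ b as u64).wrapping_mul(FNV_PRIME);
        }
    }
    h
}

/// Hash pairs into parents; an unpaired last node is promoted unchanged.
fn next_level(level: &[u64]) -> Vec<u64> {
    level
        .chunks(2)
        .map(|pair| {
            if pair.len() == 2 {
                toy_hash(NODE_DOMAIN, &[pair[0], pair[1]])
            } else {
                pair[0]
            }
        })
        .collect()
}

fn leaves(macs: &[u64]) -> Vec<u64> {
    macs.iter().map(|m| toy_hash(LEAF_DOMAIN, &[*m])).collect()
}

fn merkle_root(macs: &[u64]) -> Option<u64> {
    let mut level = leaves(macs);
    if level.is_empty() {
        return None;
    }
    while level.len() > 1 {
        level = next_level(&level);
    }
    Some(level[0])
}

/// Proof entries are (sibling hash, sibling_is_left), from leaf level upward.
fn inclusion_proof(macs: &[u64], index: usize) -> Option<Vec<(u64, bool)>> {
    if index >= macs.len() {
        return None;
    }
    let (mut level, mut idx, mut proof) = (leaves(macs), index, Vec::new());
    while level.len() > 1 {
        let sibling = idx ^ 1;
        if sibling < level.len() {
            proof.push((level[sibling], sibling < idx));
        }
        level = next_level(&level);
        idx /= 2;
    }
    Some(proof)
}

fn verify_inclusion(mac: u64, proof: &[(u64, bool)], root: u64) -> bool {
    let mut h = toy_hash(LEAF_DOMAIN, &[mac]);
    for &(sibling, sibling_is_left) in proof {
        h = if sibling_is_left {
            toy_hash(NODE_DOMAIN, &[sibling, h])
        } else {
            toy_hash(NODE_DOMAIN, &[h, sibling])
        };
    }
    h == root
}
```

The separate leaf and node domains prevent an interior node from being presented as a leaf, and only the root needs a signature, which is how the expensive operation leaves the per-record path.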
…claim

execute_split previously created an empty child and ignored the computed cut. It now resolves the boundary from a cached epoch SplitPlan (or computes on demand) and re-homes move-side neighbors to the child with their edge weights. Two-tier decisions: exact Stoer-Wagner mincut runs as a pressure-triggered epoch task; a new Fennel placer (O(degree), fixed-point gamma=1.5, no_std) handles hot-path placement. Split policy combines pressure and cut quality: mid-band (8000-9500bp) splits only on a cut with conductance <= 5000bp; critical pressure stays an unconditional safety valve.

The sub-10us partition-switch claim was a stub certified by a no-op bench (~6ns) reported as 1600x faster than target. The real path needs EL2 assembly the crate forbids; instead the measurable register save/restore lower bound is implemented and benchmarked, the bench is renamed partition_switch_validation_stub with an honesty gate, a canary test fails if HARDWARE_SWITCH_IMPLEMENTED flips without revisiting the claim, and the README row now reads: not validated. +30 tests.

Co-Authored-By: claude-flow <ruv@ruv.net>
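The pressure-plus-conductance split policy can be written down as a small pure function. This is a sketch with hypothetical names; the band edges come from the commit text, but whether each edge is inclusive is an assumption, as is the use of the standard conductance definition cut / min(vol(A), vol(B)) expressed in basis points.

```rust
const MID_BAND_LOW_BP: u32 = 8_000;
const MID_BAND_HIGH_BP: u32 = 9_500;
const MAX_CONDUCTANCE_BP: u32 = 5_000;

#[derive(Debug, PartialEq)]
enum SplitDecision {
    NoSplit,
    /// Mid-band pressure and the cut is good enough.
    SplitOnCut,
    /// Critical pressure: split regardless of cut quality.
    SplitUnconditional,
}

/// Conductance in basis points, assuming cut / min(vol_a, vol_b).
/// Returns None when one side has zero volume.
fn conductance_bp(cut_weight: u64, vol_a: u64, vol_b: u64) -> Option<u32> {
    let denom = vol_a.min(vol_b);
    if denom == 0 {
        return None;
    }
    // u128 keeps the multiplication from overflowing for any u64 inputs.
    let bp = (cut_weight as u128 * 10_000) / denom as u128;
    Some(bp.min(u32::MAX as u128) as u32)
}

fn decide_split(pressure_bp: u32, cut_conductance_bp: Option<u32>) -> SplitDecision {
    if pressure_bp > MID_BAND_HIGH_BP {
        return SplitDecision::SplitUnconditional;
    }
    if pressure_bp >= MID_BAND_LOW_BP {
        if let Some(c) = cut_conductance_bp {
            if c <= MAX_CONDUCTANCE_BP {
                return SplitDecision::SplitOnCut;
            }
        }
    }
    SplitDecision::NoSplit
}
```

Keeping the critical band unconditional means a partition under extreme pressure is never blocked by a missing or poor cut, while mid-band splits are deferred until the epoch mincut finds a boundary worth cutting along.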
rvf-quant gains a RaBitQ-style codec: global-centroid centering, 3-round seeded randomized-Hadamard rotation (orthonormal, reproducible from a stored u64 seed), 1-bit sign codes with per-vector norm/dot-correction scalars, and an asymmetric full-precision-query estimator. QUANT_SEG adds versioned type tag 4 (legacy payloads byte-frozen and still decode; unknown versions rejected; decode stays panic-free on untrusted bytes). Query path: opt-in two-stage search (QueryOptions::rabitq, default off) — estimator scan with oversampling (640-candidate floor) then exact f32 rescore; deterministic (distance, id) tie-breaking; falls back to default routing for filtered/COW/IP/cosine queries. Measured recall@10 = 0.972 vs exact on 10k x 128 (gate >= 0.95, test-enforced); code-only compression exactly 32x. rvf-index: Vamana-style robust prune (alpha = 1.2, occluded backfill) at insert and prune time; recall@10 at ef=30 improved 0.986 -> 0.996; construction determinism preserved. +42 tests (1254 passing, no new failures). Co-Authored-By: claude-flow <ruv@ruv.net>
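The "code-only compression exactly 32x" figure follows from replacing each 32-bit float with one sign bit. The sketch below shows only that step, with hypothetical function names; it omits the randomized-Hadamard rotation, the per-vector correction scalars, and the estimator, so it is not the codec from the commit.

```rust
/// Pack one sign bit per dimension of (vector - centroid) into u64 words.
/// Bit i is set when the centered component is non-negative.
fn sign_code(vector: &[f32], centroid: &[f32]) -> Vec<u64> {
    assert_eq!(vector.len(), centroid.len());
    let mut code = vec![0u64; (vector.len() + 63) / 64];
    for (i, (v, c)) in vector.iter().zip(centroid).enumerate() {
        if v - c >= 0.0 {
            code[i / 64] |= 1u64 << (i % 64);
        }
    }
    code
}

/// Ratio of raw f32 bytes to packed sign-code bytes.
/// Exactly 32 when the dimension is a multiple of 64.
fn code_compression_ratio(dim: usize) -> f64 {
    let raw_bytes = dim * std::mem::size_of::<f32>();
    let code_bytes = ((dim + 63) / 64) * std::mem::size_of::<u64>();
    raw_bytes as f64 / code_bytes as f64
}
```

For the 128-dimensional case quoted above, a vector takes 512 bytes as f32 and 16 bytes as a sign code, giving 32x; the per-vector norm and dot-correction scalars are extra, which is why the figure is qualified as "code-only".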
An adversarial audit confirmed a crafted .rvf could panic or OOM the process on RvfStore::open(): unvalidated length fields drove Vec::with_capacity before any byte-availability check. decode_payload now bounds id_count by available delta bytes (u64 compare before the usize cast, so 32-bit truncation cannot bypass it); decode_index_seg bounds restart_count/layer_count/neighbor_count by remaining bytes and rejects truncated restart padding (was a reachable slice panic); decode_sketch_seg converts from assert-and-panic to Result with width/depth validated via checked_mul (closes the width=0 + depth=u32::MAX bypass); decode_product size products use checked u64 arithmetic so 32-bit (wasm32) targets cannot wrap usize and read out of bounds. +8 adversarial regression tests. Co-Authored-By: claude-flow <ruv@ruv.net>
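The validate-before-allocate pattern described above can be sketched on a toy payload. The layout here (an 8-byte little-endian count followed by 8-byte ids) and the names `decode_ids` and `DecodeError` are hypothetical and do not reflect the real segment formats; the point is the order of operations.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    CountExceedsPayload,
}

const HEADER_LEN: usize = 8;
const ID_LEN: usize = 8;

fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Decode a hypothetical payload: u64 id_count, then id_count little-endian u64 ids.
fn decode_ids(payload: &[u8]) -> Result<Vec<u64>, DecodeError> {
    let header = payload.get(..HEADER_LEN).ok_or(DecodeError::Truncated)?;
    let id_count = read_u64_le(header);

    // Bound the untrusted count by the bytes actually present. The comparison
    // is done in u64 BEFORE any cast to usize, so a huge count cannot be
    // truncated into a small, plausible-looking value on 32-bit targets.
    let available = (payload.len() - HEADER_LEN) as u64;
    if id_count > available / ID_LEN as u64 {
        return Err(DecodeError::CountExceedsPayload);
    }

    // Safe now: n * ID_LEN <= available <= usize::MAX.
    let n = id_count as usize;
    let mut ids = Vec::with_capacity(n);
    let body = &payload[HEADER_LEN..HEADER_LEN + n * ID_LEN];
    for chunk in body.chunks_exact(ID_LEN) {
        ids.push(read_u64_le(chunk));
    }
    Ok(ids)
}
```

With the check placed after `Vec::with_capacity` instead, a 16-byte file claiming `u64::MAX` ids would request an enormous allocation before any error path ran, which is the crash the audit reproduced.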
…d hashing

Vector storage moves from HashMap<u64, Vec<f32>> to a contiguous row-major slab (id->ordinal map, tombstoned deletes, slot reuse only via compaction); HNSW/RaBitQ paths read rows as zero-copy slices and iteration is ordinal-ordered (deterministic across restarts). Brute-force query at 100k x 64: 24.5ms -> 3.8ms (~6.4x). boot() pre-sizes the slab and bulk-copies VEC_SEG payloads (no per-vector allocs): cold open 257ms -> 202ms (-21.5%). mmap deferred (CRC verify touches all bytes anyway; memmap2 not in this workspace) and documented as follow-up.

Audit finding 5: index/RaBitQ lazy builds now run with no lock held behind an AtomicBool gate (panic-safe clear-on-drop); concurrent queries fall back to exact scan and keep serving through the entire O(N log N) build. Overwrite still invalidates and unlinks the stale INDEX_SEG.

Hashing: the two identical bespoke CRC32-rotation implementations in write_path/read_path now delegate to one source of truth (hashing::legacy_content_hash); on-disk bytes unchanged. Full rvf-wire checksum-registry conformance (XXH3-128 + format-version bump + dual-accept reader) documented as the remaining delta. read_path.rs also carries the audit's checked vec-seg size arithmetic.

+11 tests; suite 1271 passing, no new failures (one pre-existing wall-clock bench assertion flakes under load, passes in isolation).

Co-Authored-By: claude-flow <ruv@ruv.net>
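The slab layout described above can be sketched as follows. `VectorSlab` and its methods are hypothetical names, and the sketch makes simplifying assumptions: inserts of an existing id are rejected rather than overwritten, and compaction is left out.

```rust
use std::collections::HashMap;

/// Row-major slab: row `ordinal` lives at data[ordinal * dim .. (ordinal + 1) * dim].
struct VectorSlab {
    dim: usize,
    data: Vec<f32>,
    ids: Vec<u64>,   // ordinal -> id
    live: Vec<bool>, // false marks a tombstoned row
    ordinal_of: HashMap<u64, usize>,
}

impl VectorSlab {
    fn new(dim: usize) -> Self {
        Self { dim, data: Vec::new(), ids: Vec::new(), live: Vec::new(), ordinal_of: HashMap::new() }
    }

    /// Append-only insert. Returns false on a dimension mismatch or duplicate id.
    fn insert(&mut self, id: u64, vector: &[f32]) -> bool {
        if vector.len() != self.dim || self.ordinal_of.contains_key(&id) {
            return false;
        }
        let ordinal = self.ids.len();
        self.data.extend_from_slice(vector);
        self.ids.push(id);
        self.live.push(true);
        self.ordinal_of.insert(id, ordinal);
        true
    }

    /// Zero-copy view of one row.
    fn row(&self, ordinal: usize) -> &[f32] {
        &self.data[ordinal * self.dim..(ordinal + 1) * self.dim]
    }

    fn get(&self, id: u64) -> Option<&[f32]> {
        self.ordinal_of.get(&id).map(|&o| self.row(o))
    }

    /// Tombstone the row; the slot is not reused until a compaction pass.
    fn delete(&mut self, id: u64) -> bool {
        match self.ordinal_of.remove(&id) {
            Some(o) => {
                self.live[o] = false;
                true
            }
            None => false,
        }
    }

    /// Iterate live rows in ordinal order, independent of HashMap iteration order.
    fn iter_live(&self) -> impl Iterator<Item = (u64, &[f32])> + '_ {
        (0..self.ids.len())
            .filter(move |&o| self.live[o])
            .map(move |o| (self.ids[o], self.row(o)))
    }
}
```

The HashMap is consulted only for id lookups; scans walk the flat `data` buffer in ordinal order, which is what gives both the cache-friendly brute-force path and the restart-stable iteration order mentioned above.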
…0.1.7, measured-benchmark READMEs

- rvf-types 0.2.0 -> 0.2.1 (QuantType::RaBitQ format extension)
- rvf-index 0.1.0 -> 0.2.0 (Vamana alpha-pruning, hardened INDEX_SEG codec)
- rvf-quant 0.1.0 -> 0.2.0 (RaBitQ codec; decode_sketch_seg now returns Result)
- rvf-runtime 0.2.0 -> 0.3.0 (HNSW query path, INDEX_SEG trailer, QueryOptions::rabitq, vector slab)
- dependent path-dep version reqs updated (cli, import, launch, node, server)
- @ruvector/rvf 0.2.0 -> 0.2.2, @ruvector/rvf-wasm 0.1.6 -> 0.1.7 (rebuilt wasm artifact, 1.89 toolchain + wasm-opt -Oz)
- READMEs: HNSW/RaBitQ/slab docs with measured numbers (Windows x64, criterion release, 100k x 64-dim); rvm witness v2 bench rows

Co-Authored-By: claude-flow <ruv@ruv.net>
The rvf-runtime 0.2 -> 0.3.0 version bump updated dependents inside the rvf workspace but missed the root-workspace consumer: ruvector-robotics pins version 0.2 alongside its path dep, which fails cargo resolution against the bumped crate (PR #555 CI: failed to select a version for the requirement rvf-runtime ^0.2). Root Cargo.lock refreshed. Co-Authored-By: claude-flow <ruv@ruv.net>
RVF (RuVector Format) and RVM optimization pass: deep review → SOTA research → implementation, with adversarial security audit and honest-claims corrections. 11 commits. The four library crates changed here are already published (rvf-types 0.2.1, rvf-index 0.2.0, rvf-quant 0.2.0, rvf-runtime 0.3.0).
RVF
- RvfStore::query was a brute-force O(N·dim) scan; the entire rvf-index crate was unused by production queries and QualityEnvelope.evidence fabricated layer_a: true. Now: lazy build, incremental ingest, INDEX_SEG persistence (versioned backward-readable trailer), validate-or-rebuild on open, honest evidence. 21.7 ms → 1.51 ms per query at 100k×64 (14.4×), recall@10 0.968.
- HashMap<u64, Vec<f32>> → flat row-major slab with zero-copy slice accessors. Brute scan 24.5 ms → 3.8 ms (6.4×), cold open −21.5%.
- decode_payload / decode_index_seg / decode_sketch_seg / decode_product (validate-before-allocate, checked arithmetic incl. wasm32), non-blocking index rebuilds (no more O(N log N) build under the query mutex), +8 adversarial regression tests. Audit found no crypto or memory-safety breaks.

RVM
- execute_split now applies the computed cut boundary (was: empty child, cut ignored); Fennel O(deg) hot-path placement + exact Stoer-Wagner as a pressure-triggered epoch task; pressure+conductance split policy.

Verification
🤖 Generated with claude-flow