perf(rvf,rvm): HNSW query path, RaBitQ, contiguous slab, witness v2, mincut wiring + security hardening by ruvnet · Pull Request #555 · ruvnet/RuVector

ruvnet · 2026-06-12T02:37:56Z

RVF (RuVector Format) and RVM optimization pass: deep review → SOTA research → implementation, with adversarial security audit and honest-claims corrections. 11 commits. The four library crates changed here are already published (rvf-types 0.2.1, rvf-index 0.2.0, rvf-quant 0.2.0, rvf-runtime 0.3.0).

RVF

HNSW wired into the runtime query path — RvfStore::query was a brute-force O(N·dim) scan; the entire rvf-index crate was unused by production queries and QualityEnvelope.evidence fabricated layer_a: true. Now: lazy build, incremental ingest, INDEX_SEG persistence (versioned backward-readable trailer), validate-or-rebuild on open, honest evidence. 21.7 ms → 1.51 ms per query at 100k×64 (14.4×), recall@10 0.968.
RaBitQ binary quantization (opt-in) — centroid centering, seeded randomized-Hadamard rotation, 1-bit codes + correction scalars, asymmetric estimator, two-stage oversample/rescore. Recall@10 0.972 at exactly 32× code compression. Plus Vamana α-pruning in HNSW neighbor selection (recall 0.986 → 0.996 at ef=30).
Contiguous vector slab — HashMap<u64, Vec<f32>> → flat row-major slab with zero-copy slice accessors. Brute scan 24.5 ms → 3.8 ms (6.4×), cold open −21.5%.
Security hardening (from an adversarial audit): crafted-file DoS fixes in decode_payload/decode_index_seg/decode_sketch_seg/decode_product (validate-before-allocate, checked arithmetic incl. wasm32), non-blocking index rebuilds (no more O(N log N) build under the query mutex), +8 adversarial regression tests. Audit found no crypto or memory-safety breaks.
Earlier fixes: deterministic (distance, id) tie-breaking, crc32fast (~100× checksums), manifest tail-scan widening, quant codec round-trip, unified content-hash implementation.

RVM

Witness v2 — 96-byte records with 128-bit keyed-BLAKE3 chain links (was a folded 32-bit hash), one compression per append (~112 ns measured), Merkle segment sealing with signed roots + inclusion proofs (CT/QMDB-style). v1 logs still verify; the v1 head is cryptographically anchored into the first v2 record.
Mincut wired into split decisions — execute_split now applies the computed cut boundary (was: empty child, cut ignored); Fennel O(deg) hot-path placement + exact Stoer-Wagner as a pressure-triggered epoch task; pressure+conductance split policy.
Honesty corrections: the "<10 µs partition switch" claim was a no-op bench (~6 ns) reported as "1600× faster than target" — now marked unimplemented with a canary test and corrected README; witness chain previously hashed only (prev_hash, sequence), now binds record content.

Verification

RVF: 1,271 passing on Windows (+38 net), 6 pre-existing env failures unchanged (verified identical on stashed baseline)
RVM: 875 passing (+75), zero failures
READMEs updated with environment-labeled measured numbers only

🤖 Generated with claude-flow

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp

Equal-distance vectors were selected and ordered by HashMap iteration order, which changes across process restarts and made query results non-reproducible (flaky smoke_rvlite_adapter_persistence). Break ties by vector id in both the top-k heap eviction and the final sort, in query() and query_with_envelope(). https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
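
The tie-breaking rule above can be sketched as a bounded max-heap keyed on `(distance, id)`. This is a minimal illustration, not the code in `RvfStore::query`; the `Candidate` type and `top_k` function are hypothetical names, and `f32::total_cmp` is assumed as the distance ordering.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical candidate type: ordered by distance first, then by vector id.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub distance: f32,
    pub id: u64,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp gives f32 a total order, so ties fall through to the id.
        self.distance
            .total_cmp(&other.distance)
            .then(self.id.cmp(&other.id))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Keep the k smallest candidates. The max-heap always evicts the largest
/// (distance, id) pair, so the survivors do not depend on input order.
pub fn top_k(items: impl IntoIterator<Item = (u64, f32)>, k: usize) -> Vec<(u64, f32)> {
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(k + 1);
    for (id, distance) in items {
        heap.push(Candidate { distance, id });
        if heap.len() > k {
            heap.pop(); // drops the current worst (distance, id)
        }
    }
    let mut out = heap.into_vec();
    out.sort(); // final sort uses the same (distance, id) order
    out.into_iter().map(|c| (c.id, c.distance)).collect()
}
```

Because both the eviction and the final sort use the same total order, feeding the candidates in any iteration order yields the same result list.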

…fest discovery

rvf-index:
- Cache SIMD distance-kernel dispatch in a OnceLock function-pointer table instead of re-running is_x86_feature_detected! on every call
- Rewrite HNSW search_layer with BinaryHeap min/max-heaps (was sorted Vec + O(n) mid-inserts) and a dense Vec<bool> visited bitmap (was a per-call SipHash HashSet); deterministic (distance, id) tie-breaking

rvf-runtime:
- Replace per-bit CRC32 loops with crc32fast (same IEEE polynomial, byte-identical hashes, ~100x faster) on segment write and verify
- Hoist cosine query-norm computation out of the per-vector scan loop
- Safety-net scan: single pass with HashSet membership (was O(k*N*neighbors) with Vec::contains)
- Bulk little-endian f32 serialization in write_vec_seg (one memcpy per vector instead of per-element appends)
- Progressively widen the manifest tail scan (64KB -> 1MB -> 16MB -> whole file): stores with large segment directories were becoming unreadable once the latest manifest fell outside the fixed 64KB window; with regression test

rvf-quant:
- encode_quant_seg now emits fully decodable payloads (delegates to the real scalar/product encoders; placeholders removed)
- decode_quant_seg returns Result instead of panicking on malformed or unknown-type payloads; round-trip and malformed-input tests

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
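
A rough sketch of the cached kernel dispatch described in the rvf-index notes above: feature detection runs once and the chosen function pointer is stored in a `OnceLock`. The kernel names are hypothetical, and the "wide" kernel here is a plain unrolled loop standing in for a real SIMD implementation.

```rust
use std::sync::OnceLock;

/// Signature shared by every distance kernel in this sketch.
type DistFn = fn(&[f32], &[f32]) -> f32;

/// Portable fallback: squared L2 distance.
fn l2_sq_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Stand-in for a SIMD kernel (assumption: a real build would use intrinsics).
#[allow(dead_code)]
fn l2_sq_unrolled(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len().min(b.len());
    let chunks = n / 4;
    let mut acc = [0.0f32; 4];
    for c in 0..chunks {
        for lane in 0..4 {
            let d = a[c * 4 + lane] - b[c * 4 + lane];
            acc[lane] += d * d;
        }
    }
    let mut sum = acc[0] + acc[1] + acc[2] + acc[3];
    for i in chunks * 4..n {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum
}

/// Runs CPU feature detection; called at most once per process.
fn detect_kernel() -> DistFn {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            return l2_sq_unrolled;
        }
    }
    l2_sq_scalar
}

static KERNEL: OnceLock<DistFn> = OnceLock::new();

/// Hot-path entry point: one atomic load plus an indirect call.
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    let kernel = *KERNEL.get_or_init(detect_kernel);
    kernel(a, b)
}
```

After the first call, `get_or_init` only reads the stored pointer, so the per-call cost of feature detection disappears from the scan loop.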

…ap, sched hot paths

rvm-witness (security-critical):
- The chain hash covered only (prev_hash, sequence); record content (action, actor, target, payload, timestamp) was never hashed, so verify_chain accepted arbitrarily rewritten history. record_hash is now computed over the 44 content bytes (as its doc always claimed) and the chain binds it: H(prev || seq || record_hash). verify_chain recomputes content hashes; tamper-regression tests added.
- HMAC signer keys the Mac template once at construction instead of re-running the key schedule per record (fixed-vector test pins signature bytes)
- Witness ring overflow is now observable: total_overwritten counter and needs_drain() accessor

rvm-cap:
- Nonce replay window: colliding nonces (A + k*4096) could evict and re-admit nonce A. Replaced the two 32KB arrays with one 32KB open-addressed table (8-probe bounded); eviction raises the watermark so it fails closed. Regression test included.

rvm-coherence:
- internal_weight: O(MAX_EDGES) self-loop scan replaced with O(1) adj_matrix[i][i] read (invariant verified across all mutation paths)
- Skip ticks return a cached CoherenceDecision instead of re-running the O(n^2) merge-pair pass over stale data; zero-weight pairs skipped
- Mincut: scratch buffers moved into the long-lived bridge (~17KB less stack per call), in-place Stoer-Wagner (no working copy), bitmask membership, column-scan in-neighbors
- Compile-time guard: CoherenceGraph MAX_NODES > ADJ_DIM now fails to compile instead of panicking at the 33rd node; u64 weight deltas clamped at the engine boundary

rvm-coherence/rvm-partition:
- Single-slot hash indexes (id_to_node, edge_index) degraded to permanent O(N) scans after any collision; both now use bounded linear probing with tombstones and probe-proven absence

rvm-sched:
- enqueue() rejects the HYPERVISOR sentinel id, which previously wedged a run-queue slot permanently; defensive cleanup in switch_next

Tests: 733 workspace + 67 rvm-kernel lib pass (baseline 712); 23 new tests including tamper-evidence and collision regressions.

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp
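
To illustrate why the chain link must bind record content, here is a small sketch of the `H(prev || seq || record_hash)` construction from the rvm-witness notes above. It is not the crate's code: `Record`, `append`, and `verify_chain` are simplified hypothetical shapes, and `std`'s `DefaultHasher` is used only as a non-cryptographic stand-in for the real hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// ASSUMPTION: DefaultHasher is a NON-cryptographic stand-in for the real hash.
fn digest(parts: &[&[u8]]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        hasher.write_u64(part.len() as u64); // length prefix avoids ambiguity
        hasher.write(part);
    }
    hasher.finish()
}

/// Simplified record: the real one carries action, actor, target, and so on.
pub struct Record {
    pub seq: u64,
    pub content: Vec<u8>,
    pub chain: u64,
}

/// Chain link = H(prev || seq || record_hash), where record_hash = H(content).
fn link(prev: u64, seq: u64, content: &[u8]) -> u64 {
    let record_hash = digest(&[content]).to_le_bytes();
    let prev_bytes = prev.to_le_bytes();
    let seq_bytes = seq.to_le_bytes();
    digest(&[&prev_bytes[..], &seq_bytes[..], &record_hash[..]])
}

pub fn append(log: &mut Vec<Record>, content: &[u8]) {
    let prev = log.last().map_or(0, |r| r.chain);
    let seq = log.len() as u64;
    log.push(Record { seq, content: content.to_vec(), chain: link(prev, seq, content) });
}

/// Recomputes every content hash; any rewritten record breaks the chain.
pub fn verify_chain(log: &[Record]) -> bool {
    let mut prev = 0u64;
    for (i, record) in log.iter().enumerate() {
        if record.seq != i as u64 || record.chain != link(prev, record.seq, &record.content) {
            return false;
        }
        prev = record.chain;
    }
    true
}
```

If `link` ignored `content`, as the old chain hash effectively did, rewriting a record's payload would leave every `chain` value valid and `verify_chain` would still return `true`.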

RvfStore::query was a brute-force O(N*dim) scan; the rvf-index crate was unused by production queries and QualityEnvelope.evidence fabricated layer_a=true. The index is now built lazily on first eligible query, maintained incrementally on ingest, persisted on close() via the existing INDEX_SEG codec (with a versioned, backward-readable trailer for the sparse-id mapping), and validated-or-rebuilt on open. Exact scan remains for small stores (<1024), filtered/COW/membership queries, >25% deleted, and force_exact; deterministic (distance, id) tie-breaking preserved on both paths. evidence.layer_a is now set only when the index served the query. Measured: 21.7ms -> 1.51ms per query at 100k x 64-dim (criterion, release), recall@10 = 0.968 at the ef_search=256 floor (>=0.95 gated by test). +15 tests (recall, index persistence round-trip, evidence honesty, fallback routing, compaction/overwrite invalidation). Co-Authored-By: claude-flow <ruv@ruv.net>
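
The fallback routing described above can be sketched as a single predicate. The struct and field names below are hypothetical, and two details are assumptions not stated in the commit message: the 1024 threshold is applied to the live vector count, and the 25% deleted ratio is measured against live plus deleted rows.

```rust
/// Which path serves a query in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    ExactScan,
    Index,
}

/// Hypothetical store counters.
pub struct StoreStats {
    pub live: usize,
    pub deleted: usize,
}

/// Hypothetical per-query flags mirroring the cases listed in the commit.
#[derive(Default)]
pub struct QueryFlags {
    pub filtered: bool,
    pub cow: bool,
    pub membership: bool,
    pub force_exact: bool,
}

/// ASSUMPTION: the "<1024" small-store cutoff counts live vectors.
const MIN_INDEXED_VECTORS: usize = 1024;

pub fn route(stats: &StoreStats, query: &QueryFlags) -> Route {
    let total = stats.live + stats.deleted;
    // deleted / total > 25%, written without floating point.
    let too_many_deleted = stats.deleted * 4 > total;
    let needs_exact = query.force_exact
        || query.filtered
        || query.cow
        || query.membership
        || stats.live < MIN_INDEXED_VECTORS
        || too_many_deleted;
    if needs_exact { Route::ExactScan } else { Route::Index }
}
```

Under this shape, `evidence.layer_a` would be set only when `route` returns `Route::Index` and the index actually answers the query.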

…e sealing

v1 records folded chain links to 32 bits and left the head unanchorable. The 96-byte v2 record embeds the predecessor MAC full-width and chains via one keyed-BLAKE3 compression per append (~112ns measured, 9x under the 1us target); keyed MACs detect last-record tampering and unkeyed forgery, which v1 could not.

Segment sealing accumulates record MACs into a domain-separated Merkle tree (256/segment) sealed with one signature via the existing signer infra (HMAC/dual-HMAC/Ed25519/TEE), with inclusion proofs; expensive crypto moves off the per-record path and roots are externally anchorable.

v1 logs still verify (version-byte dispatch; v1 only as prefix, head anchored into the first v2 record); v1 writing is frozen. blake3 added as pure-Rust no_std. +46 tests covering content/reorder/truncation/wrong-key/forgery tamper modes, v1 compat, proofs, seals, and mixed logs.

Co-Authored-By: claude-flow <ruv@ruv.net>
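
As an illustration of the segment-sealing idea above, the following sketch builds a domain-separated Merkle tree over record MACs and checks an inclusion proof. It makes several assumptions: MACs are shortened to `u64`, `DefaultHasher` is a non-cryptographic stand-in for keyed BLAKE3, an odd node is promoted unchanged, and the root signature step is omitted. All function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Domain-separation tags: a leaf hash can never be mistaken for a node hash.
const LEAF_TAG: u8 = 0x00;
const NODE_TAG: u8 = 0x01;

/// ASSUMPTION: non-cryptographic stand-in hash, for illustration only.
fn hash_tagged(tag: u8, left: u64, right: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write_u8(tag);
    hasher.write_u64(left);
    hasher.write_u64(right);
    hasher.finish()
}

fn leaf(mac: u64) -> u64 {
    hash_tagged(LEAF_TAG, mac, 0)
}

/// Returns every tree level, leaves first; the last level holds the root.
pub fn build_levels(macs: &[u64]) -> Vec<Vec<u64>> {
    let mut levels = vec![macs.iter().map(|&m| leaf(m)).collect::<Vec<u64>>()];
    while levels.last().unwrap().len() > 1 {
        let prev = levels.last().unwrap();
        let next: Vec<u64> = prev
            .chunks(2)
            .map(|pair| match pair {
                [l, r] => hash_tagged(NODE_TAG, *l, *r),
                _ => pair[0], // odd node is promoted unchanged (assumption)
            })
            .collect();
        levels.push(next);
    }
    levels
}

/// Inclusion proof: (sibling hash, sibling_is_left) from leaf level upward.
pub fn prove(levels: &[Vec<u64>], mut index: usize) -> Vec<(u64, bool)> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        let sibling = index ^ 1;
        if sibling < level.len() {
            proof.push((level[sibling], sibling < index));
        }
        index /= 2;
    }
    proof
}

/// Recomputes the path from a record MAC to the sealed root.
pub fn verify(root: u64, mac: u64, proof: &[(u64, bool)]) -> bool {
    let mut acc = leaf(mac);
    for &(sibling, sibling_is_left) in proof {
        acc = if sibling_is_left {
            hash_tagged(NODE_TAG, sibling, acc)
        } else {
            hash_tagged(NODE_TAG, acc, sibling)
        };
    }
    acc == root
}
```

Only the root needs a signature, so a verifier can check any single record with a logarithmic number of hashes instead of replaying the whole segment.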

…claim

execute_split previously created an empty child and ignored the computed cut. It now resolves the boundary from a cached epoch SplitPlan (or computes on demand) and re-homes move-side neighbors to the child with their edge weights. Two-tier decisions: exact Stoer-Wagner mincut runs as a pressure-triggered epoch task; a new Fennel placer (O(degree), fixed-point gamma=1.5, no_std) handles hot-path placement. Split policy combines pressure and cut quality: mid-band (8000-9500bp) splits only on a cut with conductance <= 5000bp; critical pressure stays an unconditional safety valve.

The sub-10us partition-switch claim was a stub certified by a no-op bench (~6ns) reported as 1600x faster than target. The real path needs EL2 assembly the crate forbids; instead the measurable register save/restore lower bound is implemented and benchmarked, the bench is renamed partition_switch_validation_stub with an honesty gate, a canary test fails if HARDWARE_SWITCH_IMPLEMENTED flips without revisiting the claim, and the README row now reads: not validated. +30 tests.

Co-Authored-By: claude-flow <ruv@ruv.net>
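
The pressure-plus-conductance policy above can be sketched as follows. The names are hypothetical, and two points are assumptions: the mid-band is treated as the inclusive range 8000 to 9500 bp with anything above 9500 bp counted as critical, and conductance uses the standard definition cut / min(vol(A), vol(B)) expressed in basis points.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SplitDecision {
    NoSplit,
    SplitOnCut,
    SplitUnconditional,
}

/// Thresholds from the commit message, in basis points (10_000 bp = 100%).
const MID_BAND_LOW_BP: u32 = 8000;
const CRITICAL_BP: u32 = 9500; // ASSUMPTION: pressure above this is "critical"
const MAX_CONDUCTANCE_BP: u32 = 5000;

/// Conductance of a cut in basis points: cut / min(vol_a, vol_b).
/// Returns None when one side has zero volume (no meaningful cut).
pub fn conductance_bp(cut_weight: u64, vol_a: u64, vol_b: u64) -> Option<u32> {
    let denom = vol_a.min(vol_b);
    if denom == 0 {
        return None;
    }
    let bp = (cut_weight as u128 * 10_000) / denom as u128;
    Some(bp.min(u32::MAX as u128) as u32)
}

pub fn decide(pressure_bp: u32, conductance: Option<u32>) -> SplitDecision {
    if pressure_bp > CRITICAL_BP {
        // Safety valve: split regardless of cut quality.
        SplitDecision::SplitUnconditional
    } else if pressure_bp >= MID_BAND_LOW_BP {
        match conductance {
            Some(c) if c <= MAX_CONDUCTANCE_BP => SplitDecision::SplitOnCut,
            _ => SplitDecision::NoSplit,
        }
    } else {
        SplitDecision::NoSplit
    }
}
```

Integer basis points keep the decision free of floating point, which suits a no_std hot path.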

rvf-quant gains a RaBitQ-style codec: global-centroid centering, 3-round seeded randomized-Hadamard rotation (orthonormal, reproducible from a stored u64 seed), 1-bit sign codes with per-vector norm/dot-correction scalars, and an asymmetric full-precision-query estimator. QUANT_SEG adds versioned type tag 4 (legacy payloads byte-frozen and still decode; unknown versions rejected; decode stays panic-free on untrusted bytes). Query path: opt-in two-stage search (QueryOptions::rabitq, default off) — estimator scan with oversampling (640-candidate floor) then exact f32 rescore; deterministic (distance, id) tie-breaking; falls back to default routing for filtered/COW/IP/cosine queries. Measured recall@10 = 0.972 vs exact on 10k x 128 (gate >= 0.95, test-enforced); code-only compression exactly 32x. rvf-index: Vamana-style robust prune (alpha = 1.2, occluded backfill) at insert and prune time; recall@10 at ef=30 improved 0.986 -> 0.996; construction determinism preserved. +42 tests (1254 passing, no new failures). Co-Authored-By: claude-flow <ruv@ruv.net>
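
To make the "exactly 32x" code compression concrete, here is a sketch of only the 1-bit sign-code stage: each centered f32 component becomes one bit. The rotation, correction scalars, and asymmetric estimator from the description above are deliberately left out, and `sign_code` and `hamming` are hypothetical names rather than rvf-quant APIs.

```rust
/// Packs the sign of each centered component into one bit (64 per word).
/// ASSUMPTION: a component equal to the centroid maps to bit 1.
pub fn sign_code(vector: &[f32], centroid: &[f32]) -> Vec<u64> {
    assert_eq!(vector.len(), centroid.len(), "dimension mismatch");
    let mut words = vec![0u64; (vector.len() + 63) / 64];
    for (i, (x, c)) in vector.iter().zip(centroid).enumerate() {
        if x - c >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Number of differing sign bits between two codes of equal length.
pub fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

For a 128-dimensional vector the raw form takes 128 × 4 = 512 bytes while the code takes two `u64` words, 16 bytes, which is the 32x ratio; the per-vector correction scalars are extra, which is why the figure is quoted as code-only.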

An adversarial audit confirmed a crafted .rvf could panic or OOM the process on RvfStore::open(): unvalidated length fields drove Vec::with_capacity before any byte-availability check. decode_payload now bounds id_count by available delta bytes (u64 compare before the usize cast, so 32-bit truncation cannot bypass it); decode_index_seg bounds restart_count/layer_count/neighbor_count by remaining bytes and rejects truncated restart padding (was a reachable slice panic); decode_sketch_seg converts from assert-and-panic to Result with width/depth validated via checked_mul (closes the width=0 + depth=u32::MAX bypass); decode_product size products use checked u64 arithmetic so 32-bit (wasm32) targets cannot wrap usize and read out of bounds. +8 adversarial regression tests. Co-Authored-By: claude-flow <ruv@ruv.net>
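
The validate-before-allocate pattern described above can be sketched on a toy payload. The layout here (a little-endian u64 count followed by that many u64 ids) is hypothetical and is not the RVF wire format; `decode_ids` and `DecodeError` are illustrative names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    Overflow,
    CountExceedsInput,
}

/// Decodes a hypothetical payload: u64 LE count, then `count` u64 LE ids.
pub fn decode_ids(buf: &[u8]) -> Result<Vec<u64>, DecodeError> {
    if buf.len() < 8 {
        return Err(DecodeError::Truncated);
    }
    let mut header = [0u8; 8];
    header.copy_from_slice(&buf[..8]);
    let count = u64::from_le_bytes(header);
    let body = &buf[8..];

    // Checked u64 arithmetic: a huge count cannot wrap into a small size.
    let needed = count.checked_mul(8).ok_or(DecodeError::Overflow)?;

    // Compare in u64 BEFORE casting to usize, so truncation on 32-bit
    // targets (for example wasm32) cannot bypass the bound.
    if needed > body.len() as u64 {
        return Err(DecodeError::CountExceedsInput);
    }

    // Safe casts: needed <= body.len(), which already fits in usize.
    let count = count as usize;
    let mut ids = Vec::with_capacity(count);
    for chunk in body[..count * 8].chunks_exact(8) {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(chunk);
        ids.push(u64::from_le_bytes(bytes));
    }
    Ok(ids)
}
```

The allocation is reached only after the length field has been proven consistent with the bytes actually present, so a crafted header can no longer drive `Vec::with_capacity` on its own.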

…d hashing

Vector storage moves from HashMap<u64, Vec<f32>> to a contiguous row-major slab (id->ordinal map, tombstoned deletes, slot reuse only via compaction); HNSW/RaBitQ paths read rows as zero-copy slices and iteration is ordinal-ordered (deterministic across restarts). Brute-force query at 100k x 64: 24.5ms -> 3.8ms (~6.4x). boot() pre-sizes the slab and bulk-copies VEC_SEG payloads (no per-vector allocs): cold open 257ms -> 202ms (-21.5%). mmap deferred (CRC verify touches all bytes anyway; memmap2 not in this workspace) and documented as follow-up.

Audit finding 5: index/RaBitQ lazy builds now run with no lock held behind an AtomicBool gate (panic-safe clear-on-drop); concurrent queries fall back to exact scan and keep serving through the entire O(N log N) build. Overwrite still invalidates and unlinks the stale INDEX_SEG.

Hashing: the two identical bespoke CRC32-rotation implementations in write_path/read_path now delegate to one source of truth (hashing::legacy_content_hash); on-disk bytes unchanged. Full rvf-wire checksum-registry conformance (XXH3-128 + format-version bump + dual-accept reader) documented as the remaining delta. read_path.rs also carries the audit's checked vec-seg size arithmetic.

+11 tests; suite 1271 passing, no new failures (one pre-existing wall-clock bench assertion flakes under load, passes in isolation).

Co-Authored-By: claude-flow <ruv@ruv.net>
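
A minimal sketch of the slab layout described above: one flat `Vec<f32>`, an id-to-ordinal map, tombstoned deletes, and rows handed out as borrowed slices. `VectorSlab` and its methods are hypothetical names, not the rvf-runtime types, and compaction is omitted.

```rust
use std::collections::HashMap;

/// Row-major slab: row `o` lives at data[o * dim .. (o + 1) * dim].
pub struct VectorSlab {
    dim: usize,
    data: Vec<f32>,
    ids: Vec<u64>,   // ordinal -> id
    live: Vec<bool>, // false marks a tombstoned row
    ordinal_of: HashMap<u64, usize>,
}

impl VectorSlab {
    pub fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be non-zero");
        Self { dim, data: Vec::new(), ids: Vec::new(), live: Vec::new(), ordinal_of: HashMap::new() }
    }

    /// Appends a row. Overwriting an id tombstones the old row; slots are
    /// never reused here (compaction would reclaim them).
    pub fn insert(&mut self, id: u64, vector: &[f32]) {
        assert_eq!(vector.len(), self.dim, "dimension mismatch");
        if let Some(old) = self.ordinal_of.insert(id, self.ids.len()) {
            self.live[old] = false;
        }
        self.ids.push(id);
        self.live.push(true);
        self.data.extend_from_slice(vector);
    }

    /// Zero-copy accessor: borrows the row straight out of the slab.
    pub fn get(&self, id: u64) -> Option<&[f32]> {
        let &ordinal = self.ordinal_of.get(&id)?;
        let start = ordinal * self.dim;
        Some(&self.data[start..start + self.dim])
    }

    pub fn delete(&mut self, id: u64) -> bool {
        match self.ordinal_of.remove(&id) {
            Some(ordinal) => {
                self.live[ordinal] = false;
                true
            }
            None => false,
        }
    }

    /// Total rows, including tombstones.
    pub fn row_count(&self) -> usize {
        self.ids.len()
    }

    /// Live rows in ordinal order, so iteration is stable across restarts.
    pub fn iter_live(&self) -> impl Iterator<Item = (u64, &[f32])> + '_ {
        self.data
            .chunks_exact(self.dim)
            .zip(&self.ids)
            .zip(&self.live)
            .filter(|item| *item.1)
            .map(|((row, &id), _)| (id, row))
    }
}
```

A brute-force scan over `iter_live` walks memory sequentially instead of chasing one heap allocation per vector, which is the likely source of the scan speedup reported above.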

…0.1.7, measured-benchmark READMEs

- rvf-types 0.2.0 -> 0.2.1 (QuantType::RaBitQ format extension)
- rvf-index 0.1.0 -> 0.2.0 (Vamana alpha-pruning, hardened INDEX_SEG codec)
- rvf-quant 0.1.0 -> 0.2.0 (RaBitQ codec; decode_sketch_seg now returns Result)
- rvf-runtime 0.2.0 -> 0.3.0 (HNSW query path, INDEX_SEG trailer, QueryOptions::rabitq, vector slab)
- dependent path-dep version reqs updated (cli, import, launch, node, server)
- @ruvector/rvf 0.2.0 -> 0.2.2, @ruvector/rvf-wasm 0.1.6 -> 0.1.7 (rebuilt wasm artifact, 1.89 toolchain + wasm-opt -Oz)
- READMEs: HNSW/RaBitQ/slab docs with measured numbers (Windows x64, criterion release, 100k x 64-dim); rvm witness v2 bench rows

Co-Authored-By: claude-flow <ruv@ruv.net>

The rvf-runtime 0.2 -> 0.3.0 version bump updated dependents inside the rvf workspace but missed the root-workspace consumer: ruvector-robotics pins version 0.2 alongside its path dep, which fails cargo resolution against the bumped crate (PR #555 CI: failed to select a version for the requirement rvf-runtime ^0.2). Root Cargo.lock refreshed. Co-Authored-By: claude-flow <ruv@ruv.net>

claude and others added 12 commits June 11, 2026 18:46

chore(rvf): sync Cargo.lock with rvf-wire deps (sha3, subtle)

4add67b

https://claude.ai/code/session_01C83hbozEXPgoz9iJN5Smhp

ruvnet merged commit 3e84297 into main Jun 12, 2026
72 of 75 checks passed

ruvnet merged 12 commits into main from claude/compassionate-volta-5gbbqj
