research(nightly): late-interaction-maxsim — ColBERT-style MaxSim in Rust#550

Draft
ruvnet wants to merge 3 commits into main from research/nightly/2026-06-10-late-interaction-maxsim

ruvnet (Owner) commented Jun 10, 2026

Summary

Adds nightly RuVector research for late-interaction-maxsim: the first Rust-native, trait-based ColBERT-style late interaction (MaxSim) retrieval engine.

What ships:

  • Working Rust PoC (crates/ruvector-late-interaction) — 3 variants, 20/20 tests green, release build passes
  • ADR-199 (docs/adr/ADR-199-late-interaction-maxsim.md)
  • Research document (docs/research/nightly/2026-06-10-late-interaction-maxsim/README.md)
  • SEO gist (docs/research/nightly/2026-06-10-late-interaction-maxsim/gist.md)
  • Real benchmark results (no aspirational numbers)

Crate: ruvector-late-interaction

Three variants of a common MaxSimIndex trait:

| Variant | Strategy | Mean latency | QPS | Memory (KB) | Recall@10 |
|---|---|---|---|---|---|
| BruteForceIndex | Exact O(N·T_d·T_q·D) scan | 13,494 µs | 74 | 8,000 | 1.000 (GT) |
| CompressedIndex | SQ8 i8 tokens, int8 dot products | 9,791 µs | 102 | 2,000 | 0.792 |
| PlaidLiteIndex | k-means centroid pre-filter + MaxSim | 15,262 µs | 66 | 8,016 | 0.998 |

Hardware: x86-64 Linux 6.18.5, Intel Celeron N4020, Rust 1.94.1 release.
Dataset: N=2,000 docs × 16 tokens × D=64 dims; 50 queries × 8 tokens.
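
For readers new to late interaction, here is a minimal sketch of the exact MaxSim score that the brute-force variant is described as computing: for each query token, take the best dot product against any document token, then sum those maxima. The trait shape and the names `MaxSimIndexSketch`, `BruteForceSketch`, `add`, and `search` are assumptions made for illustration; the crate's real `MaxSimIndex` trait may look different.

```rust
/// One document or query: a list of token embeddings, each of length D.
type Tokens = Vec<Vec<f32>>;

/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim(q, d): sum over query tokens of the max dot product against any doc token.
/// An empty document yields negative infinity per query token.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

/// Hypothetical trait shape (an assumption, not the crate's API).
trait MaxSimIndexSketch {
    fn add(&mut self, doc: Tokens) -> usize;
    fn search(&self, query: &[Vec<f32>], k: usize) -> Vec<(usize, f32)>;
}

#[derive(Default)]
struct BruteForceSketch {
    docs: Vec<Tokens>,
}

impl MaxSimIndexSketch for BruteForceSketch {
    /// Stores the document and returns its id (its position in the list).
    fn add(&mut self, doc: Tokens) -> usize {
        self.docs.push(doc);
        self.docs.len() - 1
    }

    /// Scores every document, which costs O(N·T_d·T_q·D), and keeps the top k.
    fn search(&self, query: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
        let mut scored: Vec<(usize, f32)> = self
            .docs
            .iter()
            .enumerate()
            .map(|(id, d)| (id, maxsim(query, d)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(k);
        scored
    }
}
```

With N=2,000, T_d=16, T_q=8, and D=64, one such query performs 2,000 × 16 × 8 = 256,000 token-pair dot products of 64 dimensions each, which is the work the exact scan has to do per query.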

cargo build --release -p ruvector-late-interaction
cargo test -p ruvector-late-interaction   # 20/20 pass
cargo run --release -p ruvector-late-interaction --bin benchmark

Why late interaction matters for RuVector (2026)

  • Qdrant v1.15+ ships multivector natively. ECIR 2026 hosted a dedicated Late Interaction Retrieval workshop. PyLate (arXiv:2508.03555) provides the training ecosystem.
  • No Rust-native open-source MaxSim engine existed before this crate.
  • Agent working memory consists of multi-turn utterances decomposable into token embeddings — MaxSim gives token-level recall that single-vector HNSW cannot.
  • CompressedIndex (2 MB for 2K×16×64 corpus) targets WASM and Cognitum Seed edge deployments.
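
The 2 MB figure above follows from storing one byte per dimension instead of four. The sketch below shows that arithmetic alongside one common way to do SQ8: symmetric max-abs quantization with one scale per vector, and an integer dot product accumulated in `i32`. The quantization scheme and the names `Sq8Vec`, `quantize`, `dot_sq8`, and `token_bytes` are assumptions; the PR does not say which scaling scheme `CompressedIndex` uses.

```rust
/// A token embedding quantized to signed 8-bit codes plus one f32 scale.
struct Sq8Vec {
    codes: Vec<i8>,
    scale: f32, // original value is approximately (code as f32) * scale
}

/// Symmetric max-abs quantization (assumed scheme, not confirmed by the PR).
fn quantize(v: &[f32]) -> Sq8Vec {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    Sq8Vec { codes, scale }
}

/// Integer dot product accumulated in i32, rescaled to an approximate f32 dot.
fn dot_sq8(a: &Sq8Vec, b: &Sq8Vec) -> f32 {
    let acc: i32 = a
        .codes
        .iter()
        .zip(&b.codes)
        .map(|(&x, &y)| (x as i32) * (y as i32))
        .sum();
    acc as f32 * a.scale * b.scale
}

/// Raw token storage in bytes, ignoring per-vector scales and index overhead.
fn token_bytes(docs: usize, tokens_per_doc: usize, dims: usize, bytes_per_value: usize) -> usize {
    docs * tokens_per_doc * dims * bytes_per_value
}
```

For the benchmark corpus, `token_bytes(2000, 16, 64, 4)` is 8,192,000 bytes (8,000 KiB) for f32 and `token_bytes(2000, 16, 64, 1)` is 2,048,000 bytes (2,000 KiB) for i8, matching the reported memory column. The rounding error introduced by `quantize` is one plausible source of the lower recall reported for the compressed variant.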

Ecosystem fit

  • Connects to: ruvector-core (HNSW), ruvector-diskann (centroid lookup future), ruvector-rairs (IVF companion), ruvector-verified (proof-gated writes future), ruvector-coherence (MaxSim recall as coherence probe), ruFlo (memory loop), MCP tools
  • Next nightly recommendation: streaming live HNSW repair after deletes (runner-up, score 4.25)

Research doc

docs/research/nightly/2026-06-10-late-interaction-maxsim/README.md

ADR

docs/adr/ADR-199-late-interaction-maxsim.md

Public gist

docs/research/nightly/2026-06-10-late-interaction-maxsim/gist.md
(ready to publish: gh gist create --public gist.md)


Generated by Claude Code

claude added 3 commits June 10, 2026 07:23
…teraction)

Three variants of a common MaxSimIndex trait:
- BruteForceIndex: exact O(N·Td·Tq·D) scan (ground truth baseline)
- PlaidLiteIndex: k-means centroid pre-filter + exact MaxSim on shortlist
- CompressedIndex: SQ8 i8 quantized tokens, 4× memory reduction
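
To make the pre-filter idea concrete, here is a rough sketch of a centroid shortlist followed by exact MaxSim reranking. It assumes the centroids have already been trained (k-means itself is omitted), probes only the single nearest centroid per query token, and requires at least one centroid. All function names are hypothetical, and the crate's `PlaidLiteIndex` may differ in each of these choices.

```rust
use std::collections::BTreeSet;

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Exact MaxSim: sum over query tokens of the best dot product in the document.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| doc.iter().map(|d| dot(q, d)).fold(f32::NEG_INFINITY, f32::max))
        .sum()
}

/// Index of the centroid with the highest dot product against `v`.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    for (i, c) in centroids.iter().enumerate() {
        if dot(v, c) > dot(v, &centroids[best]) {
            best = i;
        }
    }
    best
}

/// Inverted lists: for each centroid, the ids of docs with a token assigned to it.
fn build_lists(docs: &[Vec<Vec<f32>>], centroids: &[Vec<f32>]) -> Vec<BTreeSet<usize>> {
    let mut lists = vec![BTreeSet::new(); centroids.len()];
    for (id, doc) in docs.iter().enumerate() {
        for tok in doc {
            lists[nearest_centroid(tok, centroids)].insert(id);
        }
    }
    lists
}

/// Shortlist docs sharing a centroid with any query token, then rerank exactly.
fn search(
    query: &[Vec<f32>],
    docs: &[Vec<Vec<f32>>],
    centroids: &[Vec<f32>],
    lists: &[BTreeSet<usize>],
    k: usize,
) -> Vec<(usize, f32)> {
    let mut shortlist = BTreeSet::new();
    for q in query {
        shortlist.extend(lists[nearest_centroid(q, centroids)].iter().copied());
    }
    let mut scored: Vec<(usize, f32)> = shortlist
        .into_iter()
        .map(|id| (id, maxsim(query, &docs[id])))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Because documents outside the shortlist are never scored, a filter of this kind can miss a true top-k document, which is consistent with a recall slightly below 1.000.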

Real benchmark (N=2000, D=64, T=16, Q=50):
- brute-force: 13494 µs mean, 74 QPS, recall=1.000 (GT)
- compressed:  9791 µs mean, 102 QPS, recall=0.792, 2000 KB (4× smaller)
- plaid-lite:  15262 µs mean, 66 QPS, recall=0.998, 8016 KB
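
For anyone reproducing these numbers, the two derived metrics can be sketched as follows. This assumes recall@10 is the overlap between a variant's top-10 ids and the brute-force top-10 (the "GT" baseline), and that QPS is the reciprocal of the mean single-query latency. Both function names are hypothetical, and the benchmark binary may compute these differently.

```rust
use std::collections::HashSet;

/// Fraction of the ground-truth top-k ids that also appear in the candidate top-k.
/// Assumes ids within each list are unique.
fn recall_at_k(ground_truth: &[usize], candidate: &[usize], k: usize) -> f64 {
    let truth: HashSet<usize> = ground_truth.iter().take(k).copied().collect();
    if truth.is_empty() {
        return 0.0;
    }
    let hits = candidate
        .iter()
        .take(k)
        .filter(|id| truth.contains(*id))
        .count();
    hits as f64 / truth.len() as f64
}

/// Sequential throughput implied by a mean per-query latency in microseconds.
fn qps_from_mean_latency_us(mean_us: f64) -> f64 {
    1_000_000.0 / mean_us
}
```

The reported QPS values agree with this reciprocal after rounding: 1,000,000 / 13,494 ≈ 74, 1,000,000 / 9,791 ≈ 102, and 1,000,000 / 15,262 ≈ 66.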

20/20 unit tests pass. Both acceptance criteria pass.
Adds crate to workspace. No external service dependencies.

Documents the decision to add ColBERT-style MaxSim retrieval to RuVector.
Covers alternatives (BM25 hybrid, full ColBERTv2), failure modes, security
considerations, and migration path. References measured benchmark evidence.

Research doc covers:
- 2026 SOTA survey (ColBERT, PLAID, ColBERT-Att, PyLate, LIR workshop)
- 10-20 year thesis on MaxSim as a cognitive primitive
- Real benchmark results captured from cargo run --release
- Memory math, practical failure modes, security implications
- WASM/edge/MCP/ruFlo integration roadmap
- 8 practical + 8 exotic applications

Gist is SEO-optimised for: ruvector, Rust vector database, ColBERT,
late interaction retrieval, MaxSim, multi-vector search, agent memory.