feat(index): support raw-query ivf rq search by BubbleCal · Pull Request #7078 · lance-format/lance

BubbleCal · 2026-06-03T16:12:52Z

Feature

Adds explicit IVF_RQ query_estimator metadata so released indexes without the field continue to read as residual_query, while newly written indexes use raw_query.
Implements raw-query IVF_RQ search for new num_bits == 1 indexes and multi-bit split-code indexes, including ex-code factors and runtime-only rotated centroid caches derived from the original IvfModel centroids.
Prepares the rotated raw query and split-code lookup tables once per query worker and reuses them across probed partitions; each partition updates only the cluster correction.
Relaxes the public IVF_RQ num_bits > 1 gate for supported metrics, including cosine via Lance's normalized-L2 handling.

Compatibility

Old IVF_RQ indexes that lack query_estimator metadata still default to the legacy residual-query estimator.
Original IVF centroids remain the source of truth for partition assignment, incremental indexing, and persisted metadata.
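
The default-on-missing rule can be illustrated with a minimal Python sketch. The real implementation is Rust with serde; `RqMetadata` and `parse_rq_metadata` are hypothetical names used only for illustration, and only the `query_estimator` key and its two values are taken from the description above.

```python
import json
from dataclasses import dataclass

LEGACY_ESTIMATOR = "residual_query"  # assumed default for indexes without the field
NEW_ESTIMATOR = "raw_query"          # value written by newly built indexes


@dataclass
class RqMetadata:
    """Hypothetical stand-in for the IVF_RQ index metadata."""
    num_bits: int
    query_estimator: str


def parse_rq_metadata(raw: str) -> RqMetadata:
    doc = json.loads(raw)
    return RqMetadata(
        num_bits=doc["num_bits"],
        # A missing key means the index predates the field, so fall back
        # to the legacy estimator instead of failing.
        query_estimator=doc.get("query_estimator", LEGACY_ESTIMATOR),
    )


old_index = parse_rq_metadata('{"num_bits": 1}')
new_index = parse_rq_metadata('{"num_bits": 3, "query_estimator": "raw_query"}')
print(old_index.query_estimator, new_index.query_estimator)
```

Note that this only covers a new reader opening an old index. A reader that has no such field at all would ignore the key, which is the downgrade case discussed later in the review.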

Performance Improvement

The benchmark below was run with search-benchmark on GCP VM yang-agent-00bd-ivfrq-rerun-20260605, dataset gist, k=10, max_threads=1, target_partition_size=4096, no refine. Latencies are converted from CSV seconds to milliseconds.

Provenance:

search-benchmark commit: 61ef8f7b97589032a83eeae1e52664be9f035551
main Lance baseline commit: 437849118f380d92c1ea849f99996e9072be58df
PR branch commit benchmarked: ce548a49766670b80275daae6f1bf97c70e885e4

Additional DBpedia comparison on the same VM, current branch only, dataset dbpedia, k=10, max_threads=1, target_partition_size=4096. For IVF_PQ, sub_vector_dim=8; one extra row includes refine_factor=2 at nprobes=24.

Index	Config	nprobes	refine	recall@10	avg ms	p99 ms	QPS	indexing s
IVF_RQ	num_bits=1	8	-	0.7917	1.59	1.98	615.8	16.45
IVF_RQ	num_bits=1	16	-	0.8102	2.35	2.98	420.3	16.45
IVF_RQ	num_bits=1	24	-	0.8162	3.19	3.93	311.4	16.45
IVF_RQ	num_bits=3	8	-	0.9014	2.14	2.63	463.8	27.01
IVF_RQ	num_bits=3	16	-	0.9263	2.93	3.58	338.9	27.01
IVF_RQ	num_bits=3	24	-	0.9352	3.82	4.74	261.0	27.01
IVF_RQ	num_bits=5	8	-	0.9207	2.32	2.80	426.2	33.93
IVF_RQ	num_bits=5	16	-	0.9520	3.32	4.05	300.1	33.93
IVF_RQ	num_bits=5	24	-	0.9624	4.56	5.57	218.1	33.93
IVF_RQ	num_bits=7	8	-	0.9278	2.84	3.39	350.3	46.76
IVF_RQ	num_bits=7	16	-	0.9572	3.77	4.45	264.1	46.76
IVF_RQ	num_bits=7	24	-	0.9683	4.96	5.94	200.7	46.76
IVF_PQ	sub_vector_dim=8	8	-	0.7354	4.44	5.50	223.7	153.84
IVF_PQ	sub_vector_dim=8	16	-	0.7447	8.05	9.68	123.6	153.84
IVF_PQ	sub_vector_dim=8	24	-	0.7483	12.80	14.72	78.0	153.84
IVF_PQ	sub_vector_dim=8	24	2	0.9133	12.84	14.96	77.7	153.84

Tests

cargo fmt --all
cargo test -p lance-index raw_query
cargo test -p lance-index try_from_batch_
cargo test -p lance-index rabit_quantizer
cargo test -p lance test_rabitq_distance_types
cargo test -p lance test_build_ivf_rq
cargo clippy -p lance-index -p lance --tests --benches -- -D warnings
uv run make build
targeted Python IVF_RQ multi-bit and cosine search test
targeted Python distributed IVF_RQ shared-rotation test

claude

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

github-actions · 2026-06-03T16:13:08Z

Important

This PR touches the Lance format specification.

Substantive changes to the format specification — the .proto definitions
and the spec docs under docs/src/format/ — require a PMC vote before merge.
Minor edits such as typo fixes, wording, or formatting are excluded; use your
judgment.

If this is a meaningful format change:

Start a vote following the Lance community voting process.
Format specification modifications need 3 binding +1 votes (excluding the
proposer), held on GitHub Discussions, with a minimum voting period of 1 week.
Once the vote passes, link the completed vote in this PR. It should not be
merged until the vote is linked.

BubbleCal · 2026-06-08T08:50:00Z

@claude review

BubbleCal · 2026-06-08T08:50:05Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42b122a38b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

claude

This is a substantial new feature touching the Lance format spec, the IVF_RQ storage layout, and the core distance-calculator path — I'm leaving it for a human reviewer.

Extended reasoning...

Overview

This PR introduces a new raw_query estimator for IVF_RQ search alongside the existing residual_query estimator, including multi-bit (num_bits > 1) split-code search. Changes span 13 files: a format-spec doc update, new metadata fields (query_estimator), substantial additions to bq/storage.rs (~1100 lines including new ex-code distance tables, FastScan paths, and lower-bound gating), a refactor of the flat index search loops to use new accumulate_topk_with_scratch trait methods, a fix to PERM0_INVERSE in the SIMD dist table (the prior value was incorrect — note the new test_perm0_inverse_matches_perm0 test), and IVF v2 plumbing for per-query rotated-query contexts and per-index rotated-centroid caches.

Security risks

No direct security risks — this is internal indexing code with no auth, crypto, or untrusted-input boundaries.

Level of scrutiny

High. This PR touches the format specification (per the format-change-vote reminder, this needs a PMC vote with 3 binding +1s and a 1-week voting period), modifies the storage layout for newly-built IVF_RQ indexes, and rewrites the search hot path with new lower-bound gating that affects recall/accuracy. The compatibility shim (default_query_estimator_compat → ResidualQuery) is the right approach for old indexes, but the correctness of the new raw-query estimator factors, error-factor gating, and FastScan ex-code path all warrant careful human review.

Other factors

The PR has good test coverage: new unit tests for the estimator factors, FastScan ex-code paths, lower-bound gating, cache slice borrowing, and a parameterized integration test for multi-bit L2/Cosine search.
A previously-incorrect PERM0_INVERSE constant is silently fixed here — worth a human verifying no callers depended on the buggy value.
The Cosine→L2 distance-type rewrite in try_from_batch for raw-query indexes is subtle and could surprise readers; worth confirming the rewrite is correct and intentional.
The format-spec change requires a PMC vote that has not yet been linked on this PR.

codecov · 2026-06-08T10:58:59Z

Codecov Report

❌ Patch coverage is 90.53785% with 190 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
rust/lance-index/src/vector/bq/storage.rs	90.55%	122 Missing and 7 partials ⚠️
rust/lance/src/index/vector/ivf/v2.rs	89.42%	13 Missing and 11 partials ⚠️
rust/lance-index/src/vector/bq/builder.rs	88.88%	15 Missing and 3 partials ⚠️
rust/lance-index/src/vector/bq/transform.rs	92.18%	7 Missing and 3 partials ⚠️
...lance-index/src/vector/distributed/index_merger.rs	78.57%	0 Missing and 3 partials ⚠️
rust/lance-index/src/vector/storage.rs	95.83%	3 Missing ⚠️
rust/lance/src/index/vector/ivf/partition_serde.rs	72.72%	3 Missing ⚠️

📢 Thoughts on this report? Let us know!

claude · 2026-06-08T12:25:09Z

+    def skip_write_after_current_write(self, version: str) -> bool:
+        # Newly written IVF_RQ indexes carry raw-query estimator metadata and
+        # split-code schema that older runtimes can query but cannot optimize.
+        # The upgrade_downgrade variant still covers old 1-bit residual-query
+        # indexes being read and rewritten by the current runtime.
+        return True


🔴 The new skip_write_after_current_write override (test_vector_indices.py:277-282) leaves skip_read_after_current_write at the default False, so the downgrade flow still runs the old runtime's check_read against the newly-written num_bits=1 IVF_RQ index — but builder.rs::new_with_rotation now unconditionally writes query_estimator: RawQuery (lines 240, 251), a field old serde silently drops, so the old runtime applies the legacy ResidualQuery distance formula to __add_factors/__scale_factors that were computed for the RawQuery formula. check_read only asserts result.num_rows == 4 and never validates rankings, so the test passes vacuously while real downgraded users see silent recall regression and the PR-author comment claiming "older runtimes can query the new indexes (just cannot optimize)" is wrong for the num_bits=1 path. Fix: either also override skip_read_after_current_write to True (and remove the misleading comment), or keep ResidualQuery as the default query_estimator for num_bits=1 so the on-disk meaning of __add_factors/__scale_factors stays compatible with released runtimes (the num_bits>1 path was already rejected by old runtimes via the removed validate_supported_rq_num_bits gate).

Extended reasoning...

What the bug is

The PR adds skip_write_after_current_write = True to IvfRqVectorIndex (test_vector_indices.py:277-282) with a comment claiming "older runtimes can query the new indexes (just cannot optimize)." That premise is false for num_bits=1, and the downgrade direction is therefore exercising a silently-broken read path that the compat test cannot catch.

In compat_decorator.py:340-343, the downgrade flow is now gated as:

if not obj.skip_read_after_current_write(version):
    venv.execute_method(obj, "check_read", obj.compat_env(version, "check_read"))
if not obj.skip_write_after_current_write(version):
    venv.execute_method(obj, "check_write", obj.compat_env(version, "check_write"))

IvfRqVectorIndex overrides only skip_write_after_current_write (returns True). skip_read_after_current_write inherits the default False from UpgradeDowngradeTest (compat_decorator.py:152-154), so the old venv still runs check_read against the index written by the current runtime.
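
A simplified sketch of that gating makes the consequence concrete. The classes below are hypothetical stand-ins that mirror only the two hook names and their defaults as described in this comment; they are not the actual `compat_decorator.py` code, and the version string is made up.

```python
class UpgradeDowngradeTest:
    """Simplified stand-in: both hooks default to False."""

    def skip_read_after_current_write(self, version: str) -> bool:
        return False

    def skip_write_after_current_write(self, version: str) -> bool:
        return False


class IvfRqVectorIndex(UpgradeDowngradeTest):
    # Only the write hook is overridden, as in the PR.
    def skip_write_after_current_write(self, version: str) -> bool:
        return True


def downgrade_steps(obj: UpgradeDowngradeTest, version: str) -> list:
    """Return the checks the old runtime would still execute."""
    steps = []
    if not obj.skip_read_after_current_write(version):
        steps.append("check_read")
    if not obj.skip_write_after_current_write(version):
        steps.append("check_write")
    return steps


steps = downgrade_steps(IvfRqVectorIndex(), "old-release")
print(steps)
```

Under these assumptions only `check_write` is skipped, so the old runtime's `check_read` still runs against the index written by the current runtime.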

The on-disk meaning of factor columns changed

builder.rs::RabitQuantizer::new_with_rotation unconditionally sets query_estimator: RabitQueryEstimator::RawQuery in both the Matrix branch (line 240) and the Fast branch (line 251) — there is no num_bits=1 carve-out. So every newly-built IVF_RQ index, including num_bits=1, ships with the raw-query estimator and the new RawQuery factor formula:

// transform.rs (raw-query, num_bits=1, L2)
scale = -2 * |res|^2 / binary_res_dot  // binary_res_dot = 0.5 * sum(|rotated|)
add = |res|^2 + 2 * binary_correction  // binary_correction = |res|^2 * binary_cent_dot / binary_res_dot

The released runtime's transform writes:

// transform.rs (residual-query, L2)
scale = -2 * |res|^2 / ip_rq_res  // ip_rq_res = sum(|rotated|) / sqrt(d)
add = |res|^2

binary_res_dot and ip_rq_res are related by binary_res_dot = (sqrt(d) / 2) * ip_rq_res, so the new scale magnitude is (2 / sqrt(d)) times the legacy magnitude. add gains an extra 2 * binary_correction term that depends on per-row residual signs vs the rotated centroid — a row-dependent perturbation that breaks rank ordering, not just a constant offset.
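
The scale relationship can be checked numerically with a short sketch. It assumes the two formulas exactly as quoted above; the function name `factor_scales`, the random vector, and the choice of `|res|^2` as the squared norm of the rotated vector are illustrative assumptions, not values from the PR.

```python
import math
import random


def factor_scales(rotated, res_norm_sq):
    """Compute the raw-query and legacy scale factors for one row."""
    d = len(rotated)
    abs_sum = sum(abs(x) for x in rotated)
    binary_res_dot = 0.5 * abs_sum      # raw-query denominator
    ip_rq_res = abs_sum / math.sqrt(d)  # residual-query denominator
    raw_scale = -2.0 * res_norm_sq / binary_res_dot
    legacy_scale = -2.0 * res_norm_sq / ip_rq_res
    return raw_scale, legacy_scale


random.seed(0)
d = 32  # dimension of the compat dataset mentioned in the review
rotated = [random.gauss(0.0, 1.0) for _ in range(d)]
res_norm_sq = sum(x * x for x in rotated)  # assumption: rotation preserves the norm

raw_scale, legacy_scale = factor_scales(rotated, res_norm_sq)
ratio = raw_scale / legacy_scale
print(ratio, 2.0 / math.sqrt(d))
```

The ratio does not depend on the particular vector, because `sum(|rotated|)` and `|res|^2` cancel; only `d` remains.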

The old runtime parses the new index as if it were a residual-query index

The pre-PR RabitQuantizationMetadata struct has no query_estimator field, and there is no #[serde(deny_unknown_fields)] (verified in storage.rs:220 and earlier release commits). Serde silently ignores the new "query_estimator":"raw_query" key. The old runtime then plugs the on-disk factor values — which now mean RawQuery — into the legacy ResidualQuery formula in storage.rs::distance():

let dist_vq_qr = (2.0 * dist - sum_q) / sqrt_d;
dist_vq_qr * scale + add + query_factor

while the new code applies (dist - 0.5 * sum_q) * scale + add + query_factor to the same column values. The old code also residualizes the query before building dist_table (via QueryResidual::Centroid), while the raw-query factors are computed assuming a raw (un-residualized) query. The new extra column __error_factors and the new metadata field are both silently ignored by the old reader (column_by_name tolerates extras and old metadata struct has no field for them), so the index loads cleanly and just returns wrong distances.
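
To see how the two readers diverge on identical column values, here is a small sketch of both formulas as quoted above. The function names and the input numbers are made up for illustration, and the query residualization step is deliberately left out.

```python
import math


def legacy_distance(dist, sum_q, sqrt_d, scale, add, query_factor):
    """Residual-query formula applied by released runtimes."""
    dist_vq_qr = (2.0 * dist - sum_q) / sqrt_d
    return dist_vq_qr * scale + add + query_factor


def raw_query_distance(dist, sum_q, scale, add, query_factor):
    """Raw-query formula applied by the new code."""
    return (dist - 0.5 * sum_q) * scale + add + query_factor


sqrt_d = math.sqrt(32)
dist, sum_q = 9.0, 4.0                       # illustrative values only
scale, add, query_factor = -1.5, 2.0, 0.25   # same on-disk factors for both readers

legacy = legacy_distance(dist, sum_q, sqrt_d, scale, add, query_factor)
raw = raw_query_distance(dist, sum_q, scale, add, query_factor)
print(legacy, raw)
```

For the same inputs, the scaled term in the legacy formula is `2 / sqrt_d` times the scaled term in the raw-query formula, so the two readers disagree whenever that term is non-zero.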

Why check_read does not catch it

IvfRqVectorIndex.check_read (test_vector_indices.py:311-340) asserts:

assert result.num_rows == 4
# plus num_indexed_rows > 0 and (under current runtime) num_bits == 1

No ground-truth k-NN comparison, no recall threshold, no row-id verification. The IVF search still returns 4 rows when distances are completely wrong — they are just the wrong 4 rows. CLAUDE.md:111 mandates a >=0.5 recall threshold for vector index tests precisely to catch this class of silent regression, and the compat test was the place where it should have triggered.

Step-by-step proof

Build a num_bits=1 IVF_RQ index on the current branch. The metadata JSON includes "query_estimator":"raw_query" and the aux file gains an __error_factors column.

transform.rs::transform takes the new branch (query_estimator != ResidualQuery) and writes __add_factors/__scale_factors via compute_raw_query_factors — values that the new storage.rs::distance() RawQuery branch consumes as (binary_dot - 0.5 * sum_q) * scale + add + query_factor.

The downgrade venv (pre-PR) opens the same aux file. Its RabitQuantizationMetadata deserialize ignores query_estimator (no deny_unknown_fields) and ignores __error_factors (only named columns are pulled).

Its distance() applies the legacy formula (2 * dist - sum_q) / sqrt_d * scale + add + query_factor to columns whose magnitudes are off by 2/sqrt(d) in scale and disturbed by the row-dependent 2 * binary_correction in add. For dim=32 (the compat dataset), 2/sqrt(32) ≈ 0.354, so the scale magnitudes are roughly a third of what the old formula expects.

check_read issues a 4-NN search and asserts result.num_rows == 4. The IVF/FLAT pipeline returns four rows from the probed partition regardless of distance values, so the assertion holds and the test passes — but the rows returned are not the four nearest under either metric.

How to fix

Either option from the original report works and is small:

(a) Override skip_read_after_current_write to True in IvfRqVectorIndex, and replace the misleading comment with an explicit note that downgrade-direction reads are not supported once a num_bits=1 index is rewritten by current. This honestly marks the compat regression rather than papering over it.

(b) Keep ResidualQuery as the default query_estimator when num_bits == 1 in builder.rs::new_with_rotation (only switch to RawQuery for num_bits > 1, which old runtimes already rejected via validate_supported_rq_num_bits before this PR removed that gate). The binary-only on-disk layout then stays bit-for-bit compatible with released runtimes, and the PR's stated compat claim becomes true.

Optionally, also strengthen check_read to assert recall against an exhaustive top-k so that this class of silent regression cannot recur.
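
A recall assertion of that kind could look roughly like the following sketch. It is pure Python with a toy dataset; the helper names are hypothetical and it does not call the Lance API. The 0.5 threshold is the one the review attributes to CLAUDE.md.

```python
def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exhaustive_top_k(vectors, query, k):
    """Brute-force ground truth: indices of the k nearest vectors."""
    order = sorted(range(len(vectors)), key=lambda i: squared_l2(vectors[i], query))
    return order[:k]


def recall_at_k(result_ids, truth_ids):
    """Fraction of ground-truth neighbors present in the result."""
    truth = set(truth_ids)
    return len(truth.intersection(result_ids)) / len(truth)


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
query = [0.1, 0.1]
truth = exhaustive_top_k(vectors, query, k=3)

good = recall_at_k([0, 2, 1], truth)  # right rows, any order
bad = recall_at_k([3, 4, 5], truth)   # three rows returned, but the wrong ones

RECALL_THRESHOLD = 0.5
print(good >= RECALL_THRESHOLD, bad >= RECALL_THRESHOLD)
```

A row-count assertion would accept both results, since each returns three rows; the recall check rejects the second one.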

claude Bot reviewed Jun 3, 2026

github-actions Bot added the A-python, A-index, and A-format labels Jun 3, 2026

github-actions Bot added the enhancement New feature or request label Jun 3, 2026

Base automatically changed from yang/ivfrq-num-bits-pr2-bench to main June 5, 2026 06:09

BubbleCal added 8 commits June 8, 2026 13:23

feat(index): support raw-query ivf rq search

707da7e

feat(index): add ivf rq raw-query factors

a77fb44

perf(index): add ivf rq multi-bit raw-query fastscan

fbd2e71

perf(index): gate ivf rq ex-code boosting

cdc3375

perf(index): apply ivf rq gating to prefilter

761941e

fix(index): preserve ivf rq error factors

325d7ba

chore(index): clean ivf rq test import

a167031

perf(index): optimize ivf rq raw-query search

763ecc0

BubbleCal force-pushed the yang/ivfrq-pr3-split-code-query branch from ce548a4 to 763ecc0 June 8, 2026 05:37

test(index): update ivf rq cosine expectations

42b122a

chatgpt-codex-connector Bot reviewed Jun 8, 2026

Comment thread rust/lance-index/src/vector/bq/builder.rs

claude Bot reviewed Jun 8, 2026

BubbleCal added 3 commits June 8, 2026 17:15

fix(index): reuse supplied ivf rq rotation

b7eb776

fix(linalg): handle odd dist table code lengths

266b784

test(lance): stabilize child input stream timing

221301b

claude Bot reviewed Jun 8, 2026

Comment thread rust/lance-linalg/src/simd/dist_table.rs

Comment thread python/python/tests/test_vector_index.py Outdated

Comment thread docs/src/format/index/vector/index.md

claude Bot reviewed Jun 8, 2026

Comment thread rust/lance/src/index/vector/ivf/v2.rs

fix(index): address ivf rq review feedback

4910e17

claude Bot reviewed Jun 8, 2026

BubbleCal wants to merge 13 commits into main from yang/ivfrq-pr3-split-code-query
