
[claude] Clean up vortex-tensor #7524

Open
connortsui20 wants to merge 14 commits into develop from
claude/cleanup-vortex-tensor-KRYsv

Conversation


@connortsui20 connortsui20 commented Apr 18, 2026

TODO

claude added 9 commits April 18, 2026 02:03
The `Display` impl emitted `]` as a closing bracket without a matching
opener and without a separator from the shape list, producing output
like `Tensor(2, 3, 41, 0, 2])` instead of `Tensor(2, 3, 4, [1, 0, 2])`.

Emit the opening `, [` so the permutation is printed as a bracketed list
after the shape, and add regression tests covering all five display
cases (shape-only, 0d scalar, dim names, permutation, and the
combination of dim names + permutation).

Signed-off-by: Claude <noreply@anthropic.com>
- Standardize `ScalarFnId::from("...")` on `ScalarFnId::new("...")` so
  all five tensor scalar functions construct their id the same way
  (l2_denorm and sorf_transform already used `::new`).
- Fix a typo in the `L2Norm` readthrough expect message that said
  "L2Denom must have at 2 children".

No behavior change; the two methods are equivalent aside from wording.

Signed-off-by: Claude <noreply@anthropic.com>
`build_quantized_fsl` was constructing a `BufferMut`, copying the
centroid slice into it via `extend_from_slice`, and then freezing it
into a `Buffer`. `Buffer::copy_from` does the same thing in one call.

Signed-off-by: Claude <noreply@anthropic.com>
…lper

- Introduce `DenormOrientation::classify`, a small enum helper that
  normalises a `(lhs, rhs)` operand pair into `Both`, `One { denorm, plain }`,
  or `Neither`. Symmetric binary tensor operators (`CosineSimilarity`,
  `InnerProduct`) use it instead of hand-rolling "swap lhs and rhs if only
  the rhs is denormalised" at each call site.
- Move the `l2_denorm_array` test helper — previously duplicated byte-for-byte
  between `cosine_similarity` and `inner_product` tests — into
  `utils::test_helpers` so both suites share one implementation and threading
  an `ExecutionCtx` through is explicit.
- Fix "have be" -> "have been" in two comments and replace the stale
  "four independent accumulators" comment in `execute_dict_constant_inner_product`
  with a reference to the `PARTIAL_SUMS` constant (currently 8).
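The classify helper can be sketched roughly as follows; `Operand` is an illustrative stand-in for the real vortex operand type, with `is_denorm` as a hypothetical probe:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
struct Operand {
    is_denorm: bool,
}

// Sketch of the orientation enum described above.
#[derive(Debug, PartialEq)]
enum DenormOrientation {
    Both,
    One { denorm: Operand, plain: Operand },
    Neither,
}

impl DenormOrientation {
    // Normalise a (lhs, rhs) pair so symmetric operators like
    // CosineSimilarity never hand-roll "swap if only rhs is denormalised".
    fn classify(lhs: Operand, rhs: Operand) -> Self {
        match (lhs.is_denorm, rhs.is_denorm) {
            (true, true) => Self::Both,
            (true, false) => Self::One { denorm: lhs, plain: rhs },
            (false, true) => Self::One { denorm: rhs, plain: lhs },
            (false, false) => Self::Neither,
        }
    }
}

fn main() {
    let d = Operand { is_denorm: true };
    let p = Operand { is_denorm: false };
    // Only-rhs-denorm swaps into the same One { denorm, plain } shape.
    assert_eq!(
        DenormOrientation::classify(p, d),
        DenormOrientation::One { denorm: d, plain: p }
    );
    assert_eq!(DenormOrientation::classify(d, d), DenormOrientation::Both);
}
```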

Signed-off-by: Claude <noreply@anthropic.com>
`FlatElements` used a `stride: usize` field that was either `0` (single
stored row, used for `ConstantArray` storage) or `list_size` (full row
per index). `row(i)` then multiplied the index by `stride`, which works
but hides the real invariant — "stride is 0 xor stride == list_size".

Store the invariant directly:

- `is_constant: bool` replaces the tri-value `stride` field.
- `row(i)` selects `0` or `i` via a single branch, so the two cases
  compile to the same code as before but the struct's semantics are
  self-describing.
- `extract_flat_elements` collapses the duplicated
  `FixedSizeListArray::execute` + `PrimitiveArray::execute` path into a
  single tail block, rewriting the source once up front based on whether
  the storage is constant-backed.
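The invariant change can be modeled minimally like this (the real `FlatElements` also carries the element buffer, which is omitted here):

```rust
// Minimal model of the invariant described above: `is_constant` replaces
// the tri-valued `stride` field.
struct FlatElements {
    is_constant: bool,
    list_size: usize,
}

impl FlatElements {
    // Start offset of logical row `i` in the flat element storage.
    // Constant-backed storage holds one stored row, so every index maps
    // to row 0; otherwise each index owns a full `list_size` row.
    fn row(&self, i: usize) -> usize {
        let stored_row = if self.is_constant { 0 } else { i };
        stored_row * self.list_size
    }
}

fn main() {
    let constant = FlatElements { is_constant: true, list_size: 4 };
    let full = FlatElements { is_constant: false, list_size: 4 };
    assert_eq!(constant.row(7), 0); // every index aliases the single stored row
    assert_eq!(full.row(7), 28); // row 7 starts at 7 * list_size
}
```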

Signed-off-by: Claude <noreply@anthropic.com>
- `fmt_sql` now prints the [`SorfOptions`] next to the child expression
  (e.g. `sorf_transform(col, SorfOptions(seed=42, rounds=3, dim=100,
  ptype=f32))`) so the SQL form uniquely identifies the transform instead
  of collapsing distinct configurations to the same string.
- Factor `SorfTransformMetadata <-> SorfOptions` conversion into an
  `impl From<&SorfOptions>` and a `to_options()` method. `serialize` and
  `deserialize` become thin wrappers and the `num_rounds: u32 -> u8`
  range-check plus `validate_sorf_options` call live in exactly one
  place.
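A rough shape of the conversion pair; the field sets are illustrative, and where the commit uses `impl From<&SorfOptions>` with the range check in the serialize wrapper, this sketch folds the check into a `TryFrom` so it is visible in one place:

```rust
// Illustrative field sets; the real SorfOptions/SorfTransformMetadata differ.
#[derive(Debug, PartialEq)]
struct SorfOptions {
    seed: u64,
    num_rounds: u32,
    dim: u32,
}

#[derive(Debug, PartialEq)]
struct SorfTransformMetadata {
    seed: u64,
    num_rounds: u8,
    dim: u32,
}

impl TryFrom<&SorfOptions> for SorfTransformMetadata {
    type Error = String;

    // The one place the num_rounds: u32 -> u8 range check lives.
    fn try_from(opts: &SorfOptions) -> Result<Self, Self::Error> {
        let num_rounds = u8::try_from(opts.num_rounds)
            .map_err(|_| format!("num_rounds {} does not fit in u8", opts.num_rounds))?;
        Ok(Self { seed: opts.seed, num_rounds, dim: opts.dim })
    }
}

impl SorfTransformMetadata {
    // The reverse direction is infallible: u8 widens losslessly to u32.
    fn to_options(&self) -> SorfOptions {
        SorfOptions { seed: self.seed, num_rounds: self.num_rounds as u32, dim: self.dim }
    }
}

fn main() {
    let opts = SorfOptions { seed: 42, num_rounds: 3, dim: 100 };
    let meta = SorfTransformMetadata::try_from(&opts).unwrap();
    assert_eq!(meta.to_options(), opts);
    // Out-of-range rounds are rejected instead of silently truncated.
    let bad = SorfOptions { seed: 0, num_rounds: 300, dim: 1 };
    assert!(SorfTransformMetadata::try_from(&bad).is_err());
}
```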

Signed-off-by: Claude <noreply@anthropic.com>
The full TurboQuant compression pipeline — normalize a tensor via
`normalize_as_l2_denorm`, quantize the normalized child via
`turboquant_encode_unchecked`, and reattach the stored norms as an
outer `L2Denorm` wrapper — was spelled out three times: in
`TurboQuantScheme::compress`, in `vector_search::compress_turboquant`,
and in the `normalize_and_encode` test helper.

Add `turboquant_compress(input, config, ctx)` alongside the existing
encode helpers and use it in all three call sites. Each caller
collapses to a single line that threads its own config through (the
scheme and `compress_turboquant` keep the `TurboQuantConfig::default()`
choice, and the test helper now takes an explicit config).

Signed-off-by: Claude <noreply@anthropic.com>
- Extract `validate_binary_tensor_float_inputs(op_name, lhs, rhs)` for the
  common "two operands must share the same float tensor dtype" precondition.
  `CosineSimilarity::return_dtype` and `InnerProduct::return_dtype` now both
  call it instead of hand-rolling the `eq_ignore_nullability` check, the
  extension-type downcast, the `AnyTensor` metadata lookup, and the float
  element-type check. The old `InnerProduct` path also had a redundant
  `is::<AnyTensor>()` plus `metadata_opt::<AnyTensor>()` pair — both
  collapse to the one `metadata_opt` call inside `validate_tensor_float_input`.
- Add `VectorMatcherMetadata::list_size() -> usize` so it mirrors
  `FixedShapeTensorMatcherMetadata::list_size`. `TensorMatch::list_size`
  no longer needs the inline `as usize` cast for the `Vector` arm.
- Drop the unused `_ctx` parameter from `CosineSimilarity::execute_both_denorm`;
  it never used the context since the both-denorm path is pure array
  composition.
- Fix the assert message in `vector::matcher` that said "element dtype must
  be primitive" — the check is `element_dtype.is_float()`, so the message
  now says "must be float".

Signed-off-by: Claude <noreply@anthropic.com>
The two-line `ExtDType::<Vector>::try_new(EmptyMetadata, storage.dtype().clone()) +
ExtensionArray::new(ext_dtype, storage).into_array()` incantation for
wrapping a storage array in a [`Vector`] extension appeared verbatim in:

- `compress::wrap_padded_as_vector` (private, used twice)
- `sorf_transform::vtable` (empty-array branch + `inverse_rotate_typed`)
- `vector_search::build_constant_query_vector`

Promote it to `Vector::wrap_storage(storage)`, an associated function on
the [`Vector`] vtable struct that is the natural home for the operation.
Each call site drops to a single line and the `ExtDType`/`EmptyMetadata`
imports go away where they were only pulled in for this pattern. The old
private helper is deleted.

Signed-off-by: Claude <noreply@anthropic.com>
@connortsui20 connortsui20 changed the title Fix FixedShapeTensorMetadata Display for permuted tensors [claude] Clean up vortex-tensor Apr 18, 2026
@connortsui20 connortsui20 added the changelog/chore A trivial change label Apr 18, 2026

codspeed-hq bot commented Apr 18, 2026

Merging this PR will not alter performance

✅ 1163 untouched benchmarks
⏩ 1457 skipped benchmarks¹


Comparing claude/cleanup-vortex-tensor-KRYsv (e3ee206) with develop (4135209)

Open in CodSpeed

Footnotes

  1. 1457 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

claude added 5 commits April 18, 2026 13:22
The `normalize_and_encode` helper was a one-line forwarder to
`turboquant_compress` after the pipeline was extracted. Inline the call
at each test site (passing `ext.into_array()` since `ext` was never used
afterwards, except for one test with `as_ref().clone()`) and delete the
wrapper.

Signed-off-by: Claude <noreply@anthropic.com>
A pile of test-only and convenience wrappers were thin layers on top of
the shared building blocks introduced in the previous commits. Remove
them and route every call site through the shared helpers.

- Drop `vector_search::compress_turboquant`: a one-liner around
  `turboquant_compress(data, &TurboQuantConfig::default(), ctx)`. Update
  the module-level docstring example and the one test caller.

- Make `utils::test_helpers::{tensor_array, vector_array,
  constant_tensor_array, constant_vector_array, l2_denorm_array}` generic
  over `T: NativePType` (some also `+ Into<PValue>`) and refactor their
  bodies through two new private helpers (`flat_fsl`, `fsl_scalar`) so
  the FSL-construction and constant-FSL-scalar logic each lives in one
  place.

- Add `utils::test_helpers::literal_vector_array<T>` for the second
  Vector-literal shape (`ConstantArray<Extension(Vector) Scalar>`) that
  was duplicated in `l2_norm` (`constant_vector_extension_array`) and
  `inner_product` (`literal_vector_f32`). Both are deleted.

- In `inner_product::tests::constant_query_optimizations`: drop the
  module-private `vector_f32` / `constant_vector_f32` helpers in favour
  of the now-generic shared ones, simplify `dict_vector_f32` via
  `Vector::wrap_storage`, and switch the local `LazyLock<VortexSession>`
  to the crate-wide `tests::SESSION`.

- In `l2_denorm` tests: delete `integer_tensor_array` (just
  `tensor_array(shape, &[i32; ...])` now) and `f16_vector_array` (the
  per-element `f16::from_f32` map fits inline at the one call site).

- In `vector_search` tests: delete the duplicated `vector_array` (use
  the shared one, now f32-friendly) and the `test_session()` helper
  (use `crate::tests::SESSION`).

- In `turboquant::tests`: change `make_vector_ext` to return `ArrayRef`
  via `Vector::wrap_storage` instead of materialising an `ExtensionArray`,
  drop the trivially redundant `unwrap_sorf` and `make_fsl_small`
  helpers, and inline `unwrap_sorf` at its single call site inside
  `unwrap_codes_centroids_norms`. Replace `ext.into_array()` and
  `ext.as_ref().clone()` with the now-equivalent `ext` and
  `ext.clone()` everywhere.

Net diff: ~170 lines removed.

Signed-off-by: Claude <noreply@anthropic.com>
Continue the wrapper-removal pass:

- Add `Vector::constant_array<T>(elements, len)` next to `wrap_storage`
  on the `Vector` impl. It captures the "broadcast a single vector value
  across `len` rows" pattern that previously had three duplicate
  implementations (the public `vector_search::build_constant_query_vector`,
  the test helper `utils::test_helpers::constant_vector_array`, and an
  inline build inside `inner_product` tests). Drop both the public
  `build_constant_query_vector` and the `constant_vector_array` test
  helper; every call site now uses `Vector::constant_array` directly.
- Collapse the trivially nested `validate_l2_normalized_rows_impl` and
  its `validate_l2_denorm_children` adapter into a single descriptive
  function `validate_l2_normalized_rows_against_norms(normalized, norms,
  ctx)`. `validate_l2_normalized_rows` now calls it with `None` and
  `try_new_array` calls it with `Some(&norms)` directly.
- Simplify the `try_execute_sorf_constant` doc-block: replace the two
  TODO comments that said the same thing in different words plus the
  stale `vtable.rs line ~218` reference with a single "F32-only" doc
  section above the function.
- Rewrite the `encodings::turboquant` module-level doctest to use
  `turboquant_compress` + `Vector::wrap_storage` instead of manually
  constructing the FSL/extension dtype and stitching the
  normalize/encode pipeline.
- Update two stale comments that still said "stride-0" after
  `FlatElements` switched to the explicit `is_constant` flag.
- Remove the redundant `ext.clone()` calls in
  `normalize_as_l2_denorm(ext, ...)` test sites that clippy flagged.

Net diff: ~30 more lines removed.

Signed-off-by: Claude <noreply@anthropic.com>
The `L2Norm` execute path that extracts the stored norms from an
`L2Denorm` wrapper was hand-rolling the `as_opt::<ExactScalarFn<L2Denorm>>()`
+ `nth_child(1).vortex_expect(...)` dance instead of using the shared
`extract_l2_denorm_children` helper that already encodes the same
expectations. Switch to the helper.

Signed-off-by: Claude <noreply@anthropic.com>
`InnerProduct::try_execute_sorf_constant` was hand-rolling the
`ExtDType::<Vector>::try_new(EmptyMetadata, FSL(...)) +
ConstantArray::new(extension_ref(...), len)` dance to wrap the
forward-rotated query as a `Vector<padded_dim, f32>` constant. That same
shape is exactly what `Vector::constant_array` produces, so collapse the
20-line builder down to a single call and drop the now-unused
`Arc`/`ExtDType`/`EmptyMetadata`/`Scalar` imports.

Also:
- Refresh the `encodings::turboquant::scheme` module docstring so it
  describes the post-extraction reality (the scheme is a thin adapter
  over `turboquant_compress`) instead of the pre-extraction normalize +
  encode pipeline.
- Add a context message to the previously bare `vortex_ensure!` that
  guards the `L2Denorm` constant-norms ptype check.
- Replace the misleading "with unchecked indexing" comment on the
  `try_execute_dict_constant` hot loop with one that matches what the
  helper actually does (chunked indices that LLVM can prove in-bounds).

Signed-off-by: Claude <noreply@anthropic.com>