
Quant preprocessor fp16 [MOD-15269] #944

Open
dor-forer wants to merge 6 commits into main from feat/MOD-15269-quant-preprocessor-fp16

Conversation


@dor-forer dor-forer commented Apr 30, 2026

Describe the changes in the pull request

Adds float16 (FP16) input support to QuantPreprocessor, completing the producer side of the asymmetric SQ8 / FP16 hybrid layout consumed by the kernels added in #942.

  • Storage layout (unchanged): [uint8 × dim] [float × N].
  • Query layout for FP16 input: [float16 × dim] [float × M]. FP32 path is byte-for-byte unchanged.
  • All SQ8 metadata (min, delta, sum, sum_squares) is pinned to FP32 regardless of input type.
  • FP16 inputs are widened to FP32 before accumulation to limit precision loss.
  • Metadata I/O uses memcpy to handle the 2-byte-aligned offset that occurs when dim is odd and DataType is FP16.
  • QuantPreprocessor's DataType is now constrained by a C++20 QuantInput concept (float or vecsim_types::float16).

Which issues this PR fixes

  1. MOD-15269

Main objects this PR modified

  1. QuantPreprocessor and the to_fp32 helper in src/VecSim/spaces/computer/preprocessors.h.
  2. tests/unit/test_components.cpp — new QuantPreprocessorFP16MetricTest (L2 / IP / Cosine) and QuantPreprocessorFP16Test.QuantizeReconstructRoundTripL2 (odd-dim, in-place + out-of-place).
  3. tests/unit/unit_test_utils.h — ComputeSQ8Quantization baseline updated to use memcpy for metadata writes.

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Note

Medium Risk
Changes quantization preprocessing to support float16 inputs and alters storage/query blob layouts by pinning all SQ8 metadata to FP32 and writing it via memcpy, which could impact correctness/ABI expectations of downstream distance kernels if assumptions differ.

Overview
Adds vecsim_types::float16 support to QuantPreprocessor via a C++20 QuantInput constraint, FP16->FP32 widening (to_fp32) for min/max and accumulation, and a fixed FP32 metadata type for both storage and query blobs.

Updates blob sizing/layout so queries keep the original element width (FP16 or FP32) while all SQ8 metadata is written/read as FP32, using memcpy helpers to handle potentially unaligned metadata offsets (e.g., odd dim with FP16).

Extends unit coverage with new parameterized FP16 tests (L2/IP/Cosine) validating hybrid layout, metadata equivalence to an FP32 baseline, round-trip reconstruction bounds, and in-place quantization; test quantization utility now also writes metadata via memcpy for alignment safety.
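
The blob-sizing arithmetic implied above can be sketched as below. Helper names and the metadata-float counts are assumptions for illustration, not the library's API; the point is that the query body keeps the element width while metadata is always FP32, which produces an unaligned metadata offset for odd dims with FP16.

```cpp
#include <cstddef>
#include <cstdint>

// Storage is always [uint8 x dim][float x 4] (min, delta, sum, sum_squares);
// the query keeps the input element width, followed by FP32 metadata.
constexpr std::size_t kStorageMetaFloats = 4;

constexpr std::size_t storage_bytes_count(std::size_t dim) {
    return dim * sizeof(std::uint8_t) + kStorageMetaFloats * sizeof(float);
}

template <typename DataType> // float, or a 2-byte float16 type
constexpr std::size_t query_bytes_count(std::size_t dim, std::size_t meta_floats) {
    return dim * sizeof(DataType) + meta_floats * sizeof(float);
}
```

With dim = 17 and a 2-byte DataType, the query metadata region begins at byte offset 34, which is not 4-byte aligned — hence the memcpy-based metadata I/O.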

Reviewed by Cursor Bugbot for commit b987af2.

Refactor QuantPreprocessor so DataType may be either float or
vecsim_types::float16. The vector body is stored/queried in DataType
width while all SQ8 metadata (min, delta, sum, sum_squares) remains
FP32 to match the asymmetric distance kernels added in MOD-15141.

- Replace std::is_floating_point_v<DataType> constraint with an
  is_quant_input trait that opts in float and float16 explicitly.
- Pin metadata to a single MetadataType alias (float) and use it for
  every value that is written into the metadata region (min/max/diff/
  delta, final sum/sum_squares, store_*_metadata signatures, and the
  return type of find_min_max). Per-element SIMD-lane accumulators
  remain plain float as a precision choice for FP16 inputs.
- Compute storage_bytes_count and query_bytes_count using
  sizeof(MetadataType) for the metadata region and sizeof(DataType)
  for the vector body.
- Write metadata via memcpy because the metadata offset is not
  guaranteed to be sizeof(MetadataType)-aligned when DataType is
  float16 (e.g. odd dim).
- Update the class-level documentation to reflect the layout and
  precision contract.
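
A minimal sketch of the memcpy-based metadata I/O described above (helper names are illustrative, not the actual store_*_metadata/load_meta signatures): with a 2-byte DataType and odd dim, the metadata region starts at a 2-byte-aligned offset, so a reinterpret_cast<float*> store would be undefined behavior, while memcpy of the 4 bytes is well-defined at any alignment.

```cpp
#include <cstddef>
#include <cstring>

using MetadataType = float; // all SQ8 metadata is pinned to FP32

// Write one metadata value at a possibly unaligned base + index offset.
inline void store_meta(unsigned char *meta_base, std::size_t index, MetadataType v) {
    std::memcpy(meta_base + index * sizeof(MetadataType), &v, sizeof(v));
}

// Read it back the same way; no aligned-load assumption is made.
inline MetadataType load_meta(const unsigned char *meta_base, std::size_t index) {
    MetadataType v;
    std::memcpy(&v, meta_base + index * sizeof(MetadataType), sizeof(v));
    return v;
}
```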

Tests:
- Add QuantPreprocessorFP16MetricTest (L2/IP/Cosine) covering blob
  size, layout, FP16 body fidelity, and FP32 metadata against an FP32
  baseline computed from the widened input.
- Add QuantPreprocessorFP16Test.QuantizeReconstructRoundTripL2 with
  an odd dimension (17) verifying |min + delta * q - x| <= delta.
- Switch the FP16 fixture's baseline metadata reads to the existing
  load_meta() memcpy helper, and rewrite ComputeSQ8Quantization to
  store metadata via memcpy as well.
The test comment claimed it 'covers the in-place quantization path'
but only called preprocessForStorage. Add an actual
preprocessStorageInPlace call: seed a buffer sized to
max(input_size, storage_size) with the FP16 input and verify the
resulting SQ8 blob byte-for-byte matches the one produced by
preprocessForStorage.
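
The round-trip bound the test verifies can be sketched as follows. This is a standalone illustration of the SQ8 scheme, not the preprocessor's code: quantize each element to uint8 against [min, max], reconstruct as min + delta * q, and the per-element error stays within one quantization step.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Illustrative SQ8 result: codes plus the FP32 metadata that matters here.
struct SQ8 {
    std::vector<std::uint8_t> q;
    float min, delta;
};

inline SQ8 quantize(const std::vector<float> &x) {
    auto [mn, mx] = std::minmax_element(x.begin(), x.end());
    float min = *mn;
    float delta = (*mx - *mn) / 255.0f;
    if (delta == 0.0f)
        delta = 1.0f; // constant vector: code 0 reconstructs min exactly
    SQ8 out{{}, min, delta};
    for (float v : x)
        out.q.push_back(static_cast<std::uint8_t>(std::lround((v - min) / delta)));
    return out;
}

// Worst-case |min + delta * q - x| over the vector; round-to-nearest
// keeps this at <= delta / 2, well inside the test's <= delta bound.
inline float max_reconstruction_error(const std::vector<float> &x) {
    SQ8 s = quantize(x);
    float worst = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i)
        worst = std::max(worst, std::fabs(s.min + s.delta * s.q[i] - x[i]));
    return worst;
}
```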

jit-ci Bot commented Apr 30, 2026

🛡️ Jit Security Scan Results


✅ No security findings were detected in this PR


Security scan by Jit

Use a C++20 named concept on the QuantPreprocessor template head and
on the to_fp32 helper instead of an is_quant_input trait combined
with a static_assert. The constraint is now checked at the template's
instantiation point with a named diagnostic, and the helper trait
machinery is dropped.
@dor-forer dor-forer marked this pull request as ready for review April 30, 2026 12:19
@dor-forer dor-forer requested a review from Copilot April 30, 2026 12:19

Copilot AI left a comment


Pull request overview

Adds FP16 (float16) input support to QuantPreprocessor so SQ8 storage blobs can be produced from FP16 inputs while keeping all SQ8 metadata in FP32 and safely handling potentially unaligned metadata regions.

Changes:

  • Constrain QuantPreprocessor input types to float / vecsim_types::float16, widen FP16 to FP32 for accumulation, and store all metadata as FP32.
  • Use memcpy for storage/query metadata writes to avoid UB with unaligned metadata offsets (notably for odd dims and FP16 query bodies).
  • Add unit tests validating FP16 storage/query layout correctness, metadata equivalence vs FP32-widened baseline, and round-trip reconstruction (including odd-dim + in-place path).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

  • src/VecSim/spaces/computer/preprocessors.h — Implements FP16 input support, FP32 metadata, FP16→FP32 widening for accumulation, and memcpy-based metadata I/O for alignment safety.
  • tests/unit/test_components.cpp — Adds FP16-focused QuantPreprocessor layout/metadata tests and an odd-dimension in-place reconstruction round-trip test.
  • tests/unit/unit_test_utils.h — Updates the SQ8 quantization test baseline to write metadata via memcpy to support unaligned metadata regions.



codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 98.21429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.88%. Comparing base (5bcc53e) to head (b987af2).

Files with missing lines:
  • src/VecSim/spaces/computer/preprocessors.h — patch 98.21%, 1 line missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #944      +/-   ##
==========================================
+ Coverage   96.71%   96.88%   +0.17%     
==========================================
  Files         129      129              
  Lines        8057     7675     -382     
==========================================
- Hits         7792     7436     -356     
+ Misses        265      239      -26     

☔ View full report in Codecov by Sentry.
