Refactor QuantPreprocessor so DataType may be either float or vecsim_types::float16. The vector body is stored/queried in DataType width while all SQ8 metadata (min, delta, sum, sum_squares) remains FP32 to match the asymmetric distance kernels added in MOD-15141.

- Replace the std::is_floating_point_v<DataType> constraint with an is_quant_input trait that opts in float and float16 explicitly.
- Pin metadata to a single MetadataType alias (float) and use it for every value that is written into the metadata region (min/max/diff/delta, final sum/sum_squares, store_*_metadata signatures, and the return type of find_min_max). Per-element SIMD-lane accumulators remain plain float as a precision choice for FP16 inputs.
- Compute storage_bytes_count and query_bytes_count using sizeof(MetadataType) for the metadata region and sizeof(DataType) for the vector body.
- Write metadata via memcpy because the metadata offset is not guaranteed to be sizeof(MetadataType)-aligned when DataType is float16 (e.g. odd dim).
- Update the class-level documentation to reflect the layout and precision contract.

Tests:
- Add QuantPreprocessorFP16MetricTest (L2/IP/Cosine) covering blob size, layout, FP16 body fidelity, and FP32 metadata against an FP32 baseline computed from the widened input.
- Add QuantPreprocessorFP16Test.QuantizeReconstructRoundTripL2 with an odd dimension (17) verifying |min + delta * q - x| <= delta.
- Switch the FP16 fixture's baseline metadata reads to the existing load_meta() memcpy helper, and rewrite ComputeSQ8Quantization to store metadata via memcpy as well.
The test comment claimed it 'covers the in-place quantization path' but only called preprocessForStorage. Add an actual preprocessStorageInPlace call: seed a buffer sized to max(input_size, storage_size) with the FP16 input and verify the resulting SQ8 blob byte-for-byte matches the one produced by preprocessForStorage.
🛡️ Jit Security Scan Results: ✅ No security findings were detected in this PR.
Use a C++20 named concept on the QuantPreprocessor template head and on the to_fp32 helper instead of an is_quant_input trait combined with a static_assert. The constraint is now checked at the template's instantiation point with a named diagnostic, and the helper trait machinery is dropped.
Pull request overview
Adds FP16 (float16) input support to QuantPreprocessor so SQ8 storage blobs can be produced from FP16 inputs while keeping all SQ8 metadata in FP32 and safely handling potentially unaligned metadata regions.
Changes:
- Constrain `QuantPreprocessor` input types to `float`/`vecsim_types::float16`, widen FP16 to FP32 for accumulation, and store all metadata as FP32.
- Use `memcpy` for storage/query metadata writes to avoid UB with unaligned metadata offsets (notably for odd dims and FP16 query bodies).
- Add unit tests validating FP16 storage/query layout correctness, metadata equivalence vs an FP32-widened baseline, and round-trip reconstruction (including the odd-dim, in-place path).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `src/VecSim/spaces/computer/preprocessors.h` | Implements FP16 input support, FP32 metadata, FP16→FP32 widening for accumulation, and memcpy-based metadata I/O for alignment safety. |
| `tests/unit/test_components.cpp` | Adds FP16-focused QuantPreprocessor layout/metadata tests and an odd-dimension in-place reconstruction round-trip test. |
| `tests/unit/unit_test_utils.h` | Updates the SQ8 quantization test baseline to write metadata via memcpy to support unaligned metadata regions. |
Codecov Report: ❌ Patch coverage is
Additional details and impacted files:

```
@@ Coverage Diff @@
##             main     #944      +/-   ##
==========================================
+ Coverage   96.71%   96.88%   +0.17%
==========================================
  Files         129      129
  Lines        8057     7675     -382
==========================================
- Hits         7792     7436     -356
+ Misses        265      239      -26
```

☔ View full report in Codecov by Sentry.
Describe the changes in the pull request
Adds `float16` (FP16) input support to `QuantPreprocessor`, completing the producer side of the asymmetric SQ8 / FP16 hybrid layout consumed by the kernels added in #942.

- Storage blob layout: `[uint8 × dim] [float × N]`. Query blob layout: `[float16 × dim] [float × M]`. The FP32 path is byte-for-byte unchanged.
- All SQ8 metadata (`min`, `delta`, `sum`, `sum_squares`) is pinned to FP32 regardless of input type.
- Metadata is written via `memcpy` to handle the 2-byte-aligned offset that occurs when `dim` is odd and `DataType` is FP16.
- `QuantPreprocessor`'s `DataType` is now constrained by a C++20 `QuantInput` concept (`float` or `vecsim_types::float16`).

Which issues this PR fixes
Main objects this PR modified

- `QuantPreprocessor` and the `to_fp32` helper in `src/VecSim/spaces/computer/preprocessors.h`.
- `tests/unit/test_components.cpp`: new `QuantPreprocessorFP16MetricTest` (L2 / IP / Cosine) and `QuantPreprocessorFP16Test.QuantizeReconstructRoundTripL2` (odd-dim, in-place + out-of-place).
- `tests/unit/unit_test_utils.h`: `ComputeSQ8Quantization` baseline updated to use `memcpy` for metadata writes.

Mark if applicable
Note

Medium Risk

Changes quantization preprocessing to support `float16` inputs and alters storage/query blob layouts by pinning all SQ8 metadata to FP32 and writing it via `memcpy`, which could impact correctness/ABI expectations of downstream distance kernels if assumptions differ.

Overview
Adds `vecsim_types::float16` support to `QuantPreprocessor` via a C++20 `QuantInput` constraint, FP16→FP32 widening (`to_fp32`) for min/max and accumulation, and a fixed FP32 metadata type for both storage and query blobs.

Updates blob sizing/layout so queries keep the original element width (FP16 or FP32) while all SQ8 metadata is written/read as FP32, using `memcpy` helpers to handle potentially unaligned metadata offsets (e.g., odd `dim` with FP16).

Extends unit coverage with new parameterized FP16 tests (L2/IP/Cosine) validating the hybrid layout, metadata equivalence to an FP32 baseline, round-trip reconstruction bounds, and in-place quantization; the test quantization utility now also writes metadata via `memcpy` for alignment safety.
memcpyfor alignment safety.Reviewed by Cursor Bugbot for commit b987af2. Bugbot is set up for automated code reviews on this repo. Configure here.