Conversation
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
❌ Jit Scanner failed - Our team is investigatingJit Scanner failed - Our team has been notified and is working to resolve the issue. Please contact support if you have any questions. 💡 Need to bypass this check? Comment |
There was a problem hiding this comment.
Pull request overview
Adds asymmetric SQ8-storage → FP16-query distance support (scalar path) for L2, Inner Product, and Cosine in VecSim, including dispatch plumbing plus unit tests and benchmarks.
Changes:
- Implement scalar SQ8↔FP16 kernels for IP/Cosine and L2² using stored/query FP32 metadata (sum / sum-of-squares) and algebraic identities.
- Add SQ8→FP16
GetDistFuncspecialization and metric-specific dispatchers (currently scalar-only; SIMD chooser slots reserved). - Add test utilities, unit tests (including edge cases + parameterized dimensions), and a new benchmark target for SQ8→FP16 spaces.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/VecSim/spaces/IP/IP.cpp |
Adds SQ8→FP16 inner-product implementation used by IP/Cosine/L2. |
src/VecSim/spaces/IP/IP.h |
Declares SQ8→FP16 IP/Cosine APIs and shared IP implementation. |
src/VecSim/spaces/L2/L2.cpp |
Adds SQ8→FP16 L2² implementation using precomputed norms + IP. |
src/VecSim/spaces/L2/L2.h |
Declares SQ8→FP16 L2² API. |
src/VecSim/spaces/IP_space.cpp |
Adds SQ8→FP16 dispatcher stubs returning scalar implementations. |
src/VecSim/spaces/IP_space.h |
Declares SQ8→FP16 IP/Cosine dispatcher APIs. |
src/VecSim/spaces/L2_space.cpp |
Adds SQ8→FP16 L2 dispatcher stub returning scalar implementation. |
src/VecSim/spaces/L2_space.h |
Declares SQ8→FP16 L2 dispatcher API. |
src/VecSim/spaces/spaces.cpp |
Wires GetDistFunc<sq8, float, float16> to the new dispatchers. |
tests/utils/tests_utils.h |
Adds SQ8→FP16 query population/preprocess helpers and baselines. |
tests/unit/test_spaces.cpp |
Adds SQ8→FP16 correctness tests (basic, getters, param dims, edge cases). |
tests/benchmark/spaces_benchmarks/bm_spaces_sq8_fp16.cpp |
Introduces SQ8→FP16 scalar benchmarks for L2/IP/Cosine. |
tests/benchmark/CMakeLists.txt |
Registers the new bm_spaces_sq8_fp16 benchmark target. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 68703d4. Configure here.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #942 +/- ##
==========================================
+ Coverage 96.71% 96.93% +0.22%
==========================================
Files 129 130 +1
Lines 8057 7712 -345
==========================================
- Hits 7792 7476 -316
+ Misses 265 236 -29 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ccess - Add load_unaligned<T> helper to VecSim/utils/alignment.h. - Refactor SQ8_FP16_InnerProduct_Impl and SQ8_FP16_L2Sqr to use it for the misaligned FP32 metadata reads, replacing inline std::memcpy. - Apply the same helper in tests/utils/tests_utils.h reference helpers (SQ8_FP16_NotOptimized_InnerProduct, SQ8_FP16_NotOptimized_L2Sqr), fixing the same alignment UB pattern that remained in the test path. - Add SQ8_FP16_l2sqr_odd_dim_unaligned_metadata_test that exercises the L2 kernel with deterministically misaligned storage and query metadata addresses (asserts the metadata bytes are not 4-byte aligned before invoking the kernel).
…6 implementations
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address review comments on SQ8_FP16 test alignment: - Allocate FP16 query buffers in test_spaces.cpp / bm_spaces_sq8_fp16.cpp as std::vector<float16> / new float16[] (with extra slots for FP32 metadata) so the production SQ8_FP16_* kernels' typed float16* loads never see a misaligned pointer. - Harden the test-only helpers (populate/preprocess_sq8_fp16_query and SQ8_FP16_NotOptimized_*) to access float16 values via memcpy on uint16_t, so they remain safe under any caller alignment.
| template <typename T> | ||
| static inline T load_unaligned(const void *ptr) { | ||
| static_assert(std::is_trivially_copyable_v<T>, | ||
| "load_unaligned requires a trivially-copyable T"); | ||
| T value; | ||
| std::memcpy(&value, ptr, sizeof(T)); | ||
| return value; | ||
| } |
There was a problem hiding this comment.
Same alignment concern as #discussion_r3161710275, but on the existing SQ8_FP32 / SQ8_SQ8 kernels — they read FP32 metadata via a direct reinterpret_cast<const float*> and have the same misalignment for odd dim. Can we either apply load_unaligned there too in this PR, or open a follow-up so we don't end up with two conventions for the same problem in adjacent functions?
There was a problem hiding this comment.
You're right.
This is not in the scope of this task - I opened a ticket for this:
https://redislabs.atlassian.net/browse/MOD-15303

Describe the changes in the pull request
Adds asymmetric SQ8-to-FP16 distance support (scalar path only) for L2, IP, and Cosine, where stored vectors are SQ8-quantized (uint8 + FP32 metadata) and queries are FP16. This is the first step (P1a) of MOD-15141; SIMD kernels (P1b/P1c) and
QuantPreprocessorFP16 input support (P1a-2) are intentionally out of scope and tracked as follow-ups.The scalar kernels use the algebraic identity
dist = ||x||² + ||y||² - 2 * IP(x, y)with precomputed FP32 sums-of-squares stored alongside both the SQ8 storage and FP16 query, which keeps the quantization-error path consistent between the kernel and the test baseline. Each FP16 query element is widened to FP32 during accumulation; SQ8 metadata (min,delta,x_sum,x_sum_squares) and FP16 query metadata (y_sum,y_sum_squares) are read as FP32.The new
GetDistFunc<sq8, float, float16>specialization routes throughIP_SQ8_FP16_GetDistFunc,Cosine_SQ8_FP16_GetDistFunc, andL2_SQ8_FP16_GetDistFunc. These dispatchers currently always return the scalar implementation; SIMD chooser slots are reserved for P1b (MOD-15152) and P1c (MOD-15153).Which issues this PR fixes
Main objects this PR modified
src/VecSim/spaces/IP/IP.{h,cpp}— addsSQ8_FP16_InnerProduct_Impl,SQ8_FP16_InnerProduct, andSQ8_FP16_Cosine.src/VecSim/spaces/L2/L2.{h,cpp}— addsSQ8_FP16_L2Sqrusing the algebraic identity over the precomputed sums-of-squares.src/VecSim/spaces/IP_space.{h,cpp}— addsIP_SQ8_FP16_GetDistFuncandCosine_SQ8_FP16_GetDistFuncdispatchers.src/VecSim/spaces/L2_space.{h,cpp}— addsL2_SQ8_FP16_GetDistFuncdispatcher.src/VecSim/spaces/spaces.cpp— adds theGetDistFunc<vecsim_types::sq8, float, vecsim_types::float16>specialization wiring L2/IP/Cosine to the new dispatchers.tests/utils/tests_utils.h— addspopulate_sq8_fp16_queryandSQ8_FP16_NotOptimized_{InnerProduct,Cosine,L2Sqr}baselines (the L2 baseline mirrors the kernel's algebraic identity to avoid quantization-error drift at highdim).tests/unit/test_spaces.cpp— addsSQ8_FP16_NoOptimizationSpacesTestparameterized over 19 dimensions (including odd and SIMD-boundary residues: 1, 5, 7, 8, 9, 15, 16, 17, 31, 32, 33, 47, 48, 49, 63, 64, 65, 127, 128) for L2/IP/Cosine, plusSQ8_FP16_EdgeCases(zero query, constant storage, mixed-sign),SpacesTest.GetDistFuncSQ8FP16Asymmetric, andSpacesTest.GetDistFuncInvalidMetricSQ8ToFP16.tests/benchmark/spaces_benchmarks/bm_spaces_sq8_fp16.cpp(new) +tests/benchmark/CMakeLists.txt— registers naive scalar benchmarks for L2/IP/Cosine; SIMD benchmark blocks will be added by P1b/P1c when the chooser symbols become available.Test results
test_spacestests pass locally.bm_spaces_sq8_fp16builds and runs successfully.Follow-ups (separate stories, out of scope here)
QuantPreprocessorFP16 input support (no plumbing into the index types yet).Mark if applicable
Note
Medium Risk
Touches core distance computation and dispatch for a new datatype combination; primary risk is subtle correctness/UB issues around mixed-type layouts and unaligned metadata reads, mitigated by added
load_unalignedand broad test coverage.Overview
Adds asymmetric SQ8 storage + FP16 query support for
IP,Cosine, andL2by introducing new scalar kernels (SQ8_FP16_InnerProduct_Impl/SQ8_FP16_InnerProduct/SQ8_FP16_CosineandSQ8_FP16_L2Sqr) that widen FP16 elements to FP32 and use precomputed SQ8/query metadata.Wires the new mode into the public dispatch path via
GetDistFunc<sq8, float, float16>and new*_SQ8_FP16_GetDistFunchelpers (currently always returning scalar implementations), and addsload_unalignedto safely read trailing FP32 metadata for odd dimensions.Extends validation and performance tooling with a new
bm_spaces_sq8_fp16benchmark target plus unit tests (including odd-dimension unaligned-metadata and edge cases) and new SQ8-FP16 baseline/query-population utilities intests_utils.h.Reviewed by Cursor Bugbot for commit 304450d. Bugbot is set up for automated code reviews on this repo. Configure here.