feat(rust): add scalar quantizer bindings#2234
Conversation
Add idiomatic Rust bindings for the scalar quantizer preprocessing API (cuvsScalarQuantizer*). Introduces a new preprocessing module tree under rust/cuvs/src/ with only the scalar path wired up; binary and PQ quantizers are intentionally left for follow-up contributions. Wraps the full lifecycle with RAII handle types and balanced Drop: - ScalarQuantizerParams (Create/Destroy, set_quantile builder) - Quantizer (Create/Destroy) with train, transform, inverse_transform Adds an IntoDtype impl for i8 in dlpack.rs so int8 quantized tensors can be passed through ManagedTensor. The cuvsScalarQuantizer* FFI symbols are already present in the checked-in bindings (reachable via core/all.h), so bindings.rs is unchanged and no all.h edit was required. Tests (CUDA_VISIBLE_DEVICES=1, single-threaded): params setter, a train -> transform -> inverse_transform roundtrip asserting reconstruction within quantization tolerance (observed max abs error ~0.0196 on a data range of 10), and an unsupported-dtype error path. cargo fmt and clippy clean for new code.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR adds a preprocessing::quantize::scalar Rust FFI wrapper: wires module exports, adds ChangesScalar Quantization Module
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
rust/cuvs/src/preprocessing/quantize/scalar.rs (1)
38-43: ⚡ Quick winConsider validating the quantile range.
The documentation states that
quantilemust be within(0, 1], but the method doesn't validate this constraint. While the C API may perform its own validation, checking the range here would provide immediate feedback to users.🛡️ Suggested validation
pub fn set_quantile(self, quantile: f32) -> ScalarQuantizerParams { + assert!(quantile > 0.0 && quantile <= 1.0, "quantile must be in range (0, 1]"); unsafe { (*self.0).quantile = quantile; }Alternatively, for a more ergonomic API, return
Result<Self>to allow callers to handle the error gracefully.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rust/cuvs/src/preprocessing/quantize/scalar.rs` around lines 38 - 43, The set_quantile method on ScalarQuantizerParams currently accepts any f32; add a runtime check that the provided quantile is > 0.0 and <= 1.0 inside set_quantile (before writing to (*self.0).quantile) and return an error on invalid input—either change the signature to return Result<ScalarQuantizerParams, QuantileRangeError> (preferred for ergonomics) or explicitly panic with a clear message; reference the set_quantile method and ScalarQuantizerParams when making this change so callers get immediate, validated feedback.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@rust/cuvs/src/preprocessing/quantize/scalar.rs`:
- Around line 53-60: The Drop implementation for ScalarQuantizerParams may
double-panic because write!(stderr(), ...) is followed by expect(...); update
the drop method to avoid panicking on write failure by ignoring the write result
(e.g., use let _ = write!(stderr(), "failed to call
cuvsScalarQuantizerParamsDestroy {:?}", e)) so that errors writing to stderr
cannot cause a panic during unwinding; locate the impl Drop for
ScalarQuantizerParams and replace the expect(...) usage after write!(stderr(),
...) accordingly.
- Around line 114-128: The transform() wrapper must validate that the output
ManagedTensor has dtype i8 before calling ffi::cuvsScalarQuantizerTransform to
avoid UB; update scalar.rs::transform to read the dtype from the out
ManagedTensor, return an Err (or a Result::Err with a clear message) if dtype is
not i8, and only call check_cuvs(ffi::cuvsScalarQuantizerTransform(...)) when
the dtype check passes; reference Resources, ManagedTensor, transform,
check_cuvs, and ffi::cuvsScalarQuantizerTransform when locating where to add the
validation.
- Around line 158-165: In the Drop implementation for Quantizer (impl Drop for
Quantizer) avoid a potential double-panic by removing the .expect on the write!
call; when check_cuvs(unsafe { ffi::cuvsScalarQuantizerDestroy(self.0) })
returns Err(e) write the error to stderr using let _ = write!(stderr(), "failed
to call cuvsScalarQuantizerDestroy {:?}", e) (or otherwise ignore the Result)
instead of calling .expect so write failures during unwinding are silenced.
---
Nitpick comments:
In `@rust/cuvs/src/preprocessing/quantize/scalar.rs`:
- Around line 38-43: The set_quantile method on ScalarQuantizerParams currently
accepts any f32; add a runtime check that the provided quantile is > 0.0 and <=
1.0 inside set_quantile (before writing to (*self.0).quantile) and return an
error on invalid input—either change the signature to return
Result<ScalarQuantizerParams, QuantileRangeError> (preferred for ergonomics) or
explicitly panic with a clear message; reference the set_quantile method and
ScalarQuantizerParams when making this change so callers get immediate,
validated feedback.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e50d251e-48f7-4d1d-8088-a8042f025fb3
📒 Files selected for processing (5)
rust/cuvs/src/dlpack.rsrust/cuvs/src/lib.rsrust/cuvs/src/preprocessing/mod.rsrust/cuvs/src/preprocessing/quantize/mod.rsrust/cuvs/src/preprocessing/quantize/scalar.rs
…rm, non-panicking Drop logging The C API reinterprets i8 buffers without dtype validation; guard both the transform output and the inverse_transform input Rust-side so a wrong-dtype tensor surfaces as InvalidArgument instead of memory corruption. Drop logging switched from .expect to best-effort write to avoid a double-panic during unwinding (sibling modules share the old pattern; happy to sweep them in a follow-up).
|
On the quantile-range nitpick ( |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@rust/cuvs/src/preprocessing/quantize/scalar.rs`:
- Around line 298-303: Update the two assertions to verify the specific guard
error instead of any error: call quantizer.transform(...) and
quantizer.inverse_transform(...) and assert that the returned Err matches
Error::InvalidArgument (and check the argument label text that identifies the
bad tensor/dtype), e.g. by pattern-matching the Result to ensure the error kind
is InvalidArgument and the error message contains the expected label for the
non-i8 tensor; do this for both the transform and inverse_transform checks
(referencing quantizer.transform and quantizer.inverse_transform).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 24b52c4a-efc6-44de-a86b-39e4e1c439a9
📒 Files selected for processing (1)
rust/cuvs/src/preprocessing/quantize/scalar.rs
Add Rust bindings for the scalar quantizer
What
This PR adds idiomatic Rust bindings for the scalar quantizer preprocessing
API (
cuvsScalarQuantizer*) exposed byc/include/cuvs/preprocessing/quantize/scalar.h.It introduces a new
preprocessingmodule tree underrust/cuvs/src/and wrapsthe full scalar quantizer lifecycle:
ScalarQuantizerParams— RAII wrapper aroundcuvsScalarQuantizerParams_twith a
set_quantilebuilder method and a balancedDrop(
cuvsScalarQuantizerParamsCreate/Destroy).Quantizer— RAII wrapper aroundcuvsScalarQuantizer_twith a balancedDrop(cuvsScalarQuantizerCreate/Destroy) and the three operations:Quantizer::train— trains the quantizer on anf32dataset(
cuvsScalarQuantizerTrain).Quantizer::transform— quantizes anf32matrix intoi8(
cuvsScalarQuantizerTransform).Quantizer::inverse_transform— reconstructs an approximatef32matrixfrom
i8quantized data (cuvsScalarQuantizerInverseTransform).A supporting
IntoDtype for i8implementation was added todlpack.rsso thati8quantized tensors can be passed throughManagedTensor.The bindings follow the existing sibling module conventions exactly:
MaybeUninit+assume_initonly after a successfulcheck_cuvs, noCopy/Cloneon the handle wrappers, balanced create/destroy inDrop, and acustom
Debugimpl for the params type. A runnable doc example lives inpreprocessing/quantize/mod.rs.Notes for reviewers
Establishes the
preprocessingmodule tree. This PR createspreprocessing/mod.rsandpreprocessing/quantize/mod.rswith only thescalarpath wired up. The binary and product-quantizer (PQ) C APIs(
quantize/binary.h,quantize/pq.h) are intentionally out of scope forthis PR and are left for follow-up contributions; the module tree is laid out
so they slot in additively next to
scalar.bindings.rsis additive / already covers scalar. ThecuvsScalarQuantizer*symbols are already present inrust/cuvs-sys/src/bindings.rs(they are reachable fromcore/all.h, whichalready
#includescuvs/preprocessing/quantize/scalar.h), so no change toall.hor the bindgen wrapper was required and no symbols were removed. TheRust wrapper in this PR consumes the existing generated FFI declarations.
Regenerating the bindings against an unrelated/newer local
dlpack.hwouldintroduce unrelated DLPack ABI doc/struct churn, so the checked-in bindings
were deliberately left untouched.
Sibling-PR conflict note. This PR only adds new files plus three small
additive lines (
pub mod preprocessing;inlib.rsand thei8dtype implin
dlpack.rs). Other in-flight binding PRs that add newpub modlines tolib.rsor newIntoDtypeimpls todlpack.rsmay touch the same hunks; ifconflicts arise they are trivial textual conflicts (independent additions),
resolvable by keeping both sides.
Testing
Built and tested against conda
libcuvswithCUDA_VISIBLE_DEVICES=1and--test-threads=1. New tests inpreprocessing/quantize/scalar.rs:test_scalar_quantizer_params— verifies theset_quantilebuilder updatesthe underlying C struct.
test_scalar_quantizer_roundtrip— trains on a randomf32dataset(1024×16, values in
[0, 10)), quantizes toi8, theninverse-transforms back to
f32and asserts the reconstruction error iswithin quantization tolerance. The quantized values are also asserted to span
most of the
i8range (confirming the transform is non-trivial). Observed maxabsolute reconstruction error: ~0.0196 on a data range of 10.0
(≈0.2% of range, ≈half a quantization step of
10/256 ≈ 0.039), well belowthe loose epsilon (
data_range / 50 = 0.2) and far below5% × data_range.test_train_unsupported_dtype_errors— exercises the C API error path bytraining on an integer dataset, which the C API rejects. (A freshly created,
untrained quantizer has
min_ == max_ == 0, which produces degenerate outputbut is not surfaced as an error by the C API, so the dtype guard is used to
cover the error path instead.)
The
preprocessing/quantize/mod.rsdoctest also runs end-to-end (train →transform → inverse_transform).
cargo fmtis clean andcargo clippyis clean for all new code. (Twopre-existing
clippy::not_unsafe_ptr_arg_dereffindings inresources.rsaresurfaced only by newer local toolchains and are unrelated to this change.)