feat(rust): add scalar quantizer bindings #2234

Open
jamie8johnson wants to merge 3 commits into rapidsai:main from jamie8johnson:rust-quantize-scalar

Conversation

@jamie8johnson

Contributor

Add Rust bindings for the scalar quantizer

What

This PR adds idiomatic Rust bindings for the scalar quantizer preprocessing
API (cuvsScalarQuantizer*) exposed by
c/include/cuvs/preprocessing/quantize/scalar.h.

It introduces a new preprocessing module tree under rust/cuvs/src/ and wraps
the full scalar quantizer lifecycle:

  • ScalarQuantizerParams — RAII wrapper around cuvsScalarQuantizerParams_t
    with a set_quantile builder method and a balanced Drop
    (cuvsScalarQuantizerParamsCreate / Destroy).
  • Quantizer — RAII wrapper around cuvsScalarQuantizer_t with a balanced
    Drop (cuvsScalarQuantizerCreate / Destroy) and the three operations:
    • Quantizer::train — trains the quantizer on an f32 dataset
      (cuvsScalarQuantizerTrain).
    • Quantizer::transform — quantizes an f32 matrix into i8
      (cuvsScalarQuantizerTransform).
    • Quantizer::inverse_transform — reconstructs an approximate f32 matrix
      from i8 quantized data (cuvsScalarQuantizerInverseTransform).

A supporting IntoDtype for i8 implementation was added to dlpack.rs so that
i8 quantized tensors can be passed through ManagedTensor.

The bindings follow the existing sibling module conventions exactly:
MaybeUninit + assume_init only after a successful check_cuvs, no
Copy/Clone on the handle wrappers, balanced create/destroy in Drop, and a
custom Debug impl for the params type. A runnable doc example lives in
preprocessing/quantize/mod.rs.
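
The handle conventions listed above can be illustrated without a GPU by substituting a mock for the C API. This is a minimal sketch, not the code in this PR: `mock_create`, `mock_destroy`, `check`, `Handle`, `LIVE`, and `live_handles` are all hypothetical names standing in for the `cuvsScalarQuantizer*` calls and `check_cuvs`. It only shows the pattern of reading the out-parameter after a successful status check and pairing create with destroy in Drop.

```rust
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicUsize, Ordering};

// Number of mock handles currently alive, so the create/destroy balance is observable.
static LIVE: AtomicUsize = AtomicUsize::new(0);

// Hypothetical opaque handle type standing in for a C handle.
type RawHandle = *mut u32;

// Mock of a C "create" function: writes a handle through `out`, returns 0 on success.
unsafe fn mock_create(out: *mut RawHandle) -> i32 {
    if out.is_null() {
        return 1;
    }
    unsafe { out.write(Box::into_raw(Box::new(0u32))) };
    LIVE.fetch_add(1, Ordering::SeqCst);
    0
}

// Mock of a C "destroy" function: frees the handle, returns 0 on success.
unsafe fn mock_destroy(handle: RawHandle) -> i32 {
    if handle.is_null() {
        return 1;
    }
    drop(unsafe { Box::from_raw(handle) });
    LIVE.fetch_sub(1, Ordering::SeqCst);
    0
}

// Hypothetical status check in the spirit of check_cuvs.
fn check(status: i32) -> Result<(), String> {
    if status == 0 {
        Ok(())
    } else {
        Err(format!("mock call failed with status {status}"))
    }
}

// Owning wrapper. It is deliberately neither Copy nor Clone, so the
// handle is destroyed exactly once.
pub struct Handle(RawHandle);

impl Handle {
    pub fn new() -> Result<Handle, String> {
        let mut raw = MaybeUninit::<RawHandle>::uninit();
        // The out-parameter is only read back after the status check passes.
        check(unsafe { mock_create(raw.as_mut_ptr()) })?;
        Ok(Handle(unsafe { raw.assume_init() }))
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Balanced with `new`; the status is ignored to keep the sketch short.
        let _ = check(unsafe { mock_destroy(self.0) });
    }
}

// Reports how many mock handles are alive.
pub fn live_handles() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

Creating a `Handle` in an inner scope and letting it fall out of scope should return `live_handles()` to its previous value, which is the property the "balanced Drop" wording refers to.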

Notes for reviewers

  • Establishes the preprocessing module tree. This PR creates
    preprocessing/mod.rs and preprocessing/quantize/mod.rs with only the
    scalar path wired up. The binary and product-quantizer (PQ) C APIs
    (quantize/binary.h, quantize/pq.h) are intentionally out of scope for
    this PR and are left for follow-up contributions; the module tree is laid out
    so they slot in additively next to scalar.

  • bindings.rs is unchanged and already covers scalar. The
    cuvsScalarQuantizer* symbols are already present in
    rust/cuvs-sys/src/bindings.rs (they are reachable from core/all.h, which
    already #includes cuvs/preprocessing/quantize/scalar.h), so no change to
    all.h or the bindgen wrapper was required and no symbols were removed. The
    Rust wrapper in this PR consumes the existing generated FFI declarations.
    Regenerating the bindings against an unrelated/newer local dlpack.h would
    introduce unrelated DLPack ABI doc/struct churn, so the checked-in bindings
    were deliberately left untouched.

  • Sibling-PR conflict note. This PR only adds new files plus three small
    additive lines (pub mod preprocessing; in lib.rs and the i8 dtype impl
    in dlpack.rs). Other in-flight binding PRs that add new pub mod lines to
    lib.rs or new IntoDtype impls to dlpack.rs may touch the same hunks; if
    conflicts arise they are trivial textual conflicts (independent additions),
    resolvable by keeping both sides.

Testing

Built and tested against conda libcuvs with CUDA_VISIBLE_DEVICES=1 and
--test-threads=1. New tests in preprocessing/quantize/scalar.rs:

  • test_scalar_quantizer_params — verifies the set_quantile builder updates
    the underlying C struct.
  • test_scalar_quantizer_roundtrip — trains on a random f32 dataset
    (1024×16, values in [0, 10)), quantizes to i8, then
    inverse-transforms back to f32 and asserts the reconstruction error is
    within quantization tolerance. The quantized values are also asserted to span
    most of the i8 range (confirming the transform is non-trivial). Observed max
    absolute reconstruction error: ~0.0196 on a data range of 10.0
    (≈0.2% of range, ≈half a quantization step of 10/256 ≈ 0.039), well below
    the loose epsilon (data_range / 50 = 0.2) and far below 5% × data_range.
  • test_train_unsupported_dtype_errors — exercises the C API error path by
    training on an integer dataset, which the C API rejects. (A freshly created,
    untrained quantizer has min_ == max_ == 0, which produces degenerate output
    but is not surfaced as an error by the C API, so the dtype guard is used to
    cover the error path instead.)
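
The error bound quoted for the roundtrip test can be reproduced on the CPU with a few lines of arithmetic. The sketch below assumes a plain min/max affine mapping that spreads the 256 i8 codes over 255 equal steps; this mapping, and the names `quantize`, `dequantize`, and `max_roundtrip_error`, are illustrative assumptions and may not match the cuVS kernel's exact scaling or rounding. Under this assumed scheme the worst-case error is half a step, i.e. range / 510, which is about 0.0196 for a range of 10.

```rust
/// Maps `x` in [min, max] to an i8 code (assumed affine scheme, 255 steps).
pub fn quantize(x: f32, min: f32, max: f32) -> i8 {
    let scale = 255.0 / (max - min);
    // Level index in 0..=255, rounded to the nearest step.
    let level = ((x - min) * scale).round().clamp(0.0, 255.0);
    // Shift the level so it fits the signed range -128..=127.
    (level - 128.0) as i8
}

/// Maps an i8 code back to the value at the corresponding level.
pub fn dequantize(q: i8, min: f32, max: f32) -> f32 {
    let step = (max - min) / 255.0;
    min + (q as f32 + 128.0) * step
}

/// Largest absolute difference between each value and its roundtrip.
pub fn max_roundtrip_error(data: &[f32], min: f32, max: f32) -> f32 {
    data.iter()
        .map(|&x| (x - dequantize(quantize(x, min, max), min, max)).abs())
        .fold(0.0_f32, f32::max)
}
```

Feeding this an evenly spaced grid over [0, 10) gives a maximum error just under half a step, which sits far below the test's loose epsilon of data_range / 50 = 0.2.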

The preprocessing/quantize/mod.rs doctest also runs end-to-end (train →
transform → inverse_transform).

cargo fmt is clean and cargo clippy is clean for all new code. (Two
pre-existing clippy::not_unsafe_ptr_arg_deref findings in resources.rs are
surfaced only by newer local toolchains and are unrelated to this change.)

Add idiomatic Rust bindings for the scalar quantizer preprocessing API
(cuvsScalarQuantizer*). Introduces a new preprocessing module tree under
rust/cuvs/src/ with only the scalar path wired up; binary and PQ quantizers
are intentionally left for follow-up contributions.

Wraps the full lifecycle with RAII handle types and balanced Drop:
- ScalarQuantizerParams (Create/Destroy, set_quantile builder)
- Quantizer (Create/Destroy) with train, transform, inverse_transform

Adds an IntoDtype impl for i8 in dlpack.rs so int8 quantized tensors can be
passed through ManagedTensor. The cuvsScalarQuantizer* FFI symbols are already
present in the checked-in bindings (reachable via core/all.h), so bindings.rs
is unchanged and no all.h edit was required.

Tests (CUDA_VISIBLE_DEVICES=1, single-threaded): params setter, a train ->
transform -> inverse_transform roundtrip asserting reconstruction within
quantization tolerance (observed max abs error ~0.0196 on a data range of 10),
and an unsupported-dtype error path. cargo fmt and clippy clean for new code.
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9bb8f75a-cf38-4999-b6c0-bd1c2da3e9a8

📥 Commits

Reviewing files that changed from the base of the PR and between 32a0d3e and 5ffe917.

📒 Files selected for processing (1)
  • rust/cuvs/src/preprocessing/quantize/scalar.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • rust/cuvs/src/preprocessing/quantize/scalar.rs

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added a public preprocessing module with scalar quantization: train on float datasets, transform to 8‑bit integer representations, and inverse-transform back to approximate floats.
    • Configurable quantile parameter for outlier clipping.
    • Runtime validation enforcing 8‑bit integer tensors for transform/inverse operations.
  • Tests

    • Unit tests for dtype validation and train→transform→inverse roundtrip.

Walkthrough

This PR adds a preprocessing::quantize::scalar Rust FFI wrapper: wires module exports, adds i8 DLPack support, implements ScalarQuantizerParams and Quantizer (train/transform/inverse_transform) with Rust-side i8 dtype guards, and extends unit tests for dtype validation and roundtrip behavior.

Changes

Scalar Quantization Module

| Layer / File(s) | Summary |
|---|---|
| Module structure and dtype support: rust/cuvs/src/lib.rs, rust/cuvs/src/preprocessing/mod.rs, rust/cuvs/src/preprocessing/quantize/mod.rs, rust/cuvs/src/dlpack.rs | Module hierarchy added and IntoDtype implemented for i8 to enable int8 DLPack tensors. |
| Scalar quantizer types and methods: rust/cuvs/src/preprocessing/quantize/scalar.rs | Adds ScalarQuantizerParams (new, set_quantile, Drop) and Quantizer (train, transform, inverse_transform, Drop) with Rust i8 dtype guards; includes unit tests for params, roundtrip reconstruction, and dtype-guard error cases. |

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat(rust): add scalar quantizer bindings' directly and clearly describes the main change: adding Rust bindings for the scalar quantizer, which aligns with the core objective of the PR. |
| Description check | ✅ Passed | The description provides comprehensive details about the PR including what was added (ScalarQuantizerParams, Quantizer wrappers), how it works, notes for reviewers about scope and module structure, and testing details - all directly related to the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
rust/cuvs/src/preprocessing/quantize/scalar.rs (1)

38-43: ⚡ Quick win

Consider validating the quantile range.

The documentation states that quantile must be within (0, 1], but the method doesn't validate this constraint. While the C API may perform its own validation, checking the range here would provide immediate feedback to users.

🛡️ Suggested validation
 pub fn set_quantile(self, quantile: f32) -> ScalarQuantizerParams {
+    assert!(quantile > 0.0 && quantile <= 1.0, "quantile must be in range (0, 1]");
     unsafe {
         (*self.0).quantile = quantile;
     }

Alternatively, for a more ergonomic API, return Result<Self> to allow callers to handle the error gracefully.
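
As a rough illustration of the Result-returning alternative, the sketch below validates the range inside a consuming builder method. `Params`, `QuantileRangeError`, and the placeholder default of 1.0 are hypothetical; the real ScalarQuantizerParams wraps a C pointer rather than a plain field, and nothing here reflects the defaults of the C API.

```rust
/// Hypothetical error carrying the rejected value.
#[derive(Debug, PartialEq)]
pub struct QuantileRangeError(pub f32);

/// Hypothetical stand-in for the params wrapper, holding the field directly.
#[derive(Debug)]
pub struct Params {
    quantile: f32,
}

impl Params {
    pub fn new() -> Self {
        // Placeholder default chosen for the sketch only.
        Params { quantile: 1.0 }
    }

    /// Builder-style setter that accepts only values in (0, 1].
    pub fn set_quantile(mut self, quantile: f32) -> Result<Self, QuantileRangeError> {
        // NaN fails both comparisons, so it is rejected as well.
        if quantile > 0.0 && quantile <= 1.0 {
            self.quantile = quantile;
            Ok(self)
        } else {
            Err(QuantileRangeError(quantile))
        }
    }

    pub fn quantile(&self) -> f32 {
        self.quantile
    }
}
```

The trade-off is that every call site gains a `?` or an `unwrap`, which is why a change like this tends to be applied to all setters in a crate at once rather than to a single one.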

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/src/preprocessing/quantize/scalar.rs` around lines 38 - 43, The
set_quantile method on ScalarQuantizerParams currently accepts any f32; add a
runtime check that the provided quantile is > 0.0 and <= 1.0 inside set_quantile
(before writing to (*self.0).quantile) and return an error on invalid
input—either change the signature to return Result<ScalarQuantizerParams,
QuantileRangeError> (preferred for ergonomics) or explicitly panic with a clear
message; reference the set_quantile method and ScalarQuantizerParams when making
this change so callers get immediate, validated feedback.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rust/cuvs/src/preprocessing/quantize/scalar.rs`:
- Around line 53-60: The Drop implementation for ScalarQuantizerParams may
double-panic because write!(stderr(), ...) is followed by expect(...); update
the drop method to avoid panicking on write failure by ignoring the write result
(e.g., use let _ = write!(stderr(), "failed to call
cuvsScalarQuantizerParamsDestroy {:?}", e)) so that errors writing to stderr
cannot cause a panic during unwinding; locate the impl Drop for
ScalarQuantizerParams and replace the expect(...) usage after write!(stderr(),
...) accordingly.
- Around line 114-128: The transform() wrapper must validate that the output
ManagedTensor has dtype i8 before calling ffi::cuvsScalarQuantizerTransform to
avoid UB; update scalar.rs::transform to read the dtype from the out
ManagedTensor, return an Err (or a Result::Err with a clear message) if dtype is
not i8, and only call check_cuvs(ffi::cuvsScalarQuantizerTransform(...)) when
the dtype check passes; reference Resources, ManagedTensor, transform,
check_cuvs, and ffi::cuvsScalarQuantizerTransform when locating where to add the
validation.
- Around line 158-165: In the Drop implementation for Quantizer (impl Drop for
Quantizer) avoid a potential double-panic by removing the .expect on the write!
call; when check_cuvs(unsafe { ffi::cuvsScalarQuantizerDestroy(self.0) })
returns Err(e) write the error to stderr using let _ = write!(stderr(), "failed
to call cuvsScalarQuantizerDestroy {:?}", e) (or otherwise ignore the Result)
instead of calling .expect so write failures during unwinding are silenced.
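
The Drop-logging suggestion in the two Drop comments above can be sketched in isolation. All names here (`log_best_effort`, `mock_destroy`, `Guard`, `BrokenSink`) are hypothetical and only model the shape of the change: the write result is discarded, so a failing stderr cannot turn a destroy error into a second panic while the stack is already unwinding.

```rust
use std::io::{self, Write};

/// Writes a diagnostic to `sink` and deliberately discards any write error,
/// so logging can never panic.
pub fn log_best_effort<W: Write>(mut sink: W, what: &str, err: &str) {
    let _ = writeln!(sink, "failed to call {what}: {err:?}");
}

/// Hypothetical destroy call that can be told to fail.
fn mock_destroy(should_fail: bool) -> Result<(), String> {
    if should_fail {
        Err("mock failure".to_string())
    } else {
        Ok(())
    }
}

/// Hypothetical owner whose Drop reports destroy errors without panicking.
pub struct Guard {
    pub should_fail: bool,
}

impl Drop for Guard {
    fn drop(&mut self) {
        if let Err(e) = mock_destroy(self.should_fail) {
            log_best_effort(io::stderr(), "mock_destroy", &e);
        }
    }
}

/// A writer whose writes always fail, modelling an unavailable stderr.
pub struct BrokenSink;

impl Write for BrokenSink {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "sink closed"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Passing `BrokenSink` to `log_best_effort` returns normally, whereas the same write followed by `.expect(...)` would panic.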


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e50d251e-48f7-4d1d-8088-a8042f025fb3

📥 Commits

Reviewing files that changed from the base of the PR and between 78135be and 54351f8.

📒 Files selected for processing (5)
  • rust/cuvs/src/dlpack.rs
  • rust/cuvs/src/lib.rs
  • rust/cuvs/src/preprocessing/mod.rs
  • rust/cuvs/src/preprocessing/quantize/mod.rs
  • rust/cuvs/src/preprocessing/quantize/scalar.rs

…rm, non-panicking Drop logging

The C API reinterprets i8 buffers without dtype validation; guard both
the transform output and the inverse_transform input Rust-side so a
wrong-dtype tensor surfaces as InvalidArgument instead of memory
corruption. Drop logging switched from .expect to best-effort write to
avoid a double-panic during unwinding (sibling modules share the old
pattern; happy to sweep them in a follow-up).
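
A minimal sketch of the Rust-side dtype guard described in this commit message follows. `DType`, `Tensor`, `Error`, and `ensure_dtype` are hypothetical simplifications; the real wrapper inspects a DLPack-backed ManagedTensor and then calls into the C API, both of which are omitted here. The point is only the ordering: the check runs before any FFI call, and a mismatch becomes an ordinary error value.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DType {
    F32,
    I8,
}

/// Hypothetical error type with a single variant carrying a message.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    InvalidArgument(String),
}

/// Hypothetical tensor descriptor reduced to its dtype tag.
pub struct Tensor {
    pub dtype: DType,
}

/// Returns InvalidArgument, labelled with the offending argument, on a mismatch.
fn ensure_dtype(tensor: &Tensor, expected: DType, label: &str) -> Result<(), Error> {
    if tensor.dtype == expected {
        Ok(())
    } else {
        Err(Error::InvalidArgument(format!(
            "{label} must be {expected:?}, got {:?}",
            tensor.dtype
        )))
    }
}

/// Guards the output tensor before the (omitted) FFI transform call.
pub fn transform(_input: &Tensor, output: &Tensor) -> Result<(), Error> {
    ensure_dtype(output, DType::I8, "transform output")?;
    // A real wrapper would make the FFI call here, after the guard passes.
    Ok(())
}

/// Guards the input tensor before the (omitted) FFI inverse-transform call.
pub fn inverse_transform(input: &Tensor, _output: &Tensor) -> Result<(), Error> {
    ensure_dtype(input, DType::I8, "inverse_transform input")?;
    // A real wrapper would make the FFI call here, after the guard passes.
    Ok(())
}
```

Only the i8 side of each call is guarded in this sketch, mirroring the two guards the commit message names.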
@jamie8johnson

Contributor Author

On the quantile-range nitpick (set_quantile, range (0, 1]): leaving it unvalidated deliberately — the C API doesn't enforce the range either, and the crate's builder setters (cagra, ivf_*) uniformly pass values through without validation, so a lone guard here would diverge from the established pattern. The docstring states the valid range. Happy to add a check (or a crate-wide setter-validation sweep as a follow-up) if maintainers prefer.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rust/cuvs/src/preprocessing/quantize/scalar.rs`:
- Around line 298-303: Update the two assertions to verify the specific guard
error instead of any error: call quantizer.transform(...) and
quantizer.inverse_transform(...) and assert that the returned Err matches
Error::InvalidArgument (and check the argument label text that identifies the
bad tensor/dtype), e.g. by pattern-matching the Result to ensure the error kind
is InvalidArgument and the error message contains the expected label for the
non-i8 tensor; do this for both the transform and inverse_transform checks
(referencing quantizer.transform and quantizer.inverse_transform).
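
One way to express the stricter assertion requested here is `matches!` with a guard on the message. The `Error` enum and the helper `is_guard_error` below are hypothetical; the crate's actual error type and message text are not shown in this thread, so the variant shape and label strings are assumptions.

```rust
/// Hypothetical error type with more than one variant, so that
/// "any error" and "the guard error" are distinguishable.
#[derive(Debug)]
pub enum Error {
    InvalidArgument(String),
    Other(String),
}

/// True only for an InvalidArgument error whose message mentions `label`.
pub fn is_guard_error<T>(result: &Result<T, Error>, label: &str) -> bool {
    matches!(result, Err(Error::InvalidArgument(msg)) if msg.contains(label))
}
```

A test would then use `assert!(is_guard_error(&result, "output"))` in place of `assert!(result.is_err())`, so an unrelated failure from the same call no longer satisfies the assertion.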

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 24b52c4a-efc6-44de-a86b-39e4e1c439a9

📥 Commits

Reviewing files that changed from the base of the PR and between 54351f8 and 32a0d3e.

📒 Files selected for processing (1)
  • rust/cuvs/src/preprocessing/quantize/scalar.rs
