fix: support quantization ranges for int8/uint8 sentence-transformers… #3543

Open
DebadityaHait wants to merge 2 commits into deepset-ai:main from DebadityaHait:fix-st-int8-quantization-ranges

Conversation

@DebadityaHait

Related Issues

Proposed Changes:

With precision="int8" or "uint8", sentence-transformers calibrates the scalar-quantization min/max range from the batch being encoded. For a single text (as in SentenceTransformersTextEmbedder.run), the range is degenerate (min == max in every dimension), so every value quantizes to the same number and the embedding is meaningless (e.g. all zeros).

  • SentenceTransformersTextEmbedder and SentenceTransformersDocumentEmbedder accept a new optional quantization_ranges init parameter — explicit calibration ranges with shape (2, embedding_dim) (mins in first row, maxs in second).
  • _SentenceTransformersEmbeddingBackend.embed encodes in float32 and forwards the ranges to sentence_transformers.quantize_embeddings when a quantized precision is used with explicit ranges; behavior is unchanged when quantization_ranges is None.
  • A warning is logged when int8/uint8 is used without ranges, pointing users at the parameter.
  • quantization_ranges is included in to_dict/from_dict serialization.
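
To illustrate why a single-text batch degenerates and how explicit ranges avoid it, here is a minimal, dependency-free sketch of per-dimension scalar quantization. It is not the sentence-transformers implementation (which operates on NumPy arrays); the helper names `calibrate_ranges` and `quantize_int8`, the sample values, and the choice to emit 0 for a zero-width range are all assumptions made for illustration.

```python
def calibrate_ranges(batch):
    """Return per-dimension (mins, maxs) computed from the batch itself."""
    dims = range(len(batch[0]))
    mins = [min(row[d] for row in batch) for d in dims]
    maxs = [max(row[d] for row in batch) for d in dims]
    return mins, maxs


def quantize_int8(batch, ranges):
    """Map each float onto 256 buckets spanning [min, max], shifted to int8."""
    mins, maxs = ranges
    quantized = []
    for row in batch:
        out = []
        for x, lo, hi in zip(row, mins, maxs):
            step = (hi - lo) / 255
            if step == 0:
                # Degenerate range: no information survives. Emitting 0 is an
                # assumption of this sketch, chosen to match the reported symptom.
                out.append(0)
                continue
            bucket = int((x - lo) / step) - 128
            out.append(max(-128, min(127, bucket)))  # clamp to the int8 range
        quantized.append(out)
    return quantized


single = [[0.12, -0.40, 0.33]]  # one text, three dimensions (made-up values)

# Ranges calibrated from the single-row batch: min == max in every dimension.
degenerate = quantize_int8(single, calibrate_ranges(single))

# Explicit ranges with shape (2, embedding_dim): mins first, maxs second.
explicit_ranges = ([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0])
calibrated = quantize_int8(single, explicit_ranges)

print(degenerate)
print(calibrated)
```

With batch-derived ranges the single row collapses to identical values, while the explicit ranges keep the three dimensions distinguishable. In practice the ranges would presumably come from a representative calibration set rather than a fixed [-1, 1] interval.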

How did you test it?

  • New unit tests: init/serialization, kwarg forwarding from both embedders, backend quantization path (with and without ranges), warning emission.
  • New integration test: real tiny model (sentence-transformers-testing/stsb-bert-tiny-safetensors) with int8 + ranges produces a non-degenerate integer embedding.
  • hatch run test:unit (162 passed), hatch run test:integration -k quantization (2 passed), hatch run test:types and hatch run fmt clean.

Notes for the reviewer

Checklist

@DebadityaHait DebadityaHait requested a review from a team as a code owner July 3, 2026 07:43
@DebadityaHait DebadityaHait requested review from julian-risch and removed request for a team July 3, 2026 07:43
@CLAassistant

CLAassistant commented Jul 3, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

Heads-up for maintainers

This PR is from a fork and touches integrations whose integration tests require API keys.
Those tests are skipped in CI because fork PRs don't have access to repo secrets for security reasons.

Affected integrations:

  • sentence_transformers

Please run the integration tests locally (hatch run test:integration inside each folder) before approving.

@github-actions github-actions Bot added the type:documentation Improvements or additions to documentation label Jul 3, 2026
Labels

integration:sentence-transformers type:documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SentenceTransformersTextEmbedder zero embeddings for single text queries when using precision="int8"

2 participants