feat: Add lightonai/LateOn model #643

Open
tekumara wants to merge 6 commits into qdrant:main from tekumara:add-lateon-support

Conversation

@tekumara tekumara commented Jun 1, 2026

LateOn beats every existing ColBERT model, including models 4× its size (Jina ColBERT v2 at 559M parameters, Arctic Embed L v2 at 568M parameters).

Resolves #641

See these commits (which have been reverted so as not to pollute the codebase):

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  • Does your submission pass the existing tests?
  • Have you added tests for your feature?
  • Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

New models submission:

  • Have you added an explanation of why it's important to include this model?
  • Have you added tests for the new model? Were canonical values for tests computed via the original model?
  • Have you added the code snippet for how canonical values were computed?
  • Have you successfully run tests with your changes locally?

coderabbitai Bot commented Jun 1, 2026

📝 Walkthrough

This PR adds support for the LateOn late-interaction embedding model from lightonai. The change introduces a new LateOn class that extends Colbert, customizes ONNX model loading with LateOn-specific tokenizer configuration, overrides post-processing to normalize query embeddings, implements token counting for query text, and registers the model in the embeddings registry. Test cases validate the model's canonical embeddings and 128-dimensional output size.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: Add lightonai/LateOn model' accurately and concisely summarizes the main change: adding support for a new late-interaction embedding model. |
| Description check | ✅ Passed | The description provides context about LateOn's advantages, references the resolved issue #641, includes verification of test compliance, and documents the methodology for canonical values computation. |
| Linked Issues check | ✅ Passed | The PR fulfills issue #641 by implementing support for the lightonai/LateOn model with proper integration into the late-interaction embedding framework, including model registration, ONNX loading, and test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to adding LateOn support: new LateOn implementation, registry update, and test additions with canonical values for the new model. |


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
fastembed/late_interaction/lateon.py (1)

81-81: ⚡ Quick win

Consider adding strict=True to the zip call to catch batch-size mismatches.

Line 81 assumes output.model_output and output.attention_mask have the same length. Since the project requires Python >=3.10.0, adding strict=True is compatible and will fail fast on mismatches.

🔒 Proposed improvement
-        for embedding, attention_mask in zip(output.model_output, output.attention_mask):
+        for embedding, attention_mask in zip(output.model_output, output.attention_mask, strict=True):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fastembed/late_interaction/lateon.py` at line 81, The loop zipping
output.model_output and output.attention_mask should fail fast on length
mismatches; update the zip call in lateon.py (the line iterating "for embedding,
attention_mask in zip(output.model_output, output.attention_mask):") to use
zip(..., strict=True) so Python raises an error if batch sizes differ, ensuring
mismatched output.model_output and output.attention_mask are caught immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@fastembed/late_interaction/lateon.py`:
- Line 81: The loop zipping output.model_output and output.attention_mask should
fail fast on length mismatches; update the zip call in lateon.py (the line
iterating "for embedding, attention_mask in zip(output.model_output,
output.attention_mask):") to use zip(..., strict=True) so Python raises an error
if batch sizes differ, ensuring mismatched output.model_output and
output.attention_mask are caught immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3e53e071-1129-4bd3-aa93-66bb7034bf10

📥 Commits

Reviewing files that changed from the base of the PR and between 8a8ea4f and 02f9145.

📒 Files selected for processing (3)
  • fastembed/late_interaction/late_interaction_text_embedding.py
  • fastembed/late_interaction/lateon.py
  • tests/test_late_interaction_embeddings.py

Development

Successfully merging this pull request may close these issues.

[Model]: LateOn

1 participant