
[None][fix] Fix config sharing issue for Qwen3-VL #14766

Open
2ez4bz wants to merge 1 commit into NVIDIA:main from 2ez4bz:dev-qwen-config-fix

Conversation

@2ez4bz
Collaborator

@2ez4bz 2ez4bz commented May 29, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed configuration state mutation in multimodal model initialization where the vision encoder would share configuration mutations with the primary model instance.
  • Tests

    • Added test to verify that model initialization preserves caller-provided configuration objects and their properties.


Description

This commit fixes an issue where the KV cache config of
the Qwen3 VLM's end model was overwritten by changes that
its vision model made to the shared config object. Among other
things, this affected the KV cache dtype used by the KV cache manager.
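
To make the failure mode concrete, here is a minimal sketch of the aliasing problem. It does not use the real TensorRT-LLM classes: `QuantConfig`, `ModelConfig`, `VisionEncoder`, and `VLM` below are hypothetical stand-ins, and the `"FP8"` value is only an illustrative placeholder. The one detail taken from this PR is that the vision model reassigns `quant_config` on the config object it was handed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QuantConfig:
    # Hypothetical stand-in; only the field name mirrors the one named in this PR.
    kv_cache_quant_algo: Optional[str] = None


@dataclass
class ModelConfig:
    # Hypothetical stand-in for the model config passed in by the caller.
    quant_config: QuantConfig = field(default_factory=QuantConfig)


class VisionEncoder:
    def __init__(self, model_config: ModelConfig):
        self.model_config = model_config
        # In-place reset: this mutates whichever object the caller passed in.
        self.model_config.quant_config = QuantConfig()


class VLM:
    def __init__(self, model_config: ModelConfig):
        self.model_config = model_config
        # Buggy pattern: the encoder receives the very same config object.
        self.mm_encoder = VisionEncoder(model_config)


caller_config = ModelConfig(quant_config=QuantConfig(kv_cache_quant_algo="FP8"))
model = VLM(caller_config)

# The caller's KV cache setting has been silently reset by the encoder.
leaked = caller_config.quant_config.kv_cache_quant_algo
print(leaked)  # None
```

Anything that later reads the KV cache setting from the caller's config sees the reset value rather than the one originally requested.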

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
@2ez4bz 2ez4bz requested review from a team as code owners May 29, 2026 23:24
@2ez4bz
Collaborator Author

2ez4bz commented May 29, 2026

/bot run

@coderabbitai
Contributor

coderabbitai Bot commented May 29, 2026

📝 Walkthrough


This PR isolates the Qwen3VL vision encoder's configuration by deep-copying the model config during initialization, preventing the encoder from mutating the caller's quantization settings. A test validates that the caller's QuantConfig object and its properties remain unchanged.

Changes

Vision Encoder Config Isolation

| Layer / File(s) | Summary |
| --- | --- |
| Vision encoder deep copy initialization<br>tensorrt_llm/_torch/models/modeling_qwen3vl.py | Qwen3VisionModelBase initialization now receives copy.deepcopy(model_config) instead of the original model_config, preventing configuration mutations from affecting the caller's state. |
| Quantization config preservation test<br>tests/unittest/_torch/modeling/test_modeling_qwen3vl.py, tests/integration/test_lists/test-db/l0_l40s.yml | Imports quantization types, adds test_qwen3vl_init_preserves_caller_quant_config to assert the caller's QuantConfig object identity and kv_cache_quant_algo remain unchanged while the vision encoder uses an independent copy with quantization disabled, and registers the test in the L40S pre-merge suite. |
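
The checks described for the new test can be sketched roughly as follows. This is not the actual test from the PR: the classes are hypothetical stand-ins rather than the TensorRT-LLM types, the function name `check_init_preserves_caller_quant_config` is invented for illustration, and `"FP8"` is a placeholder value. The sketch only assumes what the summary states, namely that the encoder is built from `copy.deepcopy(model_config)` and resets `quant_config` on its own copy.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QuantConfig:
    # Hypothetical stand-in; only the field name mirrors the one named in this PR.
    kv_cache_quant_algo: Optional[str] = None


@dataclass
class ModelConfig:
    quant_config: QuantConfig = field(default_factory=QuantConfig)


class VisionEncoder:
    def __init__(self, model_config: ModelConfig):
        self.model_config = model_config
        # Quantization is disabled for the encoder by resetting its config.
        self.model_config.quant_config = QuantConfig()


class VLM:
    def __init__(self, model_config: ModelConfig):
        self.model_config = model_config
        # Fixed pattern: the encoder gets an independent deep copy.
        self.mm_encoder = VisionEncoder(copy.deepcopy(model_config))


def check_init_preserves_caller_quant_config() -> None:
    quant_config = QuantConfig(kv_cache_quant_algo="FP8")
    caller_config = ModelConfig(quant_config=quant_config)

    model = VLM(caller_config)

    # The caller still holds the same QuantConfig object, with the same value.
    assert caller_config.quant_config is quant_config
    assert quant_config.kv_cache_quant_algo == "FP8"

    # The encoder works on its own copy, with quantization reset.
    encoder_config = model.mm_encoder.model_config
    assert encoder_config is not caller_config
    assert encoder_config.quant_config.kv_cache_quant_algo is None


check_init_preserves_caller_quant_config()
```

Asserting object identity (`is`) in addition to the field value catches the case where the caller's `quant_config` is replaced by a different object that happens to compare equal.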

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • NVIDIA/TensorRT-LLM#12851: Both PRs modify Qwen3VLModelBase.__init__ in modeling_qwen3vl.py to prevent the Qwen3-VL vision encoder from inheriting/mutating the caller's quantization (quant_config/KV quant algo) state.

Suggested reviewers

  • moraxu
  • yechank-nvidia
  • StanleySun639
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: fixing a config sharing issue for Qwen3-VL, which directly relates to the changeset's core fix of preventing shared mutations between model config and vision encoder. |
| Description check | ✅ Passed | The description explains the issue (KV cache config overwrite affecting dtype) and mentions it's fixed, but does not explicitly list which tests safeguard this change despite the template requiring it. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_qwen3vl.py (1)

1061-1064: ⚡ Quick win

Fix correctly isolates the vision encoder config.

copy.deepcopy(model_config) at lines 1061-1064 prevents Qwen3VisionModelBase.__init__ from clobbering the caller’s quant_config via its in-place assignment (self.model_config.quant_config = QuantConfig() around line 881). Repo search for Qwen3VisionModelBase( shows this is the only construction site, so the defensive copy won’t miss other callers.

Optional: for stronger misuse-proofing, encapsulate the “reset QuantConfig for the vision encoder” inside Qwen3VisionModelBase itself rather than relying on each call site to deep-copy.
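
One way the optional suggestion above could look is sketched below, with the copy made inside the encoder so that call sites cannot forget it. The classes are hypothetical stand-ins, not the real `Qwen3VisionModelBase`, and the sketch assumes the config is cheap and safe to deep-copy, which may not hold for every real config object.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QuantConfig:
    # Hypothetical stand-in; only the field name mirrors the one named in this PR.
    kv_cache_quant_algo: Optional[str] = None


@dataclass
class ModelConfig:
    quant_config: QuantConfig = field(default_factory=QuantConfig)


class VisionEncoder:
    def __init__(self, model_config: ModelConfig):
        # Copy first, then reset: the caller's object is never touched,
        # regardless of how the call site passes the config.
        self.model_config = copy.deepcopy(model_config)
        self.model_config.quant_config = QuantConfig()


caller_config = ModelConfig(quant_config=QuantConfig(kv_cache_quant_algo="FP8"))
encoder = VisionEncoder(caller_config)  # no deepcopy needed at the call site

print(caller_config.quant_config.kv_cache_quant_algo)         # FP8
print(encoder.model_config.quant_config.kv_cache_quant_algo)  # None
```

The trade-off is that every construction pays for a copy even when the caller has already isolated the config, so doing both the call-site copy and the in-class copy would be redundant.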

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/models/modeling_qwen3vl.py` around lines 1061 - 1064, The
vision encoder construction must not clobber the caller's
model_config.quant_config; ensure Qwen3VisionModelBase is instantiated with an
isolated config by passing a deep copy of model_config (e.g.,
copy.deepcopy(model_config)) when creating mm_encoder inside the _is_disagg()
branch, and keep the existing conditional use of
kwargs.get("vision_model_class", None); alternatively (recommended for
robustness) move the quant_config reset into Qwen3VisionModelBase.__init__ so
the class itself does self.model_config.quant_config = QuantConfig() on its own
copy to prevent any caller-side mutation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d626ebfb-4e69-46ab-ad88-aeb0631f39f2

📥 Commits

Reviewing files that changed from the base of the PR and between 74d7c3a and dc638eb.

📒 Files selected for processing (3)
  • tensorrt_llm/_torch/models/modeling_qwen3vl.py
  • tests/integration/test_lists/test-db/l0_l40s.yml
  • tests/unittest/_torch/modeling/test_modeling_qwen3vl.py

@tensorrt-cicd
Collaborator

PR_Github #51124 [ run ] triggered by Bot. Commit: dc638eb Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #51124 [ run ] completed with state SUCCESS. Commit: dc638eb
/LLM/main/L0_MergeRequest_PR pipeline #40562 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@2ez4bz
Collaborator Author

2ez4bz commented May 30, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #51166 [ run ] triggered by Bot. Commit: dc638eb Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #51166 [ run ] completed with state SUCCESS. Commit: dc638eb
/LLM/main/L0_MergeRequest_PR pipeline #40599 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation


3 participants