Pull requests: NVIDIA/TensorRT-LLM
[TRTLLM-12288][feat] Support Nemotron-H nvfp4 ckpt on Hopper
#14775
opened May 30, 2026 by
JadoTu
Collaborator
1 task done
[https://nvbugs/6240561][fix] Autodeploy fix the deepseek accuracy drop
#14774
opened May 30, 2026 by
nvchenghaoz
Collaborator
1 task done
[None][feat] Wan 2.2: fuse NVFP4 quantization with preceding LayerNorm/AdaLN and GELU-tanh
#14773
opened May 30, 2026 by
anikaj-eng
Collaborator
[None][test] Waive 1 failed cases for main in QA CI
#14771
opened May 30, 2026 by
tensorrt-cicd
Collaborator
•
Draft
[TRTLLM-13077][feat] Decompose post_load_weights()
#14770
opened May 30, 2026 by
chienchunhung
Collaborator
•
Draft
1 task done
[https://nvbugs/6104831][fix] Enforce request and buffer index lifecycle integrity
#14768
opened May 30, 2026 by
chienchunhung
Collaborator
•
Draft
[None][fix] Fix config sharing issue for Qwen3-VL
#14766
opened May 29, 2026 by
2ez4bz
Collaborator
1 task done
[TRTLLM-12507][feat] Add MoE LoRA layout and validation helpers
#14764
opened May 29, 2026 by
brb-nv
Collaborator
1 task done
[None][infra] Generate json with cmake fetched contents in build stag…
#14761
opened May 29, 2026 by
yuanjingx87
Collaborator
1 task
[None][chore] increase test shards
#14760
opened May 29, 2026 by
tburt-nv
Collaborator
1 task done
[None][feat] Add AutoDeploy support for StepFun Step-3.7-Flash
#14759
opened May 29, 2026 by
bmarimuthu-nv
Collaborator
qwen: Add support for Qwen3-Embedding models in encode_only mode
#14758
opened May 29, 2026 by
Priyanshu31102003
[https://nvbugs/6165866][infra] Waive 1 failed cases for main in pre-merge 40081 - Fix prefix
#14756
opened May 29, 2026 by
taylor-yb-lee
Collaborator
1 task done
[None][test] Waive 2 failed cases for main in QA CI
#14755
opened May 29, 2026 by
tensorrt-cicd
Collaborator
•
Draft
[None][test] Waive 1 failed cases for main in QA CI
#14753
opened May 29, 2026 by
tensorrt-cicd
Collaborator
•
Draft
[None][test] Waive 1 failed cases for main in QA CI
#14752
opened May 29, 2026 by
tensorrt-cicd
Collaborator
•
Draft
[None][feat] Support DeepSeek-V4 model
api-compatible
Accepted LLM API contract change that is backwards-compatible
[None][fix] Pipe stderr separately in subprocess calls to improve error reporting in Allure (#14750)
#14750
opened May 29, 2026 by
yufeiwu-nv
Collaborator
1 task done
[None][test] Remove duplicate test cases in llm_perf_core file
#14749
opened May 29, 2026 by
yufeiwu-nv
Collaborator
1 task done
[None][feat] Add KV cache prefetch
api-compatible
#14748
opened May 29, 2026 by
lowsfer
Member
1 task done
[None][test] Waive 3 failed cases for main in QA CI
#14747
opened May 29, 2026 by
tensorrt-cicd
Collaborator
•
Draft
[None][fix] Make disagg timeout cancellation rank-consistent
deepseek-v4
#14746
opened May 29, 2026 by
Shixiaowei02
Collaborator
•
Draft
1 task done
[TRTLLM-12669][refactor] Remove allow_advanced_sampling and capture dual CUDA graphs
#14745
opened May 29, 2026 by
zhaoyangwang-nvidia
Collaborator
1 task done