Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[None][fix] Fix 'max_batch_size' conflict in AD dashboard script
#12967
opened Apr 12, 2026 by
tcherckez-nvidia
Collaborator
1 task done
[None][feat] Fuse GDN elementwise ops and split/transpose kernels
#12966
opened Apr 12, 2026 by
Wong4j
Collaborator
1 task done
[https://nvbugs/6050489][fix] JUST TEST, DO NOW REVIEW
#12964
opened Apr 12, 2026 by
bo-nv
Collaborator
1 task
[None][feat] AutoDeploy: Onboard MiniMaxAI/MiniMax-M2.7 custom model
#12963
opened Apr 12, 2026 by
suyoggupta
Collaborator
3 tasks done
[None][perf] Optimize EAGLE3 dynamic tree: position offsets stride, packed mask layout, slot storage
#12962
opened Apr 12, 2026 by
sunnyqgg
Collaborator
4 tasks done
[https://nvbugs/6026676][fix] Only waive the tests for H20 so that H100 still covered
#12961
opened Apr 12, 2026 by
dongfengy
Collaborator
1 task done
[None][feat] Remove LoRA weights from request broadcast
#12959
opened Apr 12, 2026 by
achartier
Collaborator
1 task done
[None][feat] Add SM100f custom mask FMHA cubins for KeepsAbForGen (bf16/fp16 update + FP8 E4M3/E2M1)
#12958
opened Apr 12, 2026 by
sunnyqgg
Collaborator
2 tasks done
[None][fix] Fix GIL race in hostfunc nanobind bindings causing intermittent segfault
Community want to contribute
PRs initiated from Community
#12957
opened Apr 12, 2026 by
ssam18
Contributor
[None][fix] Fall back to local cache when loading tokenizer for gated models
Community want to contribute
PRs initiated from Community
#12956
opened Apr 12, 2026 by
ssam18
Contributor
[https://nvbugs/5973199][fix] Unwaiving accuracy tests
#12952
opened Apr 11, 2026 by
greg-kwasniewski1
Collaborator
1 task done
[https://nvbugs/5838178][fix] Fix failing lora test for Llama
#12950
opened Apr 11, 2026 by
brb-nv
Collaborator
1 task done
[#12784][feat] AutoDeploy: Fuse QKV projections & Use DeepGemm
#12946
opened Apr 11, 2026 by
taylor-yb-lee
Collaborator
Draft
1 task
[https://nvbugs/6066969][fix] Store context blocks before request termination
#12945
opened Apr 11, 2026 by
Tabrizian
Member
1 task done
[None][fix] Fix chunked prefill crash for VLMs with non-contiguous multimodal tokens
#12944
opened Apr 11, 2026 by
venkywonka
Collaborator
Draft
2 of 3 tasks
[None][feat] Add Attention2D sequence parallelism for visual-gen models
VisualGen
#12943
opened Apr 11, 2026 by
venmugil
1 task done
[https://nvbugs/6059036][fix] AutoDeploy fix registry accuracy tests
#12942
opened Apr 10, 2026 by
nvchenghaoz
Collaborator
Draft: [None][fix] fix to remove unwanted warnings about store context blocks
#12941
opened Apr 10, 2026 by
pcastonguay
Collaborator
Draft
1 task
[TRTLLM-11485][feat] Feature rework: Add SageAttention refreshed kernels (attentionOp only)
#12937
opened Apr 10, 2026 by
xrq-phys
Collaborator
1 task done
[None][infra] Waive 8 failed cases for main in post-merge 2646
#12934
opened Apr 10, 2026 by
ZhanruiSunCh
Collaborator