Pull requests: quic/efficient-transformers
Pull requests list
[WIP] Fix for acc issue in Qwen3 VL moe
#1010
opened May 25, 2026 by
tv-karthikeya
Contributor
feat(skip-softmax): Add skip-softmax support for KV-blocked attention
enhancement
New feature or request
Change CCL input precision from int8 to int64 to align with compiler
#1003
opened May 22, 2026 by
vjanfaza
Contributor
Qwen image with magcache
Diffusers
Use for PR related to diffusers in efficient-transformers.
performance
#998
opened May 20, 2026 by
quic-amitraj
Contributor
Repeatkv transform
1.22
Release 1.22 candidate
#997
opened May 19, 2026 by
quic-dhirajku
Contributor
Draft
Add user_vision_size in VLM's get_specializations for chunked embedding in vLLM v1
1.22
Release 1.22 candidate
#996
opened May 18, 2026 by
quic-xiyushi
Contributor
Magcache support for Diffuser
Diffusers
Use for PR related to diffusers in efficient-transformers.
performance
#993
opened May 18, 2026 by
quic-amitraj
Contributor
[CI-Nightly]: Validating the nightly Result with Previous Result
#992
opened May 18, 2026 by
abukhoy
Contributor
Add GLM4-MOE Mode w/Disaggregated Prefill and Decode Support
#988
opened May 14, 2026 by
vbaddi
Contributor
support multiple TLM decode specializations via num_speculative_tokens list
1.22
Release 1.22 candidate
#984
opened May 13, 2026 by
eplatero97
Contributor
4 tasks done
Adding PagedAttention support for CausalLM models
enhancement
New feature or request
#982
opened May 13, 2026 by
vaibverm
Contributor
Fix for fp16/bf16 export & compile in qwen3vl & qwen3vlmoe models
1.22
Release 1.22 candidate
#980
opened May 12, 2026 by
qcdipankar
Contributor
Diffusers CI conditional check
Diffusers
Use for PR related to diffusers in efficient-transformers.
#978
opened May 11, 2026 by
quic-amitraj
Contributor
Added support of QEffDiffusionPipeline for Diffusers
Diffusers
Use for PR related to diffusers in efficient-transformers.
#977
opened May 11, 2026 by
quic-amitraj
Contributor
Enable ffn blocking for dense models with automatic blocking configurator
enhancement
New feature or request
qeff.blocking
#958
opened May 4, 2026 by
kdulla
Contributor
fix: improve weight offloading to handle plain tensor attrs and use to_empty()
1.22
Release 1.22 candidate
#952
opened Apr 28, 2026 by
quic-rishinr
Contributor