Pull requests: quic/efficient-transformers

Pull requests list

[WIP] Fix for acc issue in Qwen3 VL moe
#1010 opened May 25, 2026 by tv-karthikeya Contributor
feat(skip-softmax): Add skip-softmax support for KV-blocked attention
#1009 opened May 25, 2026 by vbaddi Contributor Draft
Labels: enhancement
[Docs]: release/v1.21.6 doc update
#1007 opened May 25, 2026 by abukhoy Contributor
Change CCL input precision from int8 to int64 to align with compiler
#1003 opened May 22, 2026 by vjanfaza Contributor
Qwen image with magcache
#998 opened May 20, 2026 by quic-amitraj Contributor
Labels: Diffusers, performance
Repeatkv transform
#997 opened May 19, 2026 by quic-dhirajku Contributor Draft
Labels: 1.22
Add user_vision_size in VLM's get_specializations for chunked embedding in vLLM v1
#996 opened May 18, 2026 by quic-xiyushi Contributor
Labels: 1.22
Dflash: Block Diffusion Speculative Decoding
#995 opened May 18, 2026 by vjanfaza Contributor Draft
Ft_v1 QAIC-profiler hotfix
#994 opened May 18, 2026 by quic-akuruvil Contributor
Magcache support for Diffuser
#993 opened May 18, 2026 by quic-amitraj Contributor
Labels: Diffusers, performance
[CI-Nightly]: Validating the nightly Result with Previous Result
#992 opened May 18, 2026 by abukhoy Contributor
Feat/enable glm4 moe
#991 opened May 15, 2026 by ochougul Contributor
Add GLM4-MOE Mode w/Disaggregated Prefill and Decode Support
#988 opened May 14, 2026 by vbaddi Contributor
Added head parallel kv blocking
#986 opened May 14, 2026 by kdulla Contributor Draft
Labels: enhancement, qeff.blocking
support multiple TLM decode specializations via num_speculative_tokens list
#984 opened May 13, 2026 by eplatero97 Contributor
Labels: 1.22
4 tasks done
Adding PagedAttention support for CausalLM models
#982 opened May 13, 2026 by vaibverm Contributor
Labels: enhancement
Fix for fp16/bf16 export & compile in qwen3vl & qwen3vlmoe models
#980 opened May 12, 2026 by qcdipankar Contributor
Labels: 1.22
Diffusers CI conditional check
#978 opened May 11, 2026 by quic-amitraj Contributor
Labels: Diffusers
Added support of QEffDiffusionPipeline for Diffusers
#977 opened May 11, 2026 by quic-amitraj Contributor
Labels: Diffusers
Layerwise int4 kimi
#973 opened May 7, 2026 by abhishek-singh591 Contributor Draft
TF and other package update
#967 opened May 6, 2026 by quic-hemagnih Contributor Draft
Gemma4
#966 opened May 6, 2026 by tchawada Contributor
Labels: 1.22
MLA Int4 Changes
#962 opened May 5, 2026 by quic-mamta Contributor Draft
Enable ffn blocking for dense models with automatic blocking configurator
#958 opened May 4, 2026 by kdulla Contributor
Labels: enhancement, qeff.blocking
fix: improve weight offloading to handle plain tensor attrs and use to_empty()
#952 opened Apr 28, 2026 by quic-rishinr Contributor
Labels: 1.22