chore(deps): update loader dependencies major (major) #194
Open
dreadnode-renovate-bot[bot] wants to merge 1 commit into main
| datasource | package      | from   | to    |
| ---------- | ------------ | ------ | ----- |
| pypi       | psutil       | 6.1.1  | 7.2.2 |
| pypi       | transformers | 4.57.6 | 5.7.0 |
This PR contains the following updates:
| Package      | Change              |
| ------------ | ------------------- |
| psutil       | ==6.1.1 → ==7.2.2   |
| transformers | ==4.57.6 → ==5.7.0  |

Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.
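
Both pins above cross a major version boundary, which is why this PR is labelled "major". A minimal sketch of how a reviewer might confirm that from the pins alone is shown here; the helper names (`parse_pin`, `is_major_bump`, `major_bumps`) are hypothetical and not part of Renovate or of this repository.

```python
import re


def parse_pin(pin: str) -> tuple[int, ...]:
    """Turn an exact pin such as '==6.1.1' into a tuple of ints, e.g. (6, 1, 1)."""
    match = re.fullmatch(r"==(\d+(?:\.\d+)*)", pin.strip())
    if match is None:
        raise ValueError(f"not an exact pin: {pin!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_major_bump(old_pin: str, new_pin: str) -> bool:
    """True when the first version component increases."""
    return parse_pin(new_pin)[0] > parse_pin(old_pin)[0]


# Pins copied from the update table in this PR.
UPDATES = {
    "psutil": ("==6.1.1", "==7.2.2"),
    "transformers": ("==4.57.6", "==5.7.0"),
}


def major_bumps(updates: dict[str, tuple[str, str]]) -> list[str]:
    """Names of packages whose update crosses a major version."""
    return sorted(name for name, (old, new) in updates.items() if is_major_bump(old, new))


if __name__ == "__main__":
    print(major_bumps(UPDATES))
```

Since both packages are flagged, the release notes below are worth reading for breaking changes before merging.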
Release Notes
giampaolo/psutil (psutil)
v7.2.2 (Compare Source)
v7.2.1 (Compare Source)
v7.2.0 (Compare Source)
v7.1.3 (Compare Source)
v7.1.2 (Compare Source)
v7.1.1 (Compare Source)
v7.1.0 (Compare Source)
v7.0.0 (Compare Source)
huggingface/transformers (transformers)
v5.7.0 (Compare Source)
Release v5.7.0
New Model additions
Laguna
Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.
Links: Documentation
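
To make the router description above concrete, here is a minimal pure-Python sketch of sigmoid routing with a per-expert bias. It is not Laguna's implementation. It assumes, following the usual auxiliary-loss-free convention, that the bias only influences which experts are selected, while the mixing weights come from the unbiased sigmoid scores renormalized over the chosen experts. The function name `route` and its signature are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def route(gate_logits: list[float], expert_bias: list[float], top_k: int) -> dict[int, float]:
    """Pick top_k experts for one token and return {expert_index: weight}.

    Assumption: the learned bias is added only for expert selection
    (load balancing); weights use the unbiased sigmoid scores.
    """
    scores = [sigmoid(z) for z in gate_logits]          # element-wise sigmoid
    biased = [s + b for s, b in zip(scores, expert_bias)]
    chosen = sorted(range(len(scores)), key=lambda i: biased[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}


if __name__ == "__main__":
    logits = [2.0, 0.0, -1.0, 1.0]
    # Without bias, experts 0 and 3 win; a bias can steer load elsewhere.
    print(route(logits, [0.0, 0.0, 0.0, 0.0], top_k=2))
    print(route(logits, [-1.0, 0.0, 0.9, 0.0], top_k=2))
```

In this scheme the bias would be nudged up for underused experts and down for overloaded ones during training, which balances load without an auxiliary loss term.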
DEIMv2
DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.
Links: Documentation | Paper
Attention
Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.
Tokenizers
A bug in AutoTokenizer caused the wrong tokenizer class to be initialized, which led to regressions in models like DeepSeek R1. This release fixes it.
Generation
Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about `num_return_sequences` and other supposedly unsupported features, which fired even when the functionality worked correctly. Documentation for per-request sampling parameters was also added.
Kernels
Improved kernel support by fixing configuration reading and error handling for FP8 checkpoints (e.g., Qwen3.5-35B-A3B-FP8), enabling custom expert kernels registered from the HF Hub to be properly loaded, and resolving an incompatibility that prevented Gemma3n and Gemma4 from using the rotary kernel.
Bugfixes and improvements
- `x_clip`: 8 failed test cases (#45394) by @kaixuanliu in #45394
- `NameError: PeftConfigLike` triggered by `PreTrainedModel.__init_subclass__` (#45658) by @qgallouedec in #45658
- `clean_up_tokenization` for BPE tokenizers in `PreTrainedTokenizerFast` (#44915) by @maxsloef-goodfire in #44915
- `supports_gradient_checkpointing` to `NemotronHPreTrainedModel` (#45625) by @sergiopaniego in #45625
- `problem_type="single_label_classification"` with `num_labels=1` (#45611) by @gaurav0107 in #45611
- `AttributeError` on `s_aux=None` in `flash_attention_forward` (#45589) by @jamesbraza in #45589
Significant community contributions
The following contributors have made significant changes to the library over the last release:
v5.6.2: Patch release v5.6.2 (Compare Source)
Patch release v5.6.2
Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this release 🫡
Full Changelog: huggingface/transformers@v5.6.1...v5.6.2
v5.6.1: Patch release v5.6.1 (Compare Source)
Patch release v5.6.1
Flash attention path was broken! Sorry everyone for this one 🤗
v5.6.0 (Compare Source)
Release v5.6.0
New Model additions
OpenAI Privacy Filter
OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a model that they can run on-premises that is fast, context-aware, and tunable. The model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi procedure, predicting probability distributions over 8 privacy-related output categories for each input token.
Links: Documentation
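
The "constrained Viterbi" step described above can be illustrated with a small sketch. It is not the model's actual decoder. It assumes a BIO tagging scheme in which an `I-X` label may only follow `B-X` or `I-X`, and the `EMAIL` category used in the usage example is hypothetical. The input is a per-token log-probability table, such as a token classifier would produce in one forward pass.

```python
import math

NEG_INF = float("-inf")


def allowed(prev, cur):
    """BIO constraint (an assumption): I-X must continue a B-X or I-X span."""
    if cur.startswith("I-"):
        return prev is not None and prev[0] in "BI" and prev[2:] == cur[2:]
    return True


def constrained_viterbi(log_probs, labels):
    """Return the highest-scoring label path that respects `allowed`.

    log_probs: one dict per token mapping label -> log-probability.
    """
    if not log_probs:
        return []
    # Scores for the first token; spans cannot start with an I- label.
    best = {lab: (log_probs[0][lab] if allowed(None, lab) else NEG_INF) for lab in labels}
    back = []
    for step in log_probs[1:]:
        new_best, pointers = {}, {}
        for cur in labels:
            score, prev = max(
                ((best[p], p) for p in labels if allowed(p, cur)),
                key=lambda cand: cand[0],
            )
            new_best[cur] = score + step[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Walk the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    labels = ["O", "B-EMAIL", "I-EMAIL"]
    probs = [
        {"O": 0.5, "B-EMAIL": 0.4, "I-EMAIL": 0.1},
        {"O": 0.1, "B-EMAIL": 0.3, "I-EMAIL": 0.6},
        {"O": 0.2, "B-EMAIL": 0.1, "I-EMAIL": 0.7},
    ]
    log_probs = [{k: math.log(v) for k, v in row.items()} for row in probs]
    print(constrained_viterbi(log_probs, labels))
```

Picking the most probable label per token independently would give `O, I-EMAIL, I-EMAIL` here, which is not a coherent span; the constrained search returns `B-EMAIL, I-EMAIL, I-EMAIL` instead.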
- [Privacy Filter] Add model (#45580) by @vasqu in #45580
QianfanOCR
Qianfan-OCR is a 4B-parameter end-to-end document intelligence model developed by Baidu that performs direct image-to-text conversion without traditional multi-stage OCR pipelines. It supports a broad range of prompt-driven tasks including structured document parsing, table extraction, chart understanding, document question answering, and key information extraction all within one unified model. The model features a unique "Layout-as-Thought" capability that generates structured layout representations before producing final outputs, making it particularly effective for complex documents with mixed element types.
Links: Documentation | Paper
SAM3-LiteText
SAM3-LiteText is a lightweight variant of SAM3 that replaces the heavy SAM3 text encoder (353M parameters) with a compact MobileCLIP-based text encoder optimized through knowledge distillation, while keeping the SAM3 ViT-H image encoder intact. This reduces text encoder parameters by up to 88% while maintaining segmentation performance comparable to the original model. The model enables efficient vision-language segmentation by addressing the redundancy found in text prompting for segmentation tasks.
Links: Documentation | Paper
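
As a quick sanity check on the figures quoted above, the arithmetic below works out what an 88% reduction of a 353M-parameter text encoder leaves. Because the notes say "up to 88%", the result is a lower bound for the smallest variant, not a size reported by the authors; the helper name is hypothetical.

```python
def remaining_params(original_millions: float, reduction_fraction: float) -> float:
    """Parameters (in millions) left after removing the given fraction."""
    return original_millions * (1.0 - reduction_fraction)


if __name__ == "__main__":
    # 353M text encoder, reduced by up to 88% (figures from the release notes).
    print(f"~{remaining_params(353.0, 0.88):.1f}M parameters")
```

That puts the most compact text encoder at roughly 42M parameters under these figures.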
SLANet
SLANet and SLANet_plus are lightweight models designed for table structure recognition, focusing on accurately recognizing table structures in documents and natural scenes. The model improves accuracy and inference speed by adopting a CPU-friendly lightweight backbone network PP-LCNet, a high-low-level feature fusion module CSP-PAN, and a feature decoding module SLA Head that aligns structural and positional information. SLANet was developed by Baidu PaddlePaddle Vision Team as part of their table structure recognition solutions.
Links: Documentation
Breaking changes
The internal `rotary_fn` is no longer registered as a hidden kernel function, so any code referencing `self.rotary_fn(...)` within an Attention module will break and must be updated to call the function directly instead.
- [Kernels] Fix kernel function registration (#45420) by @vasqu
Serve
The `transformers serve` command received several enhancements, including a new `/v1/completions` endpoint for legacy text completion, multimodal support for audio and video inputs, improved tool-calling via `parse_response`, proper forwarding of `tool_calls`/`tool_call_id` fields, a 400 error on model mismatch when the server is pinned to a specific model, and fixes for the response API. Documentation was also updated to cover new serving options such as `--compile` and `--model-timeout`.
- `transformers serve` (#44558) by @rain-1 in #44558
- `transformers serve` is pinned (#45443) by @qgallouedec in #45443
- `parse_response` (#45485) by @SunMarc in #45485
- `tool_calls`/`tool_call_id` in processor inputs (#45418) by @qgallouedec in #45418
Vision
Several vision-related bug fixes were applied in this release, including correcting Qwen2.5-VL temporal RoPE scaling for still images, fixing missing/mismatched image processor backends for Emu3 and BLIP, resolving modular image processor class duplication, and preventing accelerate from incorrectly splitting vision encoders in PeVideo/PeAudioVideo models. Image loading performance was also improved by leveraging torchvision's native `decode_image` in the torchvision backend, yielding up to ~17% speedup over PIL-based loading.
- `decode_image` to load images in the torchvision backend (#45195) by @yonigozlan in #45195
Parallelization
Fixed several bugs affecting distributed training, including silently wrong results or NaN loss with Expert Parallelism, NaN weights on non-rank-0 FSDP processes, and a resize failure in PP-DocLayoutV3; additionally added support for loading adapters with Tensor Parallelism, added MoE to the Gemma4 TP plan, and published documentation for TP training.
Tokenization
Fixed a docstring typo in streamer classes, resolved a Kimi-K2.5 tokenizer regression and a `_patch_mistral_regex` `AttributeError`, and patched a streaming generation crash for `Qwen3VLProcessor` caused by incorrect `_tokenizer` attribute access. Additional housekeeping included moving the GPT-SW3 instruct tokenizer to an internal testing repo and fixing a global state leak in the tokenizer registry during tests.
- [Tokenizers] Move gpt sw3 tokenizer out (#45404) by @vasqu in #45404
- `test_processors` (#45318) by @tarekziade in #45318
Cache
Cache handling was improved for Gemma4 and Gemma3n models by dissociating KV state sharing from the Cache class, ensuring KV states are always shared regardless of whether a Cache is used. Additionally, the image cache for Paddle models was updated to align with the latest API.
Audio
Audio models gained vLLM compatibility through targeted fixes across several model implementations. Reliability improvements were also made, including exponential back-off retries for audio file downloads, a crash fix in the `text-to-speech` pipeline when generation configs contain `None` values, and corrected test failures for Kyutai Speech-To-Text.
- `text-to-speech` pipeline crash when generation config contains `None` values (#45107) by @jiqing-feng in #45107
Bugfixes and improvements
- [Privacy Filter] Add model (#45580) by @vasqu in #45580
- `pass` (inherits from DSV3 MoE) (#45572) by @casinca in #45572
- `DeepseekV3MoE` and remote official implementation (#45441) by @casinca in #45441
- `prepare_decoder_input_ids_from_labels` (#45516) by @Tokarak in #45516
- `TextToAudioPipeline` missing `<bos>` token (#45525) by @jiqing-feng in #45525
- [Conversion Mapping] Small fixups (#45483) by @vasqu in #45483
- `get_image_size` method (#45461) by @JiauZhang in #45461
- [fix] Always early return for non-Mistral models in _patch_mistral_regex (#45444) by @tomaarsen in #45444
- [fix] Make Qwen2_5OmniProcessor warning a lot less noisy via warning_once (#45455) by @tomaarsen in #45455
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.
This PR has been generated by Mend Renovate.