chore(deps): update loader dependencies major (major) by dreadnode-renovate-bot[bot] · Pull Request #194 · dreadnode/dyana

dreadnode-renovate-bot · 2026-02-24T20:12:14Z

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package	Change	Age	Confidence
psutil	`==6.1.1` → `==7.2.2`
transformers	`==4.57.6` → `==5.7.0`
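
Both pins cross a major-version boundary, which is why the PR title is tagged as a major update. A minimal sketch of that check, assuming plain `X.Y.Z` release strings; the helper names `parse_pin` and `is_major_bump` are hypothetical and are not part of Renovate:

```python
def parse_pin(version: str) -> tuple[int, ...]:
    """Turn a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_major_bump(old: str, new: str) -> bool:
    """True when the first (major) component increases."""
    return parse_pin(new)[0] > parse_pin(old)[0]


# Pins taken from the update table in this PR.
UPDATES = {
    "psutil": ("6.1.1", "7.2.2"),
    "transformers": ("4.57.6", "5.7.0"),
}

major_bumps = {name: is_major_bump(old, new) for name, (old, new) in UPDATES.items()}
```

Pre-release or local version suffixes would need a real version parser; this sketch only covers the simple numeric pins shown here.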

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

giampaolo/psutil (psutil)

huggingface/transformers (transformers)

`v5.7.0`

Compare Source

Release v5.7.0

New Model additions

Laguna

Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid MoE router with auxiliary-loss-free load balancing that uses element-wise sigmoid of gate logits plus learned per-expert bias for router scoring.

Links: Documentation

Laguna XS.2 implementation (#45673) by @joerowell in #45673
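
The router description can be illustrated with a small, framework-free sketch. It assumes that the learned per-expert bias only influences which experts are selected, and that the mixing weights are the unbiased sigmoid scores renormalized over the selected experts. That split is an assumption made for illustration, not something the release notes state, and `route_token` is a hypothetical helper rather than a Transformers API:

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route_token(gate_logits, expert_bias, top_k):
    """Pick top_k experts for one token and return {expert_index: weight}.

    Assumption: the bias is added only when ranking experts; the weights
    come from the unbiased sigmoid scores, renormalized over the chosen set.
    """
    scores = [sigmoid(z) for z in gate_logits]  # element-wise, no softmax
    biased = [s + b for s, b in zip(scores, expert_bias)]
    chosen = sorted(range(len(scores)), key=lambda i: biased[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}


# With a zero bias the two highest-scoring experts (0 and 3) are chosen.
unbiased = route_token([2.0, 0.0, -1.0, 1.0], [0.0, 0.0, 0.0, 0.0], top_k=2)

# A negative bias on expert 0 and a positive bias on expert 2 shift the load.
balanced = route_token([2.0, 0.0, -1.0, 1.0], [-0.5, 0.0, 0.6, 0.0], top_k=2)
```

In this reading, balancing comes from adjusting the bias values rather than from an extra loss term; how the bias is updated during training is not covered here.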

DEIMv2

DEIMv2 (DETR with Improved Matching v2) is a real-time object detection model that extends DEIM with DINOv3 features and spans eight model sizes from X to Atto for diverse deployment scenarios. It uses a Spatial Tuning Adapter (STA) for larger variants to convert DINOv3's single-scale output into multi-scale features, while ultra-lightweight models employ pruned HGNetv2 backbones. The unified design achieves superior performance-cost trade-offs, with DEIMv2-X reaching 57.8 AP with only 50.3M parameters and DEIMv2-S being the first sub-10M model to exceed 50 AP on COCO.

Links: Documentation | Paper

model: Add DEIMv2 to Transformers (#44339) by @harshaljanjani in #44339

Attention

Several attention-related bugs were fixed across multiple models, including a cross-attention cache type error in T5Gemma2 for long inputs, incorrect cached forward behavior in Qwen3.5's gated-delta-net linear attention, and a crash in GraniteMoeHybrid when no Mamba layers are present. Attention function dispatch was also updated to align with the latest model implementations.

Fix cross-attention cache layer type for T5Gemma2 long inputs (#45540) by @Beichen-Ma in [#45540]
[Qwen3.5] Fix GDN linear attention multi-token cached forward (#45513) by @kashif in [#45513]
Fix GraniteMoeHybrid _update_mamba_mask crash on attention-only models (#45514) by @tianhaocui in [#45514]
Align latest model attention function dispatch (#45598) by @Cyrilvallez in [#45598]

Tokenizers

There was a bug in AutoTokenizer that caused the wrong tokenizer class to be initialized. This caused regressions in models like DeepSeek R1.

change got reverted (#45680) by @itazap in [#45680]

Generation

Continuous batching generation received several fixes and improvements, including correcting KV deduplication and memory estimation for long sequences (16K+), and removing misleading warnings about num_return_sequences and other unsupported features that were incorrectly firing even when functionality worked correctly. Documentation for per-request sampling parameters was also added.

generate: drop stale num_return_sequences warning on continuous batching path (#45582) by @joaquinhuigomez in [#45582]
Remove unnecessary generate warnings (#45619) by @Cyrilvallez in [#45619]
[CB] Changes for long generation (#45530) by @remi-or in [#45530]
[docs] per-request sampling params (#45553) by @stevhliu in [#45553]

Kernels

Improved kernel support by fixing configuration reading and error handling for FP8 checkpoints (e.g., Qwen3.5-35B-A3B-FP8), enabling custom expert kernels registered from the HF Hub to be properly loaded, and resolving an incompatibility that prevented Gemma3n and Gemma4 from using the rotary kernel.

Fix configuration reading and error handling for kernels (#45610) by @hmellor in [#45610]
Allow for registered experts from kernels hub (#45577) by @winglian in [#45577]
Gemma3n and Gemma4 cannot use rotary kernel (#45564) by @Cyrilvallez in [#45564]

Bugfixes and improvements

fixing more typos (#45689) by @vasqu in [#45689]
[docs] cb memory management (#45587) by @stevhliu in [#45587]
[docs] cpu offloading (#45660) by @stevhliu in [#45660]
docs(README_zh-hans): clarify conditions for not using Transformers (#45688) by @GuaiZai233 in [#45688]
fix padding side issue for fast_vlm tests (#45592) by @kaixuanliu in [#45592]
Fix x_clip: 8 failed test cases (#45394) by @kaixuanliu in [#45394]
zero_shot_object_detection ValueError fix for python 3.13 (#45669) by @AnkitAhlawat7742 in [#45669]
Fix pageable H2D copies in Gated DeltaNet PyTorch fallback (#45665) by @ruixiang63 in [#45665]
Fix UnboundLocalError in shard_and_distribute_module for replicated parameters (#45675) by @Abdennacer-Badaoui in [#45675]
[MistralCommonBackend] Soften validation mode and apply_chat_template arguments check (#45628) by @juliendenize in [#45628]
Fix NameError: PeftConfigLike triggered by PreTrainedModel.__init_subclass__ (#45658) by @qgallouedec in [#45658]
chore(typing): added modeling_utils to ty (#45425) by @tarekziade in [#45425]
[gemma4] infer from config instead of hardcoding (#45606) by @eustlb in [#45606]
Update quants tests (#45480) by @SunMarc in [#45480]
🔴🔴🔴 fix: skip clean_up_tokenization for BPE tokenizers in PreTrainedTokenizerFast (#44915) by @maxsloef-goodfire in [#44915]
Fix colmodernvbert tests (#45652) by @Cyrilvallez in [#45652]
[CB] [Major] Add CPU request offloading (#45184) by @remi-or in [#45184]
Fix peft constructors (#45622) by @Cyrilvallez in [#45622]
chore: speedup modular converter (~30%) (#45046) by @tarekziade in [#45046]
Fix whisper return language (#42227) by @FredHaa in [#42227]
Add supports_gradient_checkpointing to NemotronHPreTrainedModel (#45625) by @sergiopaniego in [#45625]
Raise clear error for problem_type="single_label_classification" with num_labels=1 (#45611) by @gaurav0107 in [#45611]
CircleCI with torch 2.11 (#45633) by @ydshieh in [#45633]
chore: bump doc-builder SHA for main doc build workflow (#45631) by @rtrompier in [#45631]
Allow more artifacts to be download in CI (#45629) by @ydshieh in [#45629]
chore(qa): split pipeline and add type checking (#45432) by @tarekziade in [#45432]
Skip failing offloading tests (#45624) by @Cyrilvallez in [#45624]
fix: compute auxiliary losses when denoising is disabled in D-FINE (#45601) by @Abineshabee in [#45601]
qa: bumped mlinter and allow local override (#45585) by @tarekziade in [#45585]
Processing Utils: continue when content is a string (#45605) by @RyanMullins in [#45605]
SonicMoe (#45433) by @IlyasMoutawwakil in [#45433]
fix transformers + torchao nvfp4 serialization (#45573) by @vkuzo in [#45573]
[AMD CI] Fix expectations for Gemma3n (#45602) by @Abdennacer-Badaoui in [#45602]
[docs] multi-turn tool calling (#45554) by @stevhliu in [#45554]
Fix AttributeError on s_aux=None in flash_attention_forward (#45589) by @jamesbraza in [#45589]
do not index past decoded chars with special tokens (#45435) by @itazap in [#45435]
Update dev version (#45583) by @vasqu in [#45583]
Update torchao usage for XPU and CPU (#45560) by @jiqing-feng in [#45560]

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@vasqu
- fixing more typos (#45689)
- Update dev version (#45583)
@joerowell
- Laguna XS.2 implementation (#45673)
@tarekziade
- chore(typing): added modeling_utils to ty (#45425)
- chore: speedup modular converter (~30%) (#45046)
- chore(qa): split pipeline and add type checking (#45432)
- qa: bumped mlinter and allow local override (#45585)
@harshaljanjani
- model: Add DEIMv2 to Transformers (#44339)
@remi-or
- [CB] [Major] Add CPU request offloading (#45184)
- [CB] Changes for long generation (#45530)

`v5.6.2`: Patch release v5.6.2

Compare Source

Patch release v5.6.2

Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should work again with this release 🫡

Fix configuration reading and error handling for kernels (#45610) by @hmellor

Full Changelog: huggingface/transformers@v5.6.1...v5.6.2

`v5.6.1`: Patch release v5.6.1

Compare Source

Patch release v5.6.1

Flash attention path was broken! Sorry everyone for this one 🤗

Fix AttributeError on s_aux=None in flash_attention_forward (#45589) by @jamesbraza

`v5.6.0`

Compare Source

Release v5.6.0

New Model additions

OpenAI Privacy Filter

OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a model that they can run on-premises that is fast, context-aware, and tunable. The model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi procedure, predicting probability distributions over 8 privacy-related output categories for each input token.

Links: Documentation

[Privacy Filter] Add model (#45580) by @vasqu in #45580
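
To make the "constrained Viterbi" step concrete, here is a minimal sketch of decoding under a transition constraint. It uses a hypothetical three-tag BIO scheme rather than the model's 8 categories, and the single constraint (an inside tag cannot follow an outside tag or start a sequence) is an assumption chosen for illustration, not the model's actual rule set:

```python
import math

LABELS = ["O", "B", "I"]  # hypothetical BIO tag set, not the model's categories


def allowed(prev: str, cur: str) -> bool:
    """An inside tag may only continue a span opened by B or I."""
    return not (cur == "I" and prev == "O")


def constrained_viterbi(log_probs):
    """log_probs: one dict per token mapping label -> log-probability."""
    # A sequence may not start inside a span.
    best = {lab: (log_probs[0][lab] if lab != "I" else -math.inf) for lab in LABELS}
    back = []
    for step in log_probs[1:]:
        new_best, pointers = {}, {}
        for cur in LABELS:
            candidates = [(best[prev], prev) for prev in LABELS if allowed(prev, cur)]
            score, prev = max(candidates)
            new_best[cur] = score + step[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace the best-scoring path backwards.
    path = [max(best, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(o, b, i):
    return {"O": math.log(o), "B": math.log(b), "I": math.log(i)}


# Per-token argmax would give O, I, I, which opens a span with an inside tag.
token_log_probs = [logs(0.9, 0.05, 0.05), logs(0.1, 0.3, 0.6), logs(0.1, 0.1, 0.8)]
path = constrained_viterbi(token_log_probs)
```

The decoder trades a slightly lower per-token score on the second token for a sequence whose spans are well formed, which is the point of decoding over the whole sequence instead of token by token.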

QianfanOCR

Qianfan-OCR is a 4B-parameter end-to-end document intelligence model developed by Baidu that performs direct image-to-text conversion without traditional multi-stage OCR pipelines. It supports a broad range of prompt-driven tasks including structured document parsing, table extraction, chart understanding, document question answering, and key information extraction all within one unified model. The model features a unique "Layout-as-Thought" capability that generates structured layout representations before producing final outputs, making it particularly effective for complex documents with mixed element types.

Links: Documentation | Paper

add Qianfan-OCR model definition (#45280) by @marvinzh in #45280

SAM3-LiteText

SAM3-LiteText is a lightweight variant of SAM3 that replaces the heavy SAM3 text encoder (353M parameters) with a compact MobileCLIP-based text encoder optimized through knowledge distillation, while keeping the SAM3 ViT-H image encoder intact. This reduces text encoder parameters by up to 88% while maintaining segmentation performance comparable to the original model. The model enables efficient vision-language segmentation by addressing the redundancy found in text prompting for segmentation tasks.

Links: Documentation | Paper

Add SAM3-LiteText (#44320) by @NielsRogge in #44320

SLANet

SLANet and SLANet_plus are lightweight models designed for table structure recognition, focusing on accurately recognizing table structures in documents and natural scenes. The model improves accuracy and inference speed by adopting a CPU-friendly lightweight backbone network PP-LCNet, a high-low-level feature fusion module CSP-PAN, and a feature decoding module SLA Head that aligns structural and positional information. SLANet was developed by Baidu PaddlePaddle Vision Team as part of their table structure recognition solutions.

Links: Documentation

[Model] Add SLANet Model Support (#45532) by @zhang-prog in #45532

Breaking changes

The internal rotary_fn is no longer registered as a hidden kernel function, so any code referencing self.rotary_fn(...) within an Attention module will break and must be updated to call the function directly instead.

🚨 [Kernels] Fix kernel function registration (#45420) by @vasqu
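
For downstream code, the migration amounts to replacing the attribute call with a direct call to a module-level function. The sketch below is a toy illustration of that pattern only: `apply_rotary` and `ToyAttention` are hypothetical stand-ins, not the Transformers implementation, and the rotation is a simplified interleaved-pair variant:

```python
import math


def apply_rotary(vec, position, base=10000.0):
    """Rotate consecutive (x, y) feature pairs by a position-dependent angle."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        angle = position * base ** (-i / dim)
        cos, sin = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos - y * sin, x * sin + y * cos])
    return out


class ToyAttention:
    def forward(self, query, position):
        # Before: query = self.rotary_fn(query, position)
        # After: call the function directly, since the attribute is gone.
        return apply_rotary(query, position)


rotated = ToyAttention().forward([1.0, 0.0, 0.5, 0.5], position=3)
```

Custom attention modules that subclass or patch library classes are the most likely place to find lingering `self.rotary_fn(...)` calls.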

Serve

The transformers serve command received several enhancements, including a new /v1/completions endpoint for legacy text completion, multimodal support for audio and video inputs, improved tool-calling via parse_response, proper forwarding of tool_calls/tool_call_id fields, a 400 error on model mismatch when the server is pinned to a specific model, and fixes for the response API. Documentation was also updated to cover new serving options such as --compile and --model-timeout.

Add /v1/completions endpoint (OpenAI legacy completions API) to transformers serve (#44558) by @rain-1 in [#44558]
Updated the image cache for Paddle models according to the latest API (#45562) by @zhang-prog in [#45562]
Raise 400 on model mismatch when transformers serve is pinned (#45443) by @qgallouedec in [#45443]
[serve] Update tool call to switch to parse_response (#45485) by @SunMarc in [#45485]
Fix response api support (#45463) by @SunMarc in [#45463]
[serve] Forward tool_calls/tool_call_id in processor inputs (#45418) by @qgallouedec in [#45418]
refactor(qa): extend extras so ty can run on server modules (#45456) by @tarekziade in [#45456]
Multimodal serve support (#45220) by @SunMarc in [#45220]
[docs] transformers serve (#45174) by @stevhliu in [#45174]
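
As a rough illustration of the new `/v1/completions` endpoint, the sketch below builds, but does not send, a request in the shape of the OpenAI legacy completions API (`model`, `prompt`, `max_tokens`). The server address and the model id are placeholders assumed for the example, and `build_completion_request` is a hypothetical helper:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed address of a local `transformers serve`


def build_completion_request(model: str, prompt: str, max_tokens: int = 32):
    """Build (but do not send) a POST to the legacy completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "my-org/my-model" is a placeholder model id, not a real checkpoint.
request = build_completion_request("my-org/my-model", "Hello, world")
# To send it against a running server: urllib.request.urlopen(request)
```

Per the notes above, a server pinned to a specific model answers with a 400 when the `model` field names a different one, so the placeholder id would need to match the served model.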

Vision

Several vision-related bug fixes were applied in this release, including correcting Qwen2.5-VL temporal RoPE scaling for still images, fixing missing/mismatched image processor backends for Emu3 and BLIP, resolving modular image processor class duplication, and preventing accelerate from incorrectly splitting vision encoders in PeVideo/PeAudioVideo models. Image loading performance was also improved by leveraging torchvision's native decode_image in the torchvision backend, yielding up to ~17% speedup over PIL-based loading.

Revert "Fix: modular image processors (#45492)" (#45531) by @tarekziade in [#45531]
Fix: modular image processors (#45492) by @zucchini-nlp in [#45492]
fix: prevent accelerate from splitting vision encoder by setting no… (#43047) by @ in [#43047]
Fix Qwen2.5-VL temporal RoPE scaling applied to still images (#45330) by @Kash6 in [#45330]
Use torchvision decode_image to load images in the torchvision backend (#45195) by @yonigozlan in [#45195]
Fix missing image processors backends (#45165) by @zucchini-nlp in [#45165]

Parallelization

Fixed several bugs affecting distributed training, including silently wrong results or NaN loss with Expert Parallelism, NaN weights on non-rank-0 FSDP processes, and a resize failure in PP-DocLayoutV3; additionally added support for loading adapters with Tensor Parallelism, added MoE to the Gemma4 TP plan, and published documentation for TP training.

Fix EP: RouterParallel shape, tp_plan property, grouped_mm sentinels (#45473) by @AmineDiro in [#45473]
Fix NaN weights on non-rank-0 FSDP processes (#45050) by @albertvillanova in [#45050]
Load adapter with TP (#45155) by @michaelbenayoun in [#45155]
[docs] tp training (#44613) by @stevhliu in [#44613]
Fix resize failure caused by zero-sized masks in PP-DocLayoutV3 (#45281) by @zhang-prog in [#45281]
Add MoE to Gemma4 TP plan (#45219) by @sywangyi in [#45219]

Tokenization

Fixed a docstring typo in streamer classes, resolved a Kimi-K2.5 tokenizer regression and _patch_mistral_regex AttributeError, and patched a streaming generation crash for Qwen3VLProcessor caused by incorrect _tokenizer attribute access. Additional housekeeping included moving the GPT-SW3 instruct tokenizer to an internal testing repo and fixing a global state leak in the tokenizer registry during tests.

[Doc] Fix 'tokenized' -> 'tokenizer' typo in streamer docstrings (#45508) by @avasis-ai in [#45508]
Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex AttributeError (#45359) by @ArthurZucker in [#45359]
fix(serving): resolve rust tokenizer from ProcessorMixin in streaming generation (#45368) by @sharziki in [#45368]
[Tokenizers] Move gpt sw3 tokenizer out (#45404) by @vasqu in [#45404]
fix: leak in tokenizer registry for test_processors (#45318) by @tarekziade in [#45318]

Cache

Cache handling was improved for Gemma4 and Gemma3n models by dissociating KV state sharing from the Cache class, ensuring KV states are always shared regardless of whether a Cache is used. Additionally, the image cache for Paddle models was updated to align with the latest API.

Align gemma3n cache sharing to gemma4 (#45489) by @Cyrilvallez in [#45489]
remove cache file from tree (#45392) by @tarekziade in [#45392]
[gemma4] Dissociate kv states sharing from the Cache (#45312) by @Cyrilvallez in [#45312]

Audio

Audio models gained vLLM compatibility through targeted fixes across several model implementations, while reliability improvements were also made including exponential back-off retries for audio file downloads, a crash fix in the text-to-speech pipeline when generation configs contain None values, and corrected test failures for Kyutai Speech-To-Text.

feat[vLLM × v5]: Add vLLM compatibility for audio models (#45326) by @harshaljanjani in [#45326]
http retries on audio file downloads (#45126) by @tarekziade in [#45126]
fix(testing): Fix Kyutai Speech-To-Text and LongCatFlash test failures on main CI (#44695) by @harshaljanjani in [#44695]
Fix text-to-speech pipeline crash when generation config contains None values (#45107) by @jiqing-feng in [#45107]
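
The exponential back-off idea behind the download retries can be sketched generically. This is not the library's implementation: `retry_with_backoff`, its parameters, and the choice to retry on `OSError` are all assumptions made for illustration:

```python
import time


def retry_with_backoff(fetch, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_download():
    """Stand-in for a download that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network failure")
    return b"audio-bytes"


# Record the delays instead of sleeping so the example runs instantly.
delays = []
result = retry_with_backoff(flaky_download, sleep=delays.append)
```

Injecting `sleep` keeps the helper testable; a production version would typically also add jitter and cap the maximum delay.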

Bugfixes and improvements

[Privacy Filter] Add model (#45580) by @vasqu in [#45580]
Add ForSequenceClassification heads for the OLMo family (#45551) by @earino in [#45551]
Add IndexCache support for GLM5 DSA (#45424) by @louzongzhi in [#45424]
Fix redundant logic in video processing SmolVLM (#45272) by @yonigozlan in [#45272]
Fix typos (#45574) by @vasqu in [#45574]
[Model] Add SLANet Model Support (#45532) by @zhang-prog in [#45532]
refactor(Dots1): drop Dots1MoE override to pass (inherits from DSV3 MoE) (#45572) by @casinca in [#45572]
perf: avoid recomputing rotary_emb for each layer in some Google and ModernBERT models (#45555) by @casinca in [#45555]
Gemma4 training with text-only samples (#45454) by @zucchini-nlp in [#45454]
[nemotron_h] Add support for MLP mixers (#44763) by @xenova in [#44763]
add expert parallelism for gemma-4-26B-A4B-it (#45279) by @sywangyi in [#45279]
Add full GGUF loading support for GPT‑OSS (fixes #43366, supersedes #43757) latest (#45506) by @sirzechs66 in [#45506]
Update Gemma4 weight conversion script (#45328) by @RyanMullins in [#45328]
Move some conversion mappings to PrefixChange (#45567) by @Cyrilvallez in [#45567]
fix table update versions (#45544) by @tarekziade in [#45544]
Add disable_mmap kwarg to from_pretrained with hf-mount auto-detection (#45547) by @rtrompier in [#45547]
fix(DSV3): parity between native DeepseekV3MoE and remote official implementation (#45441) by @casinca in [#45441]
[modular] Fix modular logic broken in #45045 (#45539) by @Cyrilvallez in [#45539]
Fix: propagate quantization_config to text sub-config for composite models in AutoModelForCausalLM (#45494) by @lvliang-intel in [#45494]
T5Gemma2: fix prepare_decoder_input_ids_from_labels (#45516) by @Tokarak in [#45516]
[Trainer] Add ddp_static_graph option (#45519) by @KeitaW in [#45519]
Add dtype config options for Four Over Six (#45367) by @jackcook in [#45367]
[Sam3LiteText] Remove unnecessary modules/configs (#45535) by @yonigozlan in [#45535]
Fix conditional check for float formatting (#44425) by @qgallouedec in [#44425]
Fix AMD CI: rebuild torchvision with libjpeg + refresh expectations (#45533) by @Abdennacer-Badaoui in [#45533]
Reapply modular to examples (#45527) by @Cyrilvallez in [#45527]
qa: re-run modular converter when the script itself is modified (#45528) by @tarekziade in [#45528]
[GGUF] Reduce peak RAM usage by casting dequantized tensors early during load (#45386) by @UsamaKenway in [#45386]
Fix CSM TextToAudioPipeline missing <bos> token (#45525) by @jiqing-feng in [#45525]
[Conversion Mapping] Small fixups (#45483) by @vasqu in [#45483]
fix: return empty tuple from import_protobuf_decode_error when protobuf is unavailable (#45486) by @jw9603 in [#45486]
throw error when conversion required (#45078) by @itazap in [#45078]
chore: bump doc-builder SHA for PR upload workflow (#45450) by @rtrompier in [#45450]
xpu output align with cuda in test case (#45526) by @sywangyi in [#45526]
chore(qa): split out mlinter (#45475) by @tarekziade in [#45475]
[loading] Clean way to add/remove full parts in checkpoint names (#45448) by @Cyrilvallez in [#45448]
Fix Zamba2MambaMixer ignoring use_mamba_kernels=False (#44853) by @sergiopaniego in [#44853]
revert sha commit pointing to main for transformers_amd_ci_ workflows (#45495) by @paulinebm in [#45495]
Fix ZeRO-3 from_pretrained: load registered buffers in _load_state_dict_into_zero3_model (#45402) by @saslifat-gif in [#45402]
Remove redundant condition checks in get_image_size method (#45461) by @JiauZhang in [#45461]
Add check-auto in repo-consistency and fix sorting (#45481) by @zucchini-nlp in [#45481]
Fix typos in src/transformers/utils/output_capturing.py (#45269) by @ryota-komatsu in [#45269]
typing: rule 15 - checks for tie_word_embeddings presence (#44988) by @tarekziade in [#44988]
[CB] Fix capture of max_seqlen (#45323) by @remi-or in [#45323]
Minor update (#45484) by @ydshieh in [#45484]
Add Neuron to auto-compile hardware list (#44757) by @dacorvo in [#44757]
Allow loading Qwen Thinker 'base' models without generative head (#45457) by @tomaarsen in [#45457]
[fix] Always early return for non-Mistral models in _patch_mistral_regex (#45444) by @tomaarsen in [#45444]
Fix spurious position_ids warnings for at least 40 architectures (#45437) by @tomaarsen in [#45437]
[fix] Make Qwen2_5OmniProcessor warning a lot less noisy via warning_once (#45455) by @tomaarsen in [#45455]
Dynamic auto mapping (#45018) by @zucchini-nlp in [#45018]
[docs] vlm addition (#45271) by @stevhliu in [#45271]
fix: dont download artifacts from the test hub (#45319) by @tarekziade in [#45319]
fix(clipseg): fix 2 failing tests (#45403) by @kaixuanliu in [#45403]
[docs] @auto_docstring decorator (#45130) by @stevhliu in [#45130]
Fix Sam3Processor missing input_boxes_labels for padded None entries (#45171) by @Kash6 in [#45171]
better grad acc tests (#45434) by @SunMarc in [[#45434](https://redirect.github.com/huggingface/transformer

✂ Note

PR body was truncated to here.

Configuration

📅 Schedule: (UTC)

Branch creation
- At any time (no schedule defined)
Automerge
- At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.

If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate.

| datasource | package | from | to |
| ---------- | ------------ | ------ | ----- |
| pypi | psutil | 6.1.1 | 7.2.2 |
| pypi | transformers | 4.57.6 | 5.7.0 |

dreadnode-renovate-bot Bot added the type/digest Dependency digest updates label Feb 24, 2026

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch 3 times, most recently from 07525d6 to 3ac3e72 Compare March 1, 2026 00:53

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch from 3ac3e72 to 4daa5d1 Compare March 8, 2026 00:48

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch 2 times, most recently from 3e0d62f to 4b95150 Compare April 1, 2026 00:57

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch from 4b95150 to 40a28f1 Compare April 8, 2026 00:52

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch 2 times, most recently from 85f7052 to c4f4579 Compare April 19, 2026 00:59

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch from c4f4579 to 37b26b9 Compare April 26, 2026 01:01

chore(deps): update loader dependencies major

ca4e25e

| datasource | package | from | to |
| ---------- | ------------ | ------ | ----- |
| pypi | psutil | 6.1.1 | 7.2.2 |
| pypi | transformers | 4.57.6 | 5.7.0 |

dreadnode-renovate-bot Bot force-pushed the renovate/major-loader-deps-major branch from 37b26b9 to ca4e25e Compare May 3, 2026 01:07

dreadnode-renovate-bot[bot] wants to merge 1 commit into main from renovate/major-loader-deps-major
