Feat/Support-Qwix-Quantization-For-NNX-Models by hsuan-lun-chiang · Pull Request #3781 · AI-Hypercomputer/maxtext

hsuan-lun-chiang · 2026-04-30T07:44:48Z

Description

This PR adds support for running Qwix quantization on NNX models, leveraging the NNX decoder architecture.

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

hsuan-lun-chiang added 4 commits April 29, 2026 03:38

Implement and update the following models in NNX decoder: DeepSeek/Gemma3/Llama4

445a879

Fix unit test after rebasing

8756746

Fix: Complete NNX support for Qwix FP8 Quantization and fix jax.lax.scan tracer leaks

9ef0db3

Fix

e06b8df

hsuan-lun-chiang marked this pull request as draft April 30, 2026 07:45

hsuan-lun-chiang changed the title ~~Feat/migrate fp8 quantization test to nnx~~ Feat/Support-Qwix-Quantization-For-NNX-Models Apr 30, 2026