Minimax M2.5 MXFP4 benchmark for MI355x vLLM v0.19.1 (TP=1,2,4) by functionstackx · Pull Request #827 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-01T04:14:09Z

Add MiniMax M2.5 MXFP4 benchmark config for MI355x with vLLM v0.17.1, now that AMD's MXFP4 checkpoint is out: https://huggingface.co/amd/MiniMax-M2.5-MXFP4

Model: amd/MiniMax-M2.5-MXFP4
Image: vllm/vllm-openai-rocm:v0.17.1
TP=2 and TP=4 (matching MiniMax M2.5 FP8 pattern)
VLLM_ROCM_USE_AITER=1, with AITER MoE fallback for TP>=4
Seq lengths: 1k1k, 1k8k, 8k1k (conc 4-64)
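
To make the size of this search space concrete, here is a minimal Python sketch that enumerates the sweep described above. It is an illustration, not the repository's config format. The dictionary keys, the reading of "1k" as 1024 tokens, and the doubling of concurrency from 4 to 64 are all assumptions.

```python
from itertools import product

MODEL = "amd/MiniMax-M2.5-MXFP4"
TP_SIZES = [2, 4]

# Assumption: "1k" means 1024 tokens and "8k" means 8192 tokens.
# Each entry maps a label to (input_len, output_len).
SEQ_LENS = {
    "1k1k": (1024, 1024),
    "1k8k": (1024, 8192),
    "8k1k": (8192, 1024),
}

# Assumption: concurrency doubles from 4 up to 64.
CONCURRENCIES = [4, 8, 16, 32, 64]


def sweep_points():
    """Return one dict per (TP, sequence length, concurrency) combination."""
    return [
        {"model": MODEL, "tp": tp, "seq": name,
         "isl": isl, "osl": osl, "conc": conc}
        for tp, (name, (isl, osl)), conc
        in product(TP_SIZES, SEQ_LENS.items(), CONCURRENCIES)
    ]


if __name__ == "__main__":
    points = sweep_points()
    print(f"{len(points)} benchmark points")
    for p in points[:3]:
        print(p)
```

Under these assumptions the sweep has 2 x 3 x 5 = 30 points. Disabling one TP value removes 15 of them.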

Closes #826

Generated with Claude Code

functionstackx · 2026-03-01T04:43:02Z

@claude can u comment out TP=4 and only do TP=2 for now since TP=4 is bugged vllm-project/vllm#35637

Klaud-Cold · 2026-03-01T04:43:24Z

Claude finished @functionstackx's task in 53s.

Commented out TP=4 for MiniMax M2.1 MXFP4

Due to vllm-project/vllm#35637, TP=4 has been disabled. Only TP=2 will run.

Read current benchmark files
Comment out TP=4 in .github/configs/amd-master.yaml (all 3 seq-len configs)
Update perf-changelog.yaml with PR link and description
Commit and push changes

Changes made:

.github/configs/amd-master.yaml: Commented out all 3 tp: 4 search-space entries under minimaxm2.1-fp4-mi355x-vllm with a reference to the vLLM issue
perf-changelog.yaml: Updated entry to reflect TP=2 only and added a link to PR #827

functionstackx · 2026-03-01T07:11:04Z

@chunfangamd I am seeing good MXFP4 performance compared to FP8. Just waiting for AMD to release an MXFP4 version of MiniMax M2.5 now. Currently AMD only has an MXFP4 version of MiniMax M2.1.

chunfangamd

Looks good to me


seungrokj · 2026-04-20T03:49:28Z

@benenzhu vllm/vllm-openai-rocm:v0.19.1 is ready, can you submit a PR? cc @ajith-sirra-amd @functionstackx

functionstackx · 2026-04-20T03:52:59Z

Feel free to ping me for a quick review when u have a PR for MiniMax MXFP4 that is working and has passed PR validation, and there is a vllm-project/recipes update (if needed)

benenzhu · 2026-04-20T03:56:03Z

https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24647578553/job/72063351980?pr=827

@seungrokj Hi, I retriggered the CI. It seems the huggingface dir on the mia1-p01-g12 machine is broken now, so it can't load the model. Only this machine fails.

functionstackx · 2026-04-20T04:18:38Z

@chunfangamd can u take a look at debugging this? Probably previous runs have corrupted the huggingface model cache from @benenzhu testing. The fix is just rm this minimax ckpt folder on the cluster
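
A rough Python sketch of the cleanup-and-retry idea discussed here, for illustration only. It is not the script in this PR. The helper names, the injected `download` callable, and the retry count are hypothetical. The only detail taken from Hugging Face itself is the hub cache layout, where each model repo lives in a folder named `models--<org>--<name>`.

```python
import shutil
import time
from pathlib import Path
from typing import Callable


def hf_cache_folder(repo_id: str, hub_cache: Path) -> Path:
    """Folder the Hugging Face hub cache uses for a model repo."""
    return hub_cache / ("models--" + repo_id.replace("/", "--"))


def download_with_retry(download: Callable[[], None], repo_id: str,
                        hub_cache: Path, attempts: int = 3,
                        delay_s: float = 0.0) -> int:
    """Call download(); on failure remove the cached checkpoint and retry.

    Returns the number of the attempt that succeeded and re-raises the
    last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            download()
            return attempt
        except Exception:
            # A partial or corrupted checkpoint can keep failing, so drop it.
            shutil.rmtree(hf_cache_folder(repo_id, hub_cache),
                          ignore_errors=True)
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise ValueError("attempts must be >= 1")
```

On a shared runner the cache folder may be owned by another user. A plain `shutil.rmtree` would then fail silently because of `ignore_errors=True`, which is presumably why the commits in this PR switch to `sudo` for the permission change and removal.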

benenzhu · 2026-04-20T04:23:19Z

> @chunfangamd can u take a look at debugging this? Probably previous runs have corrupted the huggingface model cache from @benenzhu testing. The fix is just rm this minimax ckpt folder on the cluster

@functionstackx Hi, I think that's a GitHub runner problem on this machine; the docker command also fails.

functionstackx · 2026-04-20T05:10:29Z

> > @chunfangamd can u take a look at debugging this? Probably previous runs have corrupted the huggingface model cache from @benenzhu testing. The fix is just rm this minimax ckpt folder on the cluster
>
> @functionstackx Hi, I think that's a GitHub runner problem on this machine; the docker command also fails.

Both @chunfangamd and @cquil11 have cluster access to fix it. @chunfangamd can u help?

benenzhu · 2026-04-20T05:29:33Z

@functionstackx Hi, I have disabled the bad machine and the actions have passed. Could you help review it?

It uses the same script as MiniMax FP8 with no additional flags, so no update to the vLLM recipes is needed.
Accuracy result:

https://inferencex.semianalysis.com/inference?unofficialRun=24648550630&g_rundate=2026-04-17&g_runid=24588340987&g_model=MiniMax-M2.5

functionstackx · 2026-04-20T06:42:35Z

@cquil11 or @Oseltamivir can u review this PR

functionstackx · 2026-04-20T06:43:08Z

I am the original author of this pull request so GitHub doesn't let me approve my own pr

functionstackx · 2026-04-20T06:45:18Z

@benenzhu one issue I see before I go to sleep is that vllm-project/recipes doesn't have a recipe for this MiniMax M2.5 MXFP4. We wanna ensure that the entire ML community benefits from ur hard work. Can u please create a recipe for it in that documentation

benenzhu · 2026-04-20T06:49:48Z

> @benenzhu one issue I see before I go to sleep is that vllm-project/recipes doesn't have a recipe for this MiniMax M2.5 MXFP4. We wanna ensure that the entire ML community benefits from ur hard work. Can u please create a recipe for it in that documentation

Yeah, thanks for the review. I will raise a PR for this.

The FP4 benchmark uses the model path amd/MiniMax-M2.5-MXFP4, so we should let people know our model's path.

Have a good night.
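
For readers looking for how that model path would be used, here is a hedged Python sketch that assembles a vLLM launch command for it. It is not the benchmark script from this PR. The helper name `build_launch` and the port are assumptions, and only `VLLM_ROCM_USE_AITER=1` and the model path come from this thread. The AITER MoE fallback for TP>=4 is not modeled.

```python
import os
import shlex

MODEL_PATH = "amd/MiniMax-M2.5-MXFP4"


def build_launch(tp: int, port: int = 8000):
    """Return (env, argv) for serving the MXFP4 checkpoint with vLLM.

    Hypothetical helper: the real benchmark script may pass other flags.
    """
    # Enable AITER kernels on ROCm, as stated in the PR description.
    env = dict(os.environ, VLLM_ROCM_USE_AITER="1")
    argv = [
        "vllm", "serve", MODEL_PATH,
        "--tensor-parallel-size", str(tp),
        "--port", str(port),
    ]
    return env, argv


if __name__ == "__main__":
    env, argv = build_launch(tp=2)
    # Print the equivalent shell command instead of starting a server.
    print("VLLM_ROCM_USE_AITER=1 " + shlex.join(argv))
```

To start the server for real, the pair could be passed to `subprocess.run(argv, env=env)` on a machine with vLLM and ROCm installed.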

benenzhu · 2026-04-20T10:15:34Z

vllm-project/recipes#300

functionstackx · 2026-04-20T10:26:27Z

Thanks @benenzhu ! Can u please merge this Kimi K2.5 mxfp4 recipe? It's been stuck unmerged for 3 weeks.

vllm-project/recipes#296

functionstackx · 2026-04-20T19:35:21Z

@chunfangamd it seems to fail
https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24661496941/job/72108418033?pr=827

functionstackx requested a review from a team March 1, 2026 04:14

functionstackx requested review from billishyahao and chunfangamd as code owners March 1, 2026 04:14

github-project-automation bot added this to InferenceMAX Board Mar 1, 2026

functionstackx added AMD sweep-enabled labels Mar 1, 2026

functionstackx changed the title ~~[AMD] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx added the sweep-enabled label Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx marked this pull request as draft March 1, 2026 23:23

chunfangamd marked this pull request as ready for review March 4, 2026 09:09

chunfangamd approved these changes Mar 4, 2026


chunfangamd changed the title ~~[WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

chunfangamd enabled auto-merge (squash) March 4, 2026 09:11

functionstackx changed the title ~~Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

functionstackx changed the title ~~[Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) Mar 20, 2026

functionstackx added sweep-enabled and removed sweep-enabled labels Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from bd10495 to e849d65 Compare March 20, 2026 01:50

functionstackx added the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from 86cc700 to b82116b Compare March 20, 2026 01:57

functionstackx removed the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from b82116b to 7dd6063 Compare March 20, 2026 01:59

functionstackx added the sweep-enabled label Mar 20, 2026

change to official vllm images

1285b17

benenzhu added sweep-enabled and removed vllm/sglang release broken -need to wait labels Apr 18, 2026

benenzhu added 5 commits April 19, 2026 01:43

Merge branch 'main' into claude/issue-826-20260301-0409

16732d9

Implement retry logic for hf download command

995890a

Add error handling for model download failure

Merge branch 'main' into claude/issue-826-20260301-0409

f4337d0

Update download failure handling with sudo commands

5b61204

Use sudo for permission changes and directory removal.

change back

269c29f

Merge branch 'main' into claude/issue-826-20260301-0409

0c66c74

benenzhu added 2 commits April 20, 2026 12:30

Update perf-changelog.yaml

958fe2c

fix tp8

5167330

functionstackx changed the title ~~[AMD] [DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)~~ Minimax M2.5 MXFP4 benchmark for MI355x vLLM v0.19.1 (TP=2,4) Apr 20, 2026

benenzhu changed the title ~~Minimax M2.5 MXFP4 benchmark for MI355x vLLM v0.19.1 (TP=2,4)~~ Minimax M2.5 MXFP4 benchmark for MI355x vLLM v0.19.1 (TP=1,2,4) Apr 20, 2026

functionstackx disabled auto-merge April 20, 2026 10:22

functionstackx added sweep-enabled and removed sweep-enabled labels Apr 20, 2026


7 participants
