
[Do Not Merge] Upgrade Kimi-K2.5-INT4-MI355X-vLLM image to upstream daily image bcc2306cefa4179c548d3e638e7a22a88d281733#1066

Open
chunfangamd wants to merge 4 commits into main from chun/kimi-k2.5-int4

Conversation

@chunfangamd (Collaborator) commented Apr 17, 2026

  • Upgrade the Kimi-K2.5-INT4-MI355X-vLLM image to upstream daily image vllm/vllm-openai-rocm:nightly-bcc2306cefa4179c548d3e638e7a22a88d281733
  • Reduce --gpu-memory-utilization from 0.95 to 0.88
  • Remove --mm-encoder-tp-mode data flag
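The practical effect of lowering --gpu-memory-utilization can be sketched with back-of-envelope arithmetic. This is illustrative only; the 288 GB HBM figure for the MI355X and the idea that vLLM claims roughly total_hbm * utilization per GPU are assumptions for this sketch, not details taken from the PR.

```python
def vllm_memory_budget_gb(total_hbm_gb: float, gpu_memory_utilization: float) -> float:
    """Approximate HBM (GB) that vLLM will claim per GPU.

    vLLM reserves roughly total_hbm * gpu_memory_utilization for model
    weights, activations, and KV cache; the rest is left as headroom for
    other allocations (e.g. graph capture, framework overhead).
    """
    return total_hbm_gb * gpu_memory_utilization


# Assumed: MI355X with 288 GB HBM3e per GPU.
old_budget = vllm_memory_budget_gb(288, 0.95)  # 273.6 GB claimed before this PR
new_budget = vllm_memory_budget_gb(288, 0.88)  # 253.44 GB claimed after this PR
extra_headroom = old_budget - new_budget       # ~20 GB freed per GPU
```

Dropping the utilization target trades some KV-cache capacity for extra headroom, which is a common compatibility adjustment when a new image's baseline memory overhead grows.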

e2e Test: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24568914678

@github-actions (Contributor) commented:

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If they are not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that all GitHub Actions jobs fully pass after merging. Much of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If re-running failed jobs is attempted, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@claude claude bot (Contributor) left a comment

LGTM — straightforward image upgrade with tuning adjustments.


Overview

This PR modifies two files: the AMD master config YAML to update the Docker image tag for kimik2.5-int4-mi355x-vllm from v0.18.0 to a pinned nightly build, and the corresponding benchmark shell script to reduce --gpu-memory-utilization from 0.95 to 0.88 and remove the --mm-encoder-tp-mode data flag.

Security risks

None. No auth, crypto, or permissions code is touched. The image reference uses a specific commit hash, which is a good practice for reproducibility.

Level of scrutiny

Low. This is a mechanical benchmark config update — image version bump plus parameter tuning — following the same patterns used throughout the rest of amd-master.yaml. The GPU memory utilization change and flag removal are typical compatibility adjustments when upgrading to a new image version.

Other factors

No bugs were found by the automated system. The change is small (3 effective lines), self-contained, and consistent with how other entries in this file are structured. The nightly image is pinned to a specific commit hash, reducing reproducibility risk.

@chunfangamd chunfangamd changed the title Upgrade Kimi-K2.5-INT4-MI355X-vLLM image to upstream daily image bcc2306cefa4179c548d3e638e7a22a88d281733 [Do Not Merge] Upgrade Kimi-K2.5-INT4-MI355X-vLLM image to upstream daily image bcc2306cefa4179c548d3e638e7a22a88d281733 Apr 17, 2026

 kimik2.5-int4-mi355x-vllm:
-  image: vllm/vllm-openai-rocm:v0.18.0
+  image: vllm/vllm-openai-rocm:nightly-bcc2306cefa4179c548d3e638e7a22a88d281733
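With the pinned nightly tag, pulling the image and launching the server might look like the following sketch. The model identifier and the tensor-parallel setting are assumptions for illustration; only the image tag, the 0.88 utilization value, and the removal of --mm-encoder-tp-mode come from this PR.

    docker pull vllm/vllm-openai-rocm:nightly-bcc2306cefa4179c548d3e638e7a22a88d281733

    # Inside the container. Note --gpu-memory-utilization lowered to 0.88 and
    # no --mm-encoder-tp-mode flag (removed by this PR). Model path and TP size
    # are hypothetical placeholders.
    vllm serve <kimi-k2.5-int4-model-path> \
        --gpu-memory-utilization 0.88 \
        --tensor-parallel-size 8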
A reviewer (Contributor) commented on this line: do you have a release image that is functional?

@functionstackx (Contributor) commented:

@chunfangamd there is still a perf regression

[image attachment: benchmark results]
