-
Notifications
You must be signed in to change notification settings - Fork 217
[NV] perf: update MiniMax-M3 FP4 B300 vLLM #1990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+20
−14
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 The block comment just above this line still says the NVFP4 support 'is baked into the perf container image', but the PR swaps the image from the bespoke
vllm-minimax-m3-perf-*tag to a mainlinenightly-*tag. Update the comment to say the support has landed in vLLM main and is picked up from nightly so future readers don't chase a nonexistent perf image. Note: the same wording appears at lines 12839-12840 for the EAGLE3 variant (whose image is NOT changed in this PR), so that comment remains accurate and should be left as-is.Extended reasoning...
What the bug is
At
.github/configs/nvidia-master.yamllines 12805-12810, immediately above theminimaxm3-fp4-b300-vllmentry, there is a block comment that states:That wording was true when the config used the bespoke perf-tagged image
vllm/vllm-openai:vllm-minimax-m3-perf-x86_64-13.0.1-8b00f41. This PR replaces that tag withvllm/vllm-openai:nightly-93d8f834dd8acf33eb0e2a75b2711b628cb6e226— a mainline nightly image, not a perf-variant image. After the swap, the phrase "perf container image" no longer describes what is actually being pulled.Why the invariant still holds, but the comment is misleading
The underlying claim — "no runtime patch is needed" — is still true, because vllm-project/vllm PR #46380 has presumably landed in vLLM main and the nightly image picks it up. So there is no runtime effect and nothing breaks at benchmark time. The issue is purely that the comment cites the wrong reason: a future reader inspecting this entry will look for a "perf container image" that no longer exists in the config and be confused.
Step-by-step proof
image: vllm/vllm-openai:vllm-minimax-m3-perf-x86_64-13.0.1-8b00f41. The tag literally containsminimax-m3-perf, matching "perf container image".image: vllm/vllm-openai:nightly-93d8f834dd8acf33eb0e2a75b2711b628cb6e226. The tag isnightly-<sha>; there is no "perf" in the tag name.baked into the perf container image, unchanged by this PR.Fix
Update the comment to reflect that MiniMax-M3 NVFP4 support has landed upstream and is now picked up from the mainline nightly image, e.g.:
Scope note
The same wording appears at lines 12839-12840 above the EAGLE3 variant (
minimaxm3-fp4-eagle3-b300-vllm), which does not change its image in this PR. That comment remains accurate and should be left as-is; only the copy aboveminimaxm3-fp4-b300-vllm(lines 12805-12810) needs updating.Severity
This is a documentation-only inconsistency with no runtime effect on the benchmark. Marking as nit.