[ROCm] Add MI355X-only MiniMax-M3 MXFP4 variant #580

Merged: ywang96 merged 5 commits into vllm-project:main from functionstackx:codex/add-minimax-m3-mxfp4 on Jun 25, 2026
@functionstackx functionstackx commented Jun 25, 2026

Summary

  • add amd/MiniMax-M3-MXFP4 as an MXFP4 variant of the existing MiniMaxAI/MiniMax-M3 recipe
  • use the ROCm nightly image and the validated TP8/encoder settings on MI355X
  • add variant-level hardware allowlists and exact hardware overrides so MXFP4 is unavailable everywhere except MI355X
  • remove stale generated hardware artifacts when compatibility shrinks
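
A minimal sketch of how a variant-level allowlist check might look. The names `isVariantHardwareSupported` and `supported_hardware` appear in the review thread of this PR; the function body, the "no allowlist means unrestricted" rule, and the sample data are assumptions for illustration, not the merged implementation.

```javascript
// Hypothetical sketch: the real helper in the recipes site may differ.
// Assumption: a variant without `supported_hardware` is unrestricted.
function isVariantHardwareSupported(variant, hwId) {
  const allowlist = variant?.supported_hardware;
  if (!Array.isArray(allowlist) || allowlist.length === 0) return true;
  return allowlist.includes(hwId);
}

// Illustrative data only; the hardware IDs are assumed to be lowercase keys.
const mxfp4 = { precision: "mxfp4", supported_hardware: ["mi355x"] };
const bf16 = { precision: "bf16" };

console.log(isVariantHardwareSupported(mxfp4, "mi355x")); // true
console.log(isVariantHardwareSupported(mxfp4, "b200"));   // false
console.log(isVariantHardwareSupported(bf16, "b200"));    // true
```

Keeping the allowlist on the variant rather than on the precision is what lets MXFP4 stay MI355X-only for this checkpoint without affecting other recipes.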

Why

The AMD Quark MXFP4 checkpoint is currently supported only on MI355X. Treating MXFP4 as a generally selectable precision produced invalid commands for NVIDIA and older AMD hardware.

Local verification

Based on vllm-project/vllm#45794.

Accuracy (GSM8K) and performance verified: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/28195297568/job/83520506068?pr=1935

SemiAnalysisAI/InferenceX#1935

User impact

Selecting MXFP4 now selects MI355X automatically. On every other hardware profile, the MXFP4 pill is disabled with an MI355X-only explanation. Generated API data for the promoted MXFP4 checkpoint exposes only MI355X.
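
The selection behavior described here can be sketched as a small normalization step. This is an illustration under stated assumptions: `normalizeHardware` and `knownHwIds` are hypothetical names, and falling back to the first allowed entry is an assumed policy rather than the site's confirmed logic.

```javascript
// Hypothetical helper: choose the hardware to display for a requested
// (variant, hardware) pair, falling back when the request is not allowed.
function normalizeHardware(variant, requestedHwId, knownHwIds) {
  // Assumption: no allowlist means every known profile is allowed.
  const allowlist = variant.supported_hardware ?? knownHwIds;
  // Ignore allowlist entries that are not known hardware profiles.
  const allowed = allowlist.filter((id) => knownHwIds.includes(id));
  if (allowed.includes(requestedHwId)) return requestedHwId;
  return allowed[0] ?? null;
}

// Illustrative IDs only.
const knownHwIds = ["b200", "mi300x", "mi355x"];
const mxfp4Variant = { precision: "mxfp4", supported_hardware: ["mi355x"] };

console.log(normalizeHardware(mxfp4Variant, "b200", knownHwIds));   // mi355x
console.log(normalizeHardware(mxfp4Variant, "mi355x", knownHwIds)); // mi355x
```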

Validation

  • node scripts/build-recipes-api.mjs — 142 models, 116 promoted variants
  • node --check src/lib/command-synthesis.js
  • node --check scripts/build-recipes-api.mjs
  • verified in the local browser that MXFP4 is disabled on B200
  • verified an invalid B200 + MXFP4 URL normalizes to MI355X
  • verified the MXFP4 API hardware index and generated files contain only mi355x
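
One way to detect stale generated hardware artifacts when compatibility shrinks is sketched below. The `<hwId>.json` file layout and the name `findStaleHardwareFiles` are assumptions made for the example; the actual output structure of `scripts/build-recipes-api.mjs` is not shown in this PR.

```javascript
// Hypothetical helper: given the generated file names on disk and the
// hardware IDs that are still supported, list the files to delete.
// Assumption: one "<hwId>.json" file per supported hardware profile.
function findStaleHardwareFiles(existingFiles, supportedHwIds) {
  const keep = new Set(supportedHwIds.map((id) => `${id}.json`));
  return existingFiles.filter(
    (name) => name.endsWith(".json") && !keep.has(name),
  );
}

// Illustrative directory listing only.
const onDisk = ["b200.json", "mi300x.json", "mi355x.json", "README.md"];
const stale = findStaleHardwareFiles(onDisk, ["mi355x"]);
console.log(stale); // lists b200.json and mi300x.json
```

A build script could then pass each returned name to Node's `fs.rmSync` before writing fresh output, so that files for hardware that is no longer supported do not linger.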

vercel Bot commented Jun 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vllm-recipes | Ready | Preview, Comment | Jun 25, 2026 8:11pm |

Signed-off-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Signed-off-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Signed-off-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Signed-off-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
Signed-off-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
@functionstackx functionstackx force-pushed the codex/add-minimax-m3-mxfp4 branch from 2d841ac to 190b98e Compare June 25, 2026 20:08

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for variant-level hardware allowlists (e.g., supported_hardware) and hardware-specific overrides (such as docker_image, extra_args, and extra_env). It also adds a new mxfp4 variant for the MiniMax-M3 model targeting AMD Instinct MI355X hardware. The review feedback highlights a potential runtime crash when handling invalid hardware IDs in the URL query parameters, and suggests improving consistency in the UI by using the variant's label instead of just its precision when displaying disabled hardware reasons.
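
A rough sketch of how exact hardware overrides might be applied on top of a variant. The field names `docker_image`, `extra_args`, and `extra_env` come from the summary above; the container key `hardware_overrides`, the function name `resolveLaunchSettings`, the merge rules (replace image, append args, merge env), and all sample values are assumptions for illustration.

```javascript
// Hypothetical merge of a per-hardware override into variant defaults.
// Only an exact hardware ID match applies; anything else gets the defaults.
function resolveLaunchSettings(variant, hwId) {
  const override = variant.hardware_overrides?.[hwId] ?? {};
  return {
    docker_image: override.docker_image ?? variant.docker_image,
    extra_args: [...(variant.extra_args ?? []), ...(override.extra_args ?? [])],
    extra_env: { ...(variant.extra_env ?? {}), ...(override.extra_env ?? {}) },
  };
}

// Placeholder values; the image names and env var are not real.
const exampleVariant = {
  docker_image: "example.invalid/base:latest",
  extra_args: [],
  hardware_overrides: {
    mi355x: {
      docker_image: "example.invalid/rocm-nightly:placeholder",
      extra_args: ["--tensor-parallel-size", "8"],
      extra_env: { EXAMPLE_FLAG: "1" },
    },
  },
};

console.log(resolveLaunchSettings(exampleVariant, "mi355x"));
console.log(resolveLaunchSettings(exampleVariant, "b200"));
```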

Comment on lines +353 to +356
const requestedHwAllowed = requestedHwId
&& isPrecisionCompatible(requestedHwProfile, requestedVariant)
&& isHardwareSupported(recipe, requestedHwId)
&& isVariantHardwareSupported(requestedVariant, requestedHwId);

high

If requestedHwId is not a valid hardware profile ID (e.g., one passed via URL query parameters), taxonomy.hardware_profiles?.[requestedHwId] is undefined. Because requestedHwId is still truthy, evaluation continues to isPrecisionCompatible(requestedHwProfile, requestedVariant). If the variant has a precision constraint, matchesConstraint will try to read profile.brand on undefined. If the profile instead falls back to {}, nothing crashes, but the check returns true when the constraint is undefined, so hwId ends up set to an invalid ID. Either case can cause runtime crashes or broken UI state. We should explicitly verify that requestedHwId exists in taxonomy.hardware_profiles before allowing it.

Suggested change
- const requestedHwAllowed = requestedHwId
-   && isPrecisionCompatible(requestedHwProfile, requestedVariant)
-   && isHardwareSupported(recipe, requestedHwId)
-   && isVariantHardwareSupported(requestedVariant, requestedHwId);
+ const requestedHwAllowed = requestedHwId
+   && !!taxonomy.hardware_profiles?.[requestedHwId]
+   && isPrecisionCompatible(requestedHwProfile, requestedVariant)
+   && isHardwareSupported(recipe, requestedHwId)
+   && isVariantHardwareSupported(requestedVariant, requestedHwId);
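
The existence check in this suggestion can be illustrated in isolation. The sketch below uses a stand-in `taxonomy` object and a hypothetical helper named `isKnownHardware`; it also shows one possible hardening with `Object.hasOwn`, since a plain truthiness lookup on an ordinary object also accepts inherited keys such as `constructor`. Whether that edge case matters for the real taxonomy object is an assumption.

```javascript
// Stand-in taxonomy for illustration; the real one is loaded by the site.
const taxonomy = {
  hardware_profiles: {
    mi355x: { brand: "amd", display_name: "MI355X" },
    b200: { brand: "nvidia", display_name: "B200" },
  },
};

// Hypothetical helper: true only for IDs that are own keys of
// hardware_profiles, so URL-supplied garbage is rejected early.
function isKnownHardware(taxonomy, hwId) {
  const profiles = taxonomy?.hardware_profiles;
  return typeof hwId === "string"
    && profiles != null
    && Object.hasOwn(profiles, hwId);
}

console.log(isKnownHardware(taxonomy, "mi355x"));    // true
console.log(isKnownHardware(taxonomy, "not-a-gpu")); // false
// A truthiness check alone would accept this inherited key:
console.log(!!taxonomy.hardware_profiles?.["constructor"]); // true
console.log(isKnownHardware(taxonomy, "constructor"));      // false
```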

Comment on lines +1248 to 1251
const reason = !variantHardwareOk
? `${currentVariant.precision?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
: !precisionOk
? `${currentVariant.precision?.toUpperCase()} requires NVIDIA Blackwell`

medium

For consistency with other parts of the component (such as line 1516) and to support custom variant labels correctly, we should use (currentVariant.label || currentVariant.precision) instead of just currentVariant.precision when rendering the error message.

Suggested change
- const reason = !variantHardwareOk
-   ? `${currentVariant.precision?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
-   : !precisionOk
-   ? `${currentVariant.precision?.toUpperCase()} requires NVIDIA Blackwell`
+ const reason = !variantHardwareOk
+   ? `${(currentVariant.label || currentVariant.precision)?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
+   : !precisionOk
+   ? `${(currentVariant.label || currentVariant.precision)?.toUpperCase()} requires NVIDIA Blackwell`
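
Because the same message is built in more than one place, the label fallback could be factored into a shared helper. This is a sketch only: `variantDisplayName` and `unsupportedHardwareReason` are hypothetical names, and the sample variant and profile data are made up.

```javascript
// Hypothetical helper: prefer a custom label, fall back to the precision.
function variantDisplayName(variant) {
  return (variant?.label || variant?.precision || "").toUpperCase();
}

// Builds the "only supported on ..." message from the variant allowlist,
// using display names where the profile is known and raw IDs otherwise.
function unsupportedHardwareReason(variant, profiles) {
  const names = (variant.supported_hardware || [])
    .map((hw) => profiles?.[hw]?.display_name || hw)
    .join(", ");
  return `${variantDisplayName(variant)} is only supported on ${names}`;
}

// Illustrative data only.
const profiles = { mi355x: { display_name: "MI355X" } };
const variant = { precision: "mxfp4", supported_hardware: ["mi355x"] };

console.log(unsupportedHardwareReason(variant, profiles));
// MXFP4 is only supported on MI355X
```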

Comment on lines +1458 to 1461
const reason = !variantHardwareOk
? `${currentVariant.precision?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
: !precisionOk
? `${currentVariant.precision?.toUpperCase()} requires NVIDIA Blackwell`

medium

For consistency with other parts of the component (such as line 1516) and to support custom variant labels correctly, we should use (currentVariant.label || currentVariant.precision) instead of just currentVariant.precision when rendering the error message.

Suggested change
- const reason = !variantHardwareOk
-   ? `${currentVariant.precision?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
-   : !precisionOk
-   ? `${currentVariant.precision?.toUpperCase()} requires NVIDIA Blackwell`
+ const reason = !variantHardwareOk
+   ? `${(currentVariant.label || currentVariant.precision)?.toUpperCase()} is only supported on ${(currentVariant.supported_hardware || []).map((hw) => taxonomy.hardware_profiles?.[hw]?.display_name || hw).join(", ")}`
+   : !precisionOk
+   ? `${(currentVariant.label || currentVariant.precision)?.toUpperCase()} requires NVIDIA Blackwell`

@functionstackx functionstackx marked this pull request as ready for review June 25, 2026 20:10
@hongxiayang

Suggest updating the PR title: replace [codex] with [AMD] or [ROCm].

@functionstackx functionstackx changed the title [codex] Add MI355X-only MiniMax-M3 MXFP4 variant Add MI355X-only MiniMax-M3 MXFP4 variant Jun 25, 2026
@functionstackx functionstackx changed the title Add MI355X-only MiniMax-M3 MXFP4 variant [ROCm] Add MI355X-only MiniMax-M3 MXFP4 variant Jun 25, 2026
@ywang96 ywang96 merged commit e2f883e into vllm-project:main Jun 25, 2026
4 checks passed