Update DeepSeek-V3.2-Exp AMD recipe YAML format #546

Open
haic0 wants to merge 1 commit into vllm-project:main from haic0:haic0/replace-deepseek-v32-exp-yaml

haic0 commented Jun 15, 2026

Contributor

Summary

Test plan

  • node scripts/build-recipes-api.mjs
  • Parsed the updated YAML recipes and verified required top-level schema order against .claude/skills/add-recipe/SKILL.md.
  • Verified DCO sign-off as haic0.
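
The schema-order check in the test plan could be approximated with a small script like the one below. This is only a sketch: the key names in `REQUIRED_ORDER` are hypothetical placeholders (the real list lives in `.claude/skills/add-recipe/SKILL.md` and is not reproduced in this thread), and the helper functions are illustrative, not part of the repository. It reads unindented mapping keys with a regular expression instead of a full YAML parser, which is enough for order checking but not for validation.

```python
import re

# Hypothetical required order. The authoritative list is in SKILL.md,
# which this thread does not quote, so these names are assumptions.
REQUIRED_ORDER = ["meta", "model", "hardware_overrides", "guide"]


def top_level_keys(yaml_text: str) -> list[str]:
    """Return unindented mapping keys in the order they appear."""
    keys = []
    for line in yaml_text.splitlines():
        # A top-level key starts at column 0 and is followed by ':'.
        match = re.match(r"^([A-Za-z_][\w-]*):(\s|$)", line)
        if match:
            keys.append(match.group(1))
    return keys


def follows_order(keys: list[str], required: list[str]) -> bool:
    """True if every required key is present, in the required relative order."""
    positions = []
    for name in required:
        if name not in keys:
            return False
        positions.append(keys.index(name))
    return positions == sorted(positions)


# Made-up recipe fragment used only to exercise the helpers.
sample = """\
meta:
  title: example
model:
  id: deepseek-ai/DeepSeek-V3.2-Exp
hardware_overrides:
  amd:
    extra_args: []
guide: |
  ## Overview
"""

print(follows_order(top_level_keys(sample), REQUIRED_ORDER))
```

Extra top-level keys are tolerated here; only the relative order of the required ones is checked. A stricter check would compare the two lists for equality.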

Replaces #279.

Signed-off-by: haic0 <haic0@users.noreply.github.com>

vercel Bot commented Jun 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vllm-recipes | Ready | Preview, Comment | Jun 15, 2026 7:39am |

gemini-code-assist Bot left a comment

Code Review

This pull request adds support and documentation for running DeepSeek-V3.2-Exp on AMD ROCm GPUs (MI300X, MI325X, and MI355X), including installation steps and serving commands. Feedback on these changes highlights that the --no-enable-prefix-caching flag is invalid in vLLM and should be removed, that DeepGEMM is CUDA-only and should be explicitly skipped in the AMD installation instructions, and that the overview section needs an update to reflect the newly added AMD ROCm support.

Comment on lines +142 to +148
vllm serve deepseek-ai/DeepSeek-V3.2-Exp \
--tensor-parallel-size 8 \
--max-num-batched-tokens 32768 \
--trust-remote-code \
--no-enable-prefix-caching \
--kv-cache-dtype bfloat16 \
--block-size 1

high

The --no-enable-prefix-caching flag is not a valid vLLM CLI argument. Since vLLM uses standard argparse for boolean flags like --enable-prefix-caching, passing --no-enable-prefix-caching will result in an unrecognized arguments error and prevent the server from starting.

Since prefix caching is disabled by default, you can simply omit this flag.

Note: Please also remove "--no-enable-prefix-caching" from the hardware_overrides.amd.extra_args list on line 76 of this file to prevent similar runtime errors when the recipe is parsed.

  vllm serve deepseek-ai/DeepSeek-V3.2-Exp \
    --tensor-parallel-size 8 \
    --max-num-batched-tokens 32768 \
    --trust-remote-code \
    --kv-cache-dtype bfloat16 \
    --block-size 1
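
Whether a `--no-...` form is accepted depends on how the flag is registered with argparse, so it is worth confirming against `vllm serve --help` for the installed version before editing the recipe. The sketch below uses a stand-in parser (not vLLM's own) to show the two standard-library behaviours: `store_true` registers only the positive flag, while `argparse.BooleanOptionalAction` (Python 3.9+) also registers the negated form. The helper names `make_parser` and `unknown_args` are made up for this illustration.

```python
import argparse


def make_parser(action):
    """Build a stand-in parser with one boolean flag (not vLLM's parser)."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--enable-prefix-caching", action=action, default=None)
    return parser


def unknown_args(parser, argv):
    """Return the arguments the parser does not recognise."""
    _, extras = parser.parse_known_args(argv)
    return extras


plain = make_parser("store_true")
optional = make_parser(argparse.BooleanOptionalAction)

argv = ["--no-enable-prefix-caching"]

# store_true: the negated form is unknown to the parser.
print(unknown_args(plain, argv))     # ['--no-enable-prefix-caching']

# BooleanOptionalAction: the negated form is accepted and sets False.
print(unknown_args(optional, argv))  # []
print(optional.parse_args(argv).enable_prefix_caching)  # False
```

With `parse_args` instead of `parse_known_args`, the first case would exit with an "unrecognized arguments" error, which is the failure mode the comment above describes.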

Comment on lines +107 to +113
AMD ROCm wheel:

```bash
uv venv --python 3.12
source .venv/bin/activate
uv pip install vllm --extra-index-url https://wheels.vllm.ai/rocm/
```

medium

DeepGEMM is a CUDA-only library optimized specifically for NVIDIA Hopper and Blackwell architectures. It does not support AMD ROCm and will fail to compile/install on ROCm systems.

To prevent AMD users from running the NVIDIA-specific installation steps (which include installing DeepGEMM), we should explicitly separate the AMD ROCm installation instructions and advise them to skip the DeepGEMM step.
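
If the install steps were ever scripted, the skip could be automated with a check along these lines. This is a heuristic sketch only: it assumes the vendor tools `rocm-smi`, `rocminfo`, or `nvidia-smi` are on `PATH`, which is not guaranteed inside containers, and the function names are hypothetical.

```python
import shutil


def detect_backend(which=shutil.which):
    """Guess the GPU stack from which vendor tool is on PATH (heuristic)."""
    if which("rocm-smi") or which("rocminfo"):
        return "rocm"
    if which("nvidia-smi"):
        return "cuda"
    return "unknown"


def should_install_deepgemm(backend):
    """Install DeepGEMM only on CUDA, since the comment above calls it CUDA-only."""
    return backend == "cuda"


backend = detect_backend()
print(backend, should_install_deepgemm(backend))
```

The `which` parameter is injectable so the logic can be exercised without real GPUs, for example `detect_backend(lambda name: None)` returns `"unknown"`.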

Consider updating this section to:

  ### AMD ROCm Installation

  For AMD ROCm, do not install DeepGEMM (which is CUDA-only). Instead, install the ROCm-compatible vLLM wheel directly:

  ```bash
  uv venv --python 3.12
  source .venv/bin/activate
  uv pip install vllm --extra-index-url https://wheels.vllm.ai/rocm/
  ```

## Prerequisites

- **Hardware**: 8x H200 (or H20, or 8xB200) GPUs
- **AMD ROCm**: 8x MI300X / MI325X / MI355X GPUs

medium

With the addition of AMD ROCm support for MI300X, MI325X, and MI355X, the following statement in the Overview section (lines 91-92) is now outdated and contradictory:

> Only Hopper and Blackwell data center GPUs are supported for now.

Please update the Overview section to reflect that AMD ROCm GPUs are also supported.
