@LucasWilkinson LucasWilkinson commented Oct 17, 2025

It appears the FP8 kv-cache is causing issues:

  1. Accuracy issues reported with FULL_AND_PIECEWISE cudagraphs (default cudagraph mode)
  2. DP/EP failures: "[Bug]: FlashMLA: invalid configuration argument" (#27043). NOTE: "[Bug]: DeepSeek v3.2 hits IMA on DP/EP setup" (#26605) is still outstanding.

This is hopefully a temporary measure while we stabilize the fp8 kv-cache (which is the intended way to use the model); #26152 should help with this.

Signed-off-by: Lucas Wilkinson <[email protected]>
@mergify mergify bot added the deepseek Related to DeepSeek models label Oct 17, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly disables the FP8 KV cache by default for DeepSeek V3.2 to address reported accuracy and stability issues. The change is simple, targeted, and effectively makes the feature opt-in. I have one suggestion to improve the clarity of a comment to enhance future maintainability.
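
For readers unfamiliar with the "on by default" versus "opt-in" distinction the review describes, here is a minimal sketch. It is not the code from this PR: the function names and the "auto" / "fp8" string values are assumptions chosen for illustration, and the "before" behavior is inferred from the description above rather than taken from the diff.

```python
# Illustrative sketch only; names and values are hypothetical, not vLLM's code.

def uses_fp8_before(requested: str = "auto") -> bool:
    """Assumed old behavior: FP8 KV cache unless another dtype is requested."""
    return requested == "auto" or requested.startswith("fp8")


def uses_fp8_after(requested: str = "auto") -> bool:
    """Assumed new behavior: FP8 KV cache only when explicitly requested."""
    return requested.startswith("fp8")


if __name__ == "__main__":
    for dtype in ("auto", "fp8", "bfloat16"):
        print(dtype, uses_fp8_before(dtype), uses_fp8_after(dtype))
```

The only case that changes is the unset ("auto") request, which is what makes the feature opt-in. In vLLM the KV-cache dtype is selected with the `--kv-cache-dtype` option; which value re-enables FP8 for this model is not stated in this thread.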

@LucasWilkinson LucasWilkinson added this to the v0.11.1 milestone Oct 17, 2025
LucasWilkinson and others added 2 commits October 17, 2025 16:18
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
@LucasWilkinson LucasWilkinson enabled auto-merge (squash) October 17, 2025 23:02
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 17, 2025
@LucasWilkinson LucasWilkinson merged commit c2bba69 into vllm-project:main Oct 18, 2025
53 checks passed