[ROCm] Enable FlyDSL w4a16 MoE for Kimi INT4 #552
Code Review
This pull request updates the ROCm deployment configuration for Kimi-K2.5 in the markdown documentation by removing obsolete environment variables and adding new CLI arguments (--moe-backend flydsl and --compilation-config). The review feedback points out that these changes should be consistently applied to other deployment configurations, such as the Docker run command and YAML configuration files. Additionally, a duplicated --mm-encoder-tp-mode data argument was identified in the updated command block.
moonshotai/Kimi-K2.5.md (242-243)
The environment variables VLLM_ROCM_QUICK_REDUCE_QUANTIZATION and VLLM_ROCM_USE_AITER_RMSNORM have been removed here, but they are still present in the AMD (ROCm) Docker run command (lines 105-106) and the YAML configuration (models/moonshotai/Kimi-K2.5.yaml lines 103-104). To maintain consistency across all deployment methods, please update those sections as well.
moonshotai/Kimi-K2.5.md (250-251)
The new arguments --moe-backend flydsl and --compilation-config are added here, but they are missing from the AMD (ROCm) Docker run command (lines 108-118) and the YAML configuration (models/moonshotai/Kimi-K2.5.yaml). Please update those configurations to enable the FlyDSL MoE backend consistently. Additionally, the --mm-encoder-tp-mode data argument is duplicated in this command block (on line 244 and line 249). One of them should be removed.
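Both kinds of issue raised above (a repeated flag in one command block, and removed environment variables lingering in other deployment snippets) can be caught mechanically. The following is a minimal sketch, not part of this PR: the function names, the shortened `vllm serve` command, and the Docker line with its variable value are hypothetical stand-ins for the real doc contents, and only the flag and variable names are taken from the review.

```python
import shlex
from collections import Counter

# Environment variables the review says were removed from the markdown command.
REMOVED_ENV_VARS = (
    "VLLM_ROCM_QUICK_REDUCE_QUANTIZATION",
    "VLLM_ROCM_USE_AITER_RMSNORM",
)


def duplicated_flags(command: str) -> list[str]:
    """Return long flags ('--name') that appear more than once in a command."""
    # Join backslash-continued lines so shlex sees one logical command line.
    tokens = shlex.split(command.replace("\\\n", " "))
    # Treat '--flag=value' and '--flag value' alike by keeping only the name.
    names = [t.split("=", 1)[0] for t in tokens if t.startswith("--")]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def stale_env_vars(text: str) -> list[str]:
    """Return the removed environment variables still mentioned in text."""
    return [var for var in REMOVED_ENV_VARS if var in text]


# Hypothetical, shortened command block (not the real one from the doc).
example_command = """vllm serve moonshotai/Kimi-K2.5 \\
  --mm-encoder-tp-mode data \\
  --moe-backend flydsl \\
  --mm-encoder-tp-mode data
"""

# Hypothetical Docker snippet; the value '1' is a placeholder.
example_docker = "docker run -e VLLM_ROCM_USE_AITER_RMSNORM=1 some-image"

print(duplicated_flags(example_command))  # ['--mm-encoder-tp-mode']
print(stale_env_vars(example_docker))     # ['VLLM_ROCM_USE_AITER_RMSNORM']
```

Running the same two checks over the markdown command block, the Docker run command, and the YAML file would flag the inconsistencies listed in the review before they reach readers.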
Replace the default Triton w4a16 MoE kernel with the more performant FlyDSL implementation for Kimi INT4 on MI355X.