Pull requests: SemiAnalysisAI/InferenceX
[WIP] perf: update MiniMax-M3 FP4 B300 vLLM
full-sweep-fail-fast
#1990
opened Jul 2, 2026 by
anish-shanbhag
Collaborator
perf(dsv4-fp4-mi355x-vllm): use AITER a16w4 MoE backend (+21% decode)
full-sweep-enabled
#1989
opened Jul 2, 2026 by
jiacao-amd
Collaborator
Draft
[WIP] [do not merge] Add MiniMax-M3 FP4 B200 Dynamo-vLLM disagg config
full-sweep-fail-fast-no-canary
Full sweep, no canary gate; first failure in a matrix cancels that matrix
#1982
opened Jul 2, 2026 by
jasonlizhengjian
Collaborator
[AMD] DeepSeek-V4 FP4 MI355X vLLM MTP: bump image to latest nightly
full-sweep-fail-fast
#1981
opened Jul 2, 2026 by
Fangzhou-Ai
Collaborator
[AMD] DeepSeek-V4 FP4 MI355X vLLM STP: bump image to latest nightly
full-sweep-fail-fast
#1980
opened Jul 2, 2026 by
Fangzhou-Ai
Collaborator
[AMD] MiniMax-M3 FP4 MI355X vLLM MTP: close gap vs ATOM (INT4 all-reduce + index-sharing)
full-sweep-fail-fast
#1979
opened Jul 2, 2026 by
Fangzhou-Ai
Collaborator
[WIP] Update Minimax M3 FP4 vllm
full-sweep-enabled
#1978
opened Jul 2, 2026 by
wzhao18
Collaborator
[AMD] MiniMax-M3 FP4 MI355X vLLM STP: close gap vs ATOM (INT4 all-reduce + index-sharing)
full-sweep-fail-fast
#1969
opened Jul 1, 2026 by
Fangzhou-Ai
Collaborator
test the GB300 cluster after the node patch
full-sweep-enabled
#1961
opened Jun 30, 2026 by
richardhuo-nv
Collaborator
chore(deps): bump the github-actions group across 1 directory with 2 updates
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#1960
opened Jun 30, 2026 by
dependabot
Bot
[Klaud Cold] [AMD] Enable AITER MoE for MiniMax-M3 MI355X FP4 vLLM MTP benchmark
full-sweep-fail-fast
#1958
opened Jun 30, 2026 by
functionstackx
Collaborator
Update Qwen3.5 FP4 MI355X MTP recipe with tuned env/flags
#1957
opened Jun 29, 2026 by
amd-fuyuajin
Collaborator
[merging June 30 at 4pm PT] making this an hard guideline & enforcing consistent reviews on upstream sglang/vllm docker repo to PR CheckList
#1956
opened Jun 29, 2026 by
functionstackx
Collaborator
[AMD] Enable AITER MoE for MiniMax-M3 MI355X vLLM MTP benchmarks
#1955
opened Jun 29, 2026 by
Fangzhou-Ai
Collaborator
Draft
2 of 3 tasks
[AMD] Tune MiniMax-M3 MXFP8 MI300X vLLM: async scheduling + big-prefill, fix conc256 EP8→EP1
full-sweep-enabled
#1951
opened Jun 29, 2026 by
ZhengGong-amd
Collaborator
7 of 8 tasks
[AMD] Update MiniMax-M3-MXFP4 MI355X vLLM disagg perf and config
full-sweep-enabled
#1943
opened Jun 26, 2026 by
Duyi-Wang
Collaborator
[AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623
AMD
evals-only
Suppress throughput and run only eval jobs; combine with all-evals to expand selection
#1940
opened Jun 26, 2026 by
seungrokj
Collaborator
8 tasks
[AMD] Add MiniMax-M3-FP8 MI355X ATOMESH update 0623
AMD
evals-only
Suppress throughput and run only eval jobs; combine with all-evals to expand selection
#1930
opened Jun 25, 2026 by
seungrokj
Collaborator
8 tasks
Add GLM-5-FP8 GB300 multinode dynamo-sglang MTP benchmark
full-sweep-enabled
#1907
opened Jun 23, 2026 by
hshrivastava-droid
Collaborator