Update Trinity Large Thinking ROCm command by haic0 · Pull Request #593 · vllm-project/recipes

haic0 · 2026-06-29T13:42:35Z

Summary

Adds the ROCm environment variable, trust-remote-code, TP=8, and max-model-len 32768 launch settings for Trinity Large Thinking.
Aligns the recipe launch guidance with the provided vLLM serve command.

Test plan

Ran `node scripts/build-recipes-api.mjs` on the complete validated recipe update set.

Made with Cursor

Add the ROCm env, trust-remote-code, TP=8, and max-model-len 32768 launch settings for Trinity Large Thinking.

Signed-off-by: haic0 <haichzha@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

vercel · 2026-06-29T13:42:40Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
vllm-recipes	Ready	Preview, Comment	Jun 29, 2026 1:49pm

gemini-code-assist

Code Review

This pull request updates the configuration for models/arcee-ai/Trinity-Large-Thinking.yaml by adding --trust-remote-code and --max-model-len 32768 to the base arguments, introducing AMD hardware overrides, and specifying a tensor parallel size of 8 for single-node strategy overrides. The guide's launch commands and descriptions were also updated to reflect these changes. The reviewer feedback suggests keeping the launch commands and documentation generic by removing AMD-specific environment variables and references, as the platform automatically handles AMD-specific environment variables and the configuration is also compatible with NVIDIA hardware.

gemini-code-assist · 2026-06-29T13:43:27Z

+  VLLM_ROCM_USE_AITER=1 vllm serve arcee-ai/Trinity-Large-Thinking \
+    --trust-remote-code \
+    --tensor-parallel-size 8 \
+    --max-model-len 32768


Since VLLM_ROCM_USE_AITER: "1" is already defined under hardware_overrides.amd.extra_env, the deployment platform will automatically inject this environment variable when running on AMD hardware. Hardcoding it in the generic launch command can be confusing for NVIDIA users (especially those deploying the nvfp4 variant on Blackwell GPUs). It is cleaner to keep the launch command generic.

  vllm serve arcee-ai/Trinity-Large-Thinking \
    --trust-remote-code \
    --tensor-parallel-size 8 \
    --max-model-len 32768
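
The reviewer's suggestion can be illustrated with a small shell sketch: one generic command, with the ROCm variable prepended only for AMD targets. This is a minimal sketch under stated assumptions, not how the recipes platform actually injects `extra_env`. The function name `build_launch_cmd` and its `amd` platform argument are hypothetical; only the flags and `VLLM_ROCM_USE_AITER=1` come from the command under review.

```shell
#!/bin/sh
# Hypothetical helper: print the launch command for a given platform.
# The flags are taken from the command under review; the function name and
# the "amd" platform argument are illustrative assumptions.
build_launch_cmd() {
  platform="$1"
  cmd="vllm serve arcee-ai/Trinity-Large-Thinking --trust-remote-code --tensor-parallel-size 8 --max-model-len 32768"
  if [ "$platform" = "amd" ]; then
    # Only AMD/ROCm targets get the AITER toggle; other platforms stay generic.
    cmd="VLLM_ROCM_USE_AITER=1 $cmd"
  fi
  printf '%s\n' "$cmd"
}

# Print both variants for comparison (nothing is launched here).
build_launch_cmd amd
build_launch_cmd nvidia
```

Keeping the platform-specific prefix in one branch means the documented command stays identical for every user, which is the property the review comment asks for.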

gemini-code-assist · 2026-06-29T13:43:27Z

+  VLLM_ROCM_USE_AITER=1 vllm serve arcee-ai/Trinity-Large-Thinking \
+    --trust-remote-code \
+    --tensor-parallel-size 8 \
+    --max-model-len 32768 \


Similarly, we should remove the AMD-specific VLLM_ROCM_USE_AITER=1 prefix from the launch command that adds the optional parser flags, keeping it generic and avoiding confusion for NVIDIA users.

  vllm serve arcee-ai/Trinity-Large-Thinking \
    --trust-remote-code \
    --tensor-parallel-size 8 \
    --max-model-len 32768 \

gemini-code-assist · 2026-06-29T13:43:27Z

  - `--enable-auto-tool-choice` lets the model decide when to call tools.
  - `--tool-call-parser qwen3_coder` converts tool calls into OpenAI-style `tool_calls`.
-  - `--dtype bfloat16` matches the recommended serving dtype.
+  - `--max-model-len 32768` keeps the KV cache practical for the TP=8 AMD launch.


Since this recipe also runs on NVIDIA hardware (for example, the nvfp4 variant), the 32768-token limit keeps the KV cache practical for any TP=8 launch, not just AMD. We should make this description more general.

- `--max-model-len 32768` keeps the KV cache practical for the TP=8 launch.

Update Trinity Large Thinking ROCm command

ce5490f


gemini-code-assist Bot reviewed Jun 29, 2026


vercel Bot deployed to Preview June 29, 2026 13:49

haic0 wants to merge 1 commit into vllm-project:main from haic0:haic0/update-trinity-large-thinking-rocm-command

2 participants
