
Update Laguna XS.2 serve command #596

Open
haic0 wants to merge 1 commit into vllm-project:main from haic0:haic0/update-laguna-xs2-command

@haic0 haic0 commented Jun 29, 2026

Contributor

Summary

  • Adjust the Laguna XS.2 recipe to use the provided launch command (trust-remote-code, TP=8).
  • Align the recipe's launch guidance with the provided vLLM serve command.

Test plan

  • Ran node scripts/build-recipes-api.mjs on the complete validated recipe update set.

Made with Cursor

Adjust Laguna XS.2 to the provided trust-remote-code and TP=8 launch command.

Signed-off-by: haic0 <haichzha@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

vercel Bot commented Jun 29, 2026

Contributor

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vllm-recipes | Ready | Preview, Comment | Jun 29, 2026 1:53pm |


@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the configuration and deployment guide for the Laguna-XS.2 model to use an 8-GPU tensor parallel (TP=8) launch configuration. Feedback on these changes highlights that hardcoding tp: 8 in the strategy overrides will generate invalid launch commands for users selecting hardware profiles with fewer than 8 GPUs, especially since the model can fit on a single high-memory GPU. Additionally, it is recommended to revise the variant description to avoid referencing "the guide" directly, as this metadata is displayed in UI tooltips.

    strategy_overrides:
      single_node_tp:
    -   tp: 1
    +   tp: 8

high

Hardcoding tp: 8 in strategy_overrides forces the tensor parallel size to 8 for all single-node TP deployments in the interactive command builder.

This will generate invalid launch commands (with --tensor-parallel-size 8) when a user selects a hardware profile with fewer than 8 GPUs (such as a single H200, which is the verified hardware for this model).

Since the model's weights (~66GB in BF16) can fit and run on a single high-memory GPU like an H200 (141GB) or H100 (80GB), consider keeping this at tp: 1 (or omitting it if the command builder supports auto-scaling) to allow the command builder to dynamically scale the TP size based on the selected hardware profile.

    tp: 1
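
One way to read the concern above is that the requested TP size should be treated as a preference and capped at the number of GPUs in the selected hardware profile. The sketch below illustrates that policy only. It is not the recipe site's actual command builder: `resolve_tp`, `build_serve_command`, and the `<model-id>` placeholder are hypothetical, and it assumes each hardware profile exposes a GPU count. The only real vLLM pieces are the `vllm serve` command and its `--tensor-parallel-size` and `--trust-remote-code` flags.

```python
from typing import Optional


def resolve_tp(override_tp: Optional[int], gpu_count: int) -> int:
    """Return a tensor-parallel size that never exceeds the available GPUs.

    Hypothetical policy: a recipe override is a preference, capped at the
    profile's GPU count; with no override, all GPUs in the profile are used.
    It does not check that the result divides the model's attention heads.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    if override_tp is None:
        return gpu_count
    return min(override_tp, gpu_count)


def build_serve_command(model_id: str, tp: int, trust_remote_code: bool = True) -> str:
    """Assemble a `vllm serve` command line for the given TP size."""
    parts = ["vllm", "serve", model_id, "--tensor-parallel-size", str(tp)]
    if trust_remote_code:
        parts.append("--trust-remote-code")
    return " ".join(parts)


# "<model-id>" is a placeholder, not the real Laguna XS.2 repository name.
single_gpu = build_serve_command("<model-id>", resolve_tp(8, gpu_count=1))
eight_gpu = build_serve_command("<model-id>", resolve_tp(8, gpu_count=8))
```

Under this policy, a `tp: 8` override on a single-GPU profile produces `--tensor-parallel-size 1` rather than an unlaunchable command, while an 8-GPU profile still gets the requested TP=8.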

      precision: bf16
      vram_minimum_gb: 80
    - description: "BF16 weights — fits on a single 80GB+ GPU (H100/H200/B200)"
    + description: "BF16 weights; the guide shows the requested TP=8 launch command"

medium

Referencing "the guide" inside the variant's description metadata is a bit confusing for users interacting with the command builder UI, as this description is displayed directly in the UI tooltips/pills.

Consider keeping the description focused on the technical details of the variant, or mention the recommended TP size more generally.

    description: "BF16 weights — fits on a single 80GB+ GPU, TP=8 recommended for full context"
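
The figures quoted in this review give a rough sense of why a description can say both "fits on a single 80GB+ GPU" and "TP=8 recommended for full context". The sketch below is back-of-the-envelope arithmetic only. It uses the ~66 GB BF16 weight size and the 80 GB / 141 GB capacities mentioned above, assumes weights are split evenly across tensor-parallel ranks, and ignores activations, runtime overhead, and vLLM's memory-utilization settings. `headroom_per_gpu` is a hypothetical helper, and the 8-GPU row is an illustrative setup rather than a verified configuration.

```python
WEIGHTS_GB = 66.0  # approximate BF16 weight size quoted in the review


def headroom_per_gpu(gpu_mem_gb: float, tp: int) -> float:
    """GB left on each GPU after it holds an even 1/tp share of the weights."""
    return gpu_mem_gb - WEIGHTS_GB / tp


# (label, memory per GPU in GB, tensor-parallel size)
scenarios = [("H100", 80, 1), ("H200", 141, 1), ("8x 80GB", 80, 8)]
for label, mem_gb, tp in scenarios:
    free = headroom_per_gpu(mem_gb, tp)
    print(f"{label}, TP={tp}: ~{free:.2f} GB per GPU left for KV cache and overhead")
```

With these inputs, a single 80 GB GPU keeps roughly 14 GB free after loading the weights, a single 141 GB GPU roughly 75 GB, and each GPU in an 8-way split roughly 71.75 GB. This is consistent with the weights fitting on one GPU while long contexts benefit from more headroom.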

2 participants