Update Laguna XS.2 serve command#596
Conversation
Adjust Laguna XS.2 to the provided trust-remote-code and TP=8 launch command. Signed-off-by: haic0 <haichzha@amd.com> Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Code Review
This pull request updates the configuration and deployment guide for the Laguna-XS.2 model to use an 8-GPU tensor parallel (TP=8) launch configuration. Feedback on these changes highlights that hardcoding tp: 8 in the strategy overrides will generate invalid launch commands for users selecting hardware profiles with fewer than 8 GPUs, especially since the model can fit on a single high-memory GPU. Additionally, it is recommended to revise the variant description to avoid referencing "the guide" directly, as this metadata is displayed in UI tooltips.
| strategy_overrides: | ||
| single_node_tp: | ||
| tp: 1 | ||
| tp: 8 |
There was a problem hiding this comment.
Hardcoding tp: 8 in strategy_overrides forces the tensor parallel size to 8 for all single-node TP deployments in the interactive command builder.
This will generate invalid launch commands (with --tensor-parallel-size 8) when a user selects a hardware profile with fewer than 8 GPUs (such as a single H200, which is the verified hardware for this model).
Since the model's weights (~66GB in BF16) can fit and run on a single high-memory GPU like an H200 (141GB) or H100 (80GB), consider keeping this at tp: 1 (or omitting it if the command builder supports auto-scaling) to allow the command builder to dynamically scale the TP size based on the selected hardware profile.
tp: 1| precision: bf16 | ||
| vram_minimum_gb: 80 | ||
| description: "BF16 weights — fits on a single 80GB+ GPU (H100/H200/B200)" | ||
| description: "BF16 weights; the guide shows the requested TP=8 launch command" |
There was a problem hiding this comment.
Referencing "the guide" inside the variant's description metadata is a bit confusing for users interacting with the command builder UI, as this description is displayed directly in the UI tooltips/pills.
Consider keeping the description focused on the technical details of the variant, or mention the recommended TP size more generally.
description: "BF16 weights — fits on a single 80GB+ GPU, TP=8 recommended for full context"
Summary
Test plan
node scripts/build-recipes-api.mjson the complete validated recipe update set.Made with Cursor