Docs: Update recipes for vLLM trillium with feedback. #96
This PR incorporates several documentation improvements based on the excellent feedback and findings from the recent vLLM on TPU bug bash. The goal is to make the recipes clearer and more user-friendly, especially for those new to the platform.
Key changes include:
Qwen2.5-32B
Recipe: Updated the recipe to use the correct vllm bench serve benchmark command and fixed the step numbering and naming for clarity.
docker exec
Usage: Added a note to the recipes explaining that using docker exec is the intended path for installing benchmark dependencies and running tests inside the container.
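As a rough illustration of that docker exec workflow, the sketch below builds a vllm bench serve command and wraps it in docker exec. It is not copied from the recipes: the container name vllm-tpu, the Qwen/Qwen2.5-32B model ID, and the flag values are assumptions chosen for the example. By default it only prints the command; set RUN=1 to execute it against a running container.

```shell
#!/usr/bin/env bash
# Assumed values for illustration only; use the names from your recipe.
CONTAINER_NAME="${CONTAINER_NAME:-vllm-tpu}"   # hypothetical container name
MODEL="${MODEL:-Qwen/Qwen2.5-32B}"             # assumed model ID

# Build the benchmark command to run inside the container.
# The lengths and prompt count are placeholder values, not recipe settings.
bench_cmd() {
  echo "vllm bench serve --model ${MODEL} --dataset-name random --random-input-len 1024 --random-output-len 128 --num-prompts 100"
}

# Run a command inside the container via docker exec.
# Dry run by default: prints the invocation unless RUN=1 is set.
run_in_container() {
  if [ "${RUN:-0}" = "1" ]; then
    docker exec "${CONTAINER_NAME}" bash -c "$1"
  else
    echo "docker exec ${CONTAINER_NAME} bash -c '$1'"
  fi
}

run_in_container "$(bench_cmd)"
```

Keeping the dry run as the default makes it easy to check the exact command before it touches the container.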