RobMulla

This PR incorporates several documentation improvements based on the excellent feedback and findings from the recent vLLM on TPU bug bash. The goal is to make the recipes clearer and more user-friendly, especially for those new to the platform.

Key changes include:

  • Fixes Qwen2.5-32B Recipe: Updated the recipe to use the correct vllm bench serve benchmark command and fixed the step numbering and naming for clarity.
  • Adds TPU Sizing Guide: Added a new, more detailed "Choosing the Right TPU Configuration" table to the main vLLM/README.md. This version includes an explicit legend (✅, ⚠️, ❌) and more nuanced notes on topology and precision.
  • Clarifies docker exec Usage: Added a note to the recipes explaining that using docker exec is the intended path for installing benchmark dependencies and running tests inside the container.
  • Standardizes Log Examples: Updated the example Uvicorn log output in the recipes to match the (APIServer pid=...) format that users will see.

- Update Qwen2.5-32B recipe with correct benchmark commands and step numbering.
- Add a TPU/model sizing guide to the main vLLM README.
- Clarify the purpose of 'docker exec' in all recipes.
- Standardize the example log output format.

google-cla bot commented Oct 10, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@RobMulla RobMulla marked this pull request as draft October 13, 2025 19:14
@RobMulla RobMulla marked this pull request as ready for review October 13, 2025 19:14