
feat(vllm): add local vLLM OpenAI-server backend and GPU tests #235

Merged
prabhuteja12 merged 2 commits into main from vllm-modularization-migration on May 11, 2026

Conversation

@martinreinhardt01 (Collaborator) commented May 7, 2026

Add local vLLM OpenAI-server backend and GPU tests

Introduce VLLMLocalServerOpenAIModel, which launches vllm serve locally and routes eval requests through the existing OpenAIModel(base_url=...) HTTP client, giving a recordable interface boundary. Add session-scoped GPU tests covering batching, stop sequences, and parameter overrides, and document the new CLI usage.
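
For reviewers unfamiliar with the pattern, here is a minimal sketch of the mechanism. The class name and the OpenAIModel(base_url=...) routing idea come from this PR; the constructor signature, health-check loop, and the generate()/shutdown() methods below are illustrative assumptions, not the actual code in src/eval_framework/llm/vllm_local_server.py:

```python
import subprocess
import time

import requests
from openai import OpenAI


class VLLMLocalServerOpenAIModel:
    """Sketch: launch `vllm serve` locally, then talk to it over plain HTTP."""

    def __init__(self, model: str, port: int = 8000, startup_timeout: float = 300.0):
        self._model = model
        # `vllm serve <model>` starts vLLM's OpenAI-compatible HTTP server.
        self._proc = subprocess.Popen(["vllm", "serve", model, "--port", str(port)])
        # Poll the server's /health endpoint until it is ready to accept requests.
        deadline = time.monotonic() + startup_timeout
        while time.monotonic() < deadline:
            try:
                if requests.get(f"http://localhost:{port}/health", timeout=2).ok:
                    break
            except requests.ConnectionError:
                pass
            time.sleep(2)
        else:
            self._proc.terminate()
            raise RuntimeError("vLLM server did not become healthy in time")
        # The plain OpenAI client is the recordable boundary: every eval request
        # is an ordinary HTTP call that can be captured and replayed in tests.
        self._client = OpenAI(base_url=f"http://localhost:{port}/v1", api_key="EMPTY")

    def generate(self, prompt: str, stop: list[str] | None = None, **params) -> str:
        resp = self._client.completions.create(
            model=self._model, prompt=prompt, stop=stop, **params
        )
        return resp.choices[0].text

    def shutdown(self) -> None:
        self._proc.terminate()
        self._proc.wait()
```

The design point is that everything past startup is plain HTTP against an OpenAI-compatible endpoint, which is what makes the interface boundary recordable.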

PR Checklist

  • Use descriptive commit messages.
  • Provide tests for your changes.
  • Update any related documentation and include any relevant screenshots.
  • Check if changes need to be made to docs (README or any guides in /docs/).

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note
on the hardware and config this has been tested on, as well as any relevant
additional information.

Added/updated tests?

  • Yes
  • No, and this is why: please replace this line with details on why tests
    have not been included
  • I need help with writing tests

[optional] Are there any post-deployment tasks we need to perform?

@martinreinhardt01 force-pushed the vllm-modularization-migration branch 6 times, most recently from a06bd98 to 43f55e2, on May 8, 2026 09:57
Review comment thread on src/eval_framework/llm/vllm_local_server.py (outdated)
@martinreinhardt01 force-pushed the vllm-modularization-migration branch from 43f55e2 to e6061b9 on May 11, 2026 13:13
Introduce VLLMLocalServerOpenAIModel that launches vllm serve locally and routes eval requests through the existing OpenAIModel(base_url=...) HTTP client to enable a recordable interface boundary. Add session-scoped GPU tests covering batching/stop sequences/parameter overrides and document the new CLI usage.
Make token counting optional: if tiktoken can’t map the model name, skip local tokenization and rely on API usage (or return None for sequence-position metadata). Avoid misleading fallback encodings, skip concat-compression when no encoder is available, and fail logprobs() explicitly without a local encoder.
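
A minimal sketch of that token-counting policy, assuming hypothetical helper names get_encoder and count_tokens; the behavior itself (return None rather than a misleading fallback encoding when tiktoken cannot map the model name, and rely on API-reported usage instead) follows the commit message above:

```python
import tiktoken


def get_encoder(model_name: str) -> tiktoken.Encoding | None:
    """Return a tiktoken encoder for model_name, or None if tiktoken can't map it.

    Returning None instead of falling back to an arbitrary encoding (e.g.
    cl100k_base) avoids misleading token counts for unknown local models.
    """
    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        return None


def count_tokens(text: str, model_name: str) -> int | None:
    enc = get_encoder(model_name)
    if enc is None:
        # No local encoder: callers should rely on the API's reported `usage`
        # (or propagate None for sequence-position metadata).
        return None
    return len(enc.encode(text))
```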
@prabhuteja12 force-pushed the vllm-modularization-migration branch from e6061b9 to 80ab678 on May 11, 2026 13:15
@prabhuteja12 enabled auto-merge (squash) on May 11, 2026 13:15
@prabhuteja12 merged commit f119c11 into main on May 11, 2026
21 checks passed
@prabhuteja12 deleted the vllm-modularization-migration branch on May 11, 2026 13:25