Pinned

  1. vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 64.3k stars · 11.7k forks

  2. llm-compressor (Public)

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python · 2.3k stars · 296 forks

  3. recipes (Public)

    Common recipes to run vLLM

    Jupyter Notebook · 249 stars · 94 forks

Repositories

Showing 10 of 27 repositories
  • vllm-ascend Public

    Community maintained hardware plugin for vLLM on Ascend

    Python · 1,399 stars · Apache-2.0 · 618 forks · 772 issues (8 need help) · 262 PRs · Updated Dec 1, 2025
  • vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 64,345 stars · Apache-2.0 · 11,674 forks · 1,904 issues (34 need help) · 1,258 PRs · Updated Dec 1, 2025
  • speculators Public

    A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

    Python · 135 stars · Apache-2.0 · 16 forks · 9 issues (6 need help) · 17 PRs · Updated Dec 1, 2025
  • semantic-router Public

    Intelligent Router for Mixture-of-Models

    Rust · 2,346 stars · Apache-2.0 · 301 forks · 109 issues (25 need help) · 38 PRs · Updated Dec 1, 2025
  • vllm-spyre Public

    Community maintained hardware plugin for vLLM on Spyre

    Python · 37 stars · Apache-2.0 · 29 forks · 4 issues · 18 PRs · Updated Dec 1, 2025
  • vllm-gaudi Public

    Community maintained hardware plugin for vLLM on Intel Gaudi

    Python · 18 stars · Apache-2.0 · 73 forks · 1 issue · 70 PRs · Updated Dec 1, 2025
  • vllm-omni Public

    A high-throughput and memory-efficient inference and serving engine for Omni-modality models

    Python · 109 stars · Apache-2.0 · 21 forks · 18 issues (1 needs help) · 13 PRs · Updated Dec 1, 2025
  • ci-infra Public

    This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

    HCL · 27 stars · Apache-2.0 · 46 forks · 0 issues · 24 PRs · Updated Dec 1, 2025
  • vllm-project.github.io Public

    JavaScript · 24 stars · 45 forks · 0 issues · 2 PRs · Updated Dec 1, 2025
  • vllm-xpu-kernels Public

    The vLLM XPU kernels for Intel GPUs

    C++ · 11 stars · Apache-2.0 · 14 forks · 0 issues · 9 PRs · Updated Dec 1, 2025