vllm-project / vllm Public

Pull requests: vllm-project/vllm

1,178 Open 13,909 Closed

Pull requests list

Fix incorrect string formatting in barrier timeout exceptions

#27149 opened Oct 18, 2025 by hyongtao-code

4 tasks

[Bugfix][Core] Fix xgrammar import failure on unsupported platforms

Labels: structured-output, v1

#27148 opened Oct 18, 2025 by ihb2032

[MM Encoder]: Refactor mm encoder attention interface and support attention mask

#27147 opened Oct 18, 2025 by Isotr0py • Draft

1 of 5 tasks

[torch.compile] Enable silu_mul_fp8_quant fusion without custom ops enabled

#27146 opened Oct 18, 2025 by ZJY0516

5 tasks

[Model][3/N] Improve all pooling task | Support chunked prefill with ALL pooling

Labels: frontend, v1

#27145 opened Oct 18, 2025 by noooop

5 tasks

[Bugfix] fixes the decoding metadata of dense mla's fp8 kvcache.

Labels: ci/build, v1

#27144 opened Oct 18, 2025 by sighingnow

[Core] Remove V0 executors

Labels: frontend, tpu

#27142 opened Oct 18, 2025 by njhill • Draft

[Core] CuteDSL MoE with Nvfp4 DeepEP dispatch

Labels: ci/build, v1

#27141 opened Oct 18, 2025 by wenscarl • Draft

5 tasks

[NIXL] use Host buffer to support TP_ratio > 1 for XPU kv-connector

#27140 opened Oct 18, 2025 by xuechendi

5 tasks

[BugFix] fix graph partition signature

#27139 opened Oct 18, 2025 by BoyuanFeng

Use pydantic validation in speculative.py config

#27137 opened Oct 18, 2025 by Navya1707

[Fix][Spec Decode] Fix llama4 draft loading with different quantization

Labels: llama, speculative-decoding

#27136 opened Oct 18, 2025 by linzebing

3 of 5 tasks

feat: enable FlashInfer FP8 Blockscale on SM90

#27134 opened Oct 18, 2025 by djmmoss • Draft

1 of 3 tasks

[Bugfix] Fix incorrect kv cache metrics in grafana.json

Labels: documentation

#27133 opened Oct 17, 2025 by fangpings

5 tasks

Early exit for MoE LoRA kernels

Labels: ci/build, deepseek, gpt-oss, needs-rebase, qwen

#27131 opened Oct 17, 2025 by gnovack • Draft

5 tasks

[Minor] Add some clarifying comments to recent changes

Labels: ready

#27130 opened Oct 17, 2025 by njhill

[BugFix] bugfix for Flash Attention MLA with full cuda graph IMA following pr-25490

Labels: ready

#27128 opened Oct 17, 2025 by Daisy-Ma-coder

[compile] Enable sequence parallelism matching w/o custom ops enabled

#27126 opened Oct 17, 2025 by angelayi

make flash_attn ViT upgrade opt-in

Labels: ci/build, ci-failure, qwen, rocm

#27124 opened Oct 17, 2025 by bradleyhd

[Kernels] Swap quant method

Labels: needs-rebase

#27123 opened Oct 17, 2025 by bnellnm

[BugFix] Disable fp8 kv-cache by default for DeepSeek V3.2

Labels: deepseek, ready

#27121 opened Oct 17, 2025 by LucasWilkinson

Milestone: v0.11.1

[Bugfix] Fix allocation & free logic of SingleWriterShmRingBuffer

#27117 opened Oct 17, 2025 by imkero

5 tasks

[CI/Build]Add eval config for Qwen3-235B-A22B-Thinking-2507-FP8 and Qwen3-8B

Labels: ci/build, qwen

#27113 opened Oct 17, 2025 by hl475 • Draft

5 tasks

[BugFix] Fix failing gemma-3-1b-it test: test_lm_eval_accuracy_v1_engine[google/gemma-3-1b-it]

Labels: ci/build

#27111 opened Oct 17, 2025 by LucasWilkinson

Add missing opentelemetry dependency to base docker image

Labels: ci/build

#27109 opened Oct 17, 2025 by Aymendje

3 of 5 tasks
