Pull requests: vllm-project/vllm
Pull requests list
[XPU] Enable Expert parallel for MoE models
#28263
opened Nov 7, 2025 by
jikunshang
5 tasks
[Misc][Model][Refactor] Pass the prefix into Linear layers
deepseek
Related to DeepSeek models
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#28259
opened Nov 7, 2025 by
MengqingCao
5 tasks
[GPUModelRunner] initialize_kv_cache cleanup (1/N): move initialization that doesn't depend on kv cache config to load_model
v1
#28258
opened Nov 7, 2025 by
heheda12345
5 tasks
Fix issues from #28242
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#28257
opened Nov 7, 2025 by
hmellor
[Bug] Fix missing token_ids for reasoning parser models in chat completions #28246
frontend
#28256
opened Nov 7, 2025 by
baonudesifeizhai
5 tasks
[Log] update shm wait time msg
ready
ONLY add when PR is ready to merge/full CI is needed
#28255
opened Nov 6, 2025 by
BoyuanFeng
[BugFix] Fix DeepGEMM over-allocating workspace
#28254
opened Nov 6, 2025 by
LucasWilkinson
[BugFix] Avoid calling KV connector layer APIs when metadata is unset
kv-connector
#28253
opened Nov 6, 2025 by
sdavidbd
[CI/Build] add src folder to AMD test Dockerfile to fix python_only_compile
ci/build
rocm
Related to AMD ROCm
#28251
opened Nov 6, 2025 by
bradleyhd
5 tasks
[Core] Rework handling of async scheduling config
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#28250
opened Nov 6, 2025 by
njhill
[Perf] Use np.ndarray instead of list[list[int]] to reduce GC overhead
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#28245
opened Nov 6, 2025 by
Jialin
3 of 5 tasks
Add truncate arg to yarn to match openai implementation of gpt-oss
gpt-oss
Related to GPT-OSS models
#28244
opened Nov 6, 2025 by
ashors1
5 tasks
[BugFix][27485] Fix ITL algorithm for chunked OpenAI chat completions
performance
Performance-related issues
#28240
opened Nov 6, 2025 by
manamalani10
[NVIDIA] [feat] Integrate flashinfer Trtllmgen bf16 moe
ci/build
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
[Frontend][responsesAPI] convert responses API tool input to chat completions tool format
frontend
#28231
opened Nov 6, 2025 by
qandrew
[Feature] Default ignore_eos True for random dataset
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#28227
opened Nov 6, 2025 by
yewentao256
[BugFix] [FEAT] Enable fastsafetensors for ROCm platform
ci/build
rocm
Related to AMD ROCm
#28225
opened Nov 6, 2025 by
tjtanaa
5 tasks
Update xgrammar version from 0.1.25 to 0.1.27
ci/build
#28221
opened Nov 6, 2025 by
cjackal
1 task done
Adds Dockerfile arg for VLLM_PRECOMPILED_WHEEL_LOCATION
ci/build
#28217
opened Nov 6, 2025 by
dougbtv
3 tasks done
Fix cu_num_generated_tokens slicing logic in LogprobsLists.slice() method
v1
#28214
opened Nov 6, 2025 by
usberkeley
5 tasks done
Enhance Helm chart installation instructions
documentation
Improvements or additions to documentation
#28211
opened Nov 6, 2025 by
ccnmxns