@jiahanc (Contributor) commented Nov 6, 2025

Purpose

Integrate the FlashInfer trtllm-gen BF16 MoE into supported models.
Blocked until a new FlashInfer release is available.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify bot commented Nov 6, 2025

Documentation preview: https://vllm--28238.org.readthedocs.build/en/28238/

@mergify mergify bot added the documentation, ci/build, and qwen labels Nov 6, 2025
@jiahanc jiahanc force-pushed the trtllmgen-bf16-moe branch 3 times, most recently from 951f1a2 to 9f7ea6c November 7, 2025 19:04
@jiahanc jiahanc changed the title from "[NVIDIA] [feat] Integrate flashinfer Trtllmgen bf16 moe" to "[NVIDIA] [feat] Integrate flashinfer Trtllmgen bf16 moe and refactor trtllm-gen moe launcher" Nov 7, 2025
@jiahanc jiahanc changed the title from "[NVIDIA] [feat] Integrate flashinfer Trtllmgen bf16 moe and refactor trtllm-gen moe launcher" to "[NVIDIA] [feat] Integrate flashinfer Trtllmgen bf16 moe" Nov 7, 2025

mergify bot commented Nov 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @jiahanc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 8, 2025
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>

mergify bot commented Nov 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @jiahanc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 11, 2025

Labels

ci/build, documentation, needs-rebase, nvidia, qwen
