Conversation

@grimoire
Collaborator
Qwen3-Next requires kernels from:

https://github.com/Dao-AILab/causal-conv1d
https://github.com/fla-org/flash-linear-attention

We need an environment check for different model-device combinations.
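
A minimal sketch of such a check, assuming the two projects install as the importable packages causal_conv1d and fla; the helper name and error message are illustrative, not an existing lmdeploy API:

import importlib.util

def check_qwen3_next_env() -> None:
    """Fail early if the optional kernel packages are missing."""
    # Module-to-project mapping assumed from the two repositories above.
    required = {
        'causal_conv1d': 'causal-conv1d',
        'fla': 'flash-linear-attention',
    }
    missing = [pkg for mod, pkg in required.items() if importlib.util.find_spec(mod) is None]
    if missing:
        raise RuntimeError('Qwen3-Next requires: ' + ', '.join(missing) +
                           '. Please install them for your device.')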

lvhan028 requested a review from windreamer on October 15, 2025
lvhan028 added the enhancement (New feature or request) label on Oct 15, 2025
@windreamer
Collaborator

windreamer left a comment

Do we need to consider integrating the SSM cache pool into the PD migration request?

@lvhan028
Collaborator

lvhan028 commented Nov 1, 2025

OpenCompass evaluation failed:

2025-11-01 22:35:12,565 - lmdeploy - ERROR - engine.py:1234 - exception happened: <class 'IndexError'> index 1032 is out of bounds for axis 0 with size 1032
Traceback (most recent call last):
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 1229, in async_loop
    await self._async_loop_main(resp_que=resp_que,
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 1128, in _async_loop_main
    forward_inputs, next_running = await inputs_maker.prefetch_next_inputs()
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 320, in prefetch_next_inputs
    return await self._send_next_inputs_impl(prefill, True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 285, in _send_next_inputs_impl
    forward_inputs = self._make_forward_inputs(prefill, enable_empty)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 227, in _make_forward_inputs
    return self.engine._make_forward_inputs(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 897, in _make_forward_inputs
    scheduler_output = scheduler.schedule(is_prefill=prefill, prealloc_size=prealloc_size)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 299, in schedule
    output = self._schedule_prefill(0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/utils.py", line 271, in __func_warpper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 232, in _schedule_prefill
    if not __evict_for_seq(seq, waiting):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 213, in __evict_for_seq
    return eviction_helper.evict_for_seq(seq, evictable, prealloc_size)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/eviction_helper/recompute_eviction_helper.py", line 74, in _evict_for_ssm
    state_manager.free(evict_seq)
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/state_manager.py", line 55, in free
    self.allocator.free(seq.logical_state)
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/state_manager.py", line 29, in free
    self._free_states[num_used] = state_id
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^
IndexError: index 1032 is out of bounds for axis 0 with size 1032

Serving command:

 lmdeploy serve api_server Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4
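
One plausible way such an IndexError can arise in a free-list state allocator is a double free: returning a state that is already on the free list pushes the write index past the end of the fixed-size array. The sketch below is hypothetical and only mirrors the names in the traceback, not lmdeploy's actual state_manager:

import numpy as np

class StateAllocator:
    """Illustrative fixed-size free-list allocator, not the real implementation."""

    def __init__(self, num_states: int):
        # Every state slot starts out free.
        self._free_states = np.arange(num_states, dtype=np.int64)
        self._num_free = num_states

    def allocate(self) -> int:
        self._num_free -= 1
        return int(self._free_states[self._num_free])

    def free(self, state_id: int) -> None:
        # Freeing a state twice advances the counter past the array size,
        # so this write raises "index N is out of bounds for axis 0 with size N".
        self._free_states[self._num_free] = state_id
        self._num_free += 1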

lvhan028 self-requested a review on November 1, 2025
@grimoire
Collaborator Author

grimoire commented Nov 2, 2025

Fixed

@lvhan028
Collaborator

lvhan028 commented Nov 3, 2025

The OpenCompass evaluation failed because quite a few prompts from the aime2025 dataset got stuck in repetition.

@grimoire
Collaborator Author

grimoire commented Nov 3, 2025

Fixed
