[Bugfix] Fix allocation & free logic of SingleWriterShmRingBuffer #27117
/gemini-review
Code Review
This pull request provides a much-needed fix for the allocation and freeing logic in SingleWriterShmRingBuffer. The previous implementation had several bugs related to pointer management and wrap-around behavior, which could lead to memory over-allocation and incorrect state. The new implementation is a complete rewrite of the core logic, introducing a clearer and more robust state management for the ring buffer. Key improvements include treating start == end as an empty buffer, reserving one byte to distinguish empty from full, and resetting pointers to zero when the buffer is empty to maximize contiguous allocation. The allocation logic is now split into two clear cases (wrapped vs. non-wrapped buffer), which is much easier to reason about. The fix to the free_buf logic, which now correctly advances the start pointer, is also crucial. I have thoroughly reviewed the new logic and it appears to be sound and correct, addressing the described bugs effectively. I did not find any high or critical severity issues with the proposed changes.
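The rules listed in the review (start == end means empty, one byte held in reserve so that a full buffer is never mistaken for an empty one, pointers reset to zero once the buffer drains, and separate non-wrapped and wrapped allocation cases) can be sketched as a small pure-Python allocator. This is only an illustration under stated assumptions, not the code from this PR: the class name `RingAllocatorSketch`, the FIFO `free_oldest` method, and the `blocks` deque are hypothetical, and the real SingleWriterShmRingBuffer works on shared memory with per-block metadata.

```python
from collections import deque


class RingAllocatorSketch:
    """Hypothetical contiguous ring allocator; NOT the vLLM implementation."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.start = 0  # address of the oldest live block
        self.end = 0  # address just past the newest live block
        self.blocks = deque()  # (addr, size) pairs in allocation order

    def is_empty(self) -> bool:
        # start == end is reserved to mean "empty", never "full".
        return self.start == self.end

    def allocate(self, size: int) -> int:
        if size <= 0:
            raise ValueError("size must be positive")
        if self.start <= self.end:
            # Non-wrapped case: live data occupies [start, end).
            if self.end + size <= self.capacity:
                addr = self.end  # fits in the tail
            elif size < self.start:
                # Wrap to the front. The strict '<' leaves one byte unused
                # so that end can never catch up with start.
                addr = 0
            else:
                raise MemoryError("not enough contiguous space")
        else:
            # Wrapped case: the only free region is [end, start).
            if self.end + size < self.start:
                addr = self.end
            else:
                raise MemoryError("not enough contiguous space")
        self.end = addr + size
        self.blocks.append((addr, size))
        return addr

    def free_oldest(self) -> None:
        # Assumption: blocks are released in allocation (FIFO) order.
        # Raises IndexError if nothing is allocated.
        self.blocks.popleft()
        if self.blocks:
            # Advance start to the next live block, skipping any tail gap.
            self.start = self.blocks[0][0]
        else:
            # Empty buffer: reset both pointers to maximize contiguous space.
            self.start = self.end = 0
```

With a capacity of 10, allocating 4 and 4 bytes, freeing the first block, and then allocating 3 bytes places the third block at address 0 (the wrapped case). A further 1-byte request is refused because the single remaining byte is the reserved one.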
Acknowledged. Initiating a comprehensive code review for this pull request.
@codex review
Codex Review: Didn't find any major issues. More of your lovely PRs please.
would be nice if you can help adding your test case to
Co-authored-by: donglu <[email protected]> Signed-off-by: Kero Liang <[email protected]>
Signed-off-by: Kero Liang <[email protected]>
Signed-off-by: Kero Liang <[email protected]>
Yeah, I put the test case into
Signed-off-by: Roger Wang <[email protected]>
I fixed an attribute error, since np.bool has been deprecated, but otherwise LGTM!
…lm-project#27117)
Signed-off-by: Kero Liang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Co-authored-by: donglu <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Signed-off-by: Bhagyashri <[email protected]>
Purpose
Fix the buggy free logic of SingleWriterShmRingBuffer; it may break the multimodal shm cache because of over-allocation.
Details:
- data_buffer_start computation in SingleWriterShmRingBuffer.free_buf
- SingleWriterShmRingBuffer.metadata should be empty on init
Test Plan
Test case added in tests.distributed.test_shm_buffer
Test Result
Test Result (main): FAILED
python -m unittest tests.distributed.test_shm_buffer.TestSingleWriterShmRingBuffer.test_allocation_cycles
F
======================================================================
FAIL: test_allocation_cycles (tests.distributed.test_shm_buffer.TestSingleWriterShmRingBuffer.test_allocation_cycles)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/workspaces/vllm/tests/distributed/test_shm_buffer.py", line 172, in test_allocation_cycles
    ring_allocate(2)
  File "/workspaces/vllm/tests/distributed/test_shm_buffer.py", line 160, in ring_allocate
    mark_allocated_with_assertion(monotonic_id, addr, allocate_size_with_md)
  File "/workspaces/vllm/tests/distributed/test_shm_buffer.py", line 135, in mark_allocated_with_assertion
    self.assertEqual(count_allocated(allocated_bitmap[addr : addr + size]), 0)
AssertionError: 10 != 0
Ran 1 test in 0.002s
FAILED (failures=1)
Test Result (this PR): OK
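The failing assertion in the traceback checks that a newly returned region does not overlap bytes that are still allocated; `10 != 0` means ten bytes of the new block were already in use. A minimal sketch of that kind of overlap check follows. It is a guess at the idea, not the test code from this PR: the `OverlapTracker` class, its method names, and the `live` dictionary are hypothetical, and only the name `allocated_bitmap` is borrowed from the traceback.

```python
class OverlapTracker:
    """Hypothetical test helper that detects overlapping allocations."""

    def __init__(self, capacity: int) -> None:
        # One entry per byte of the buffer: 1 = in use, 0 = free.
        self.allocated_bitmap = bytearray(capacity)
        self.live = {}  # block id -> (addr, size)

    def mark_allocated(self, block_id: int, addr: int, size: int) -> None:
        region = self.allocated_bitmap[addr : addr + size]
        if len(region) != size:
            raise AssertionError("block extends past the end of the buffer")
        in_use = sum(region)  # number of bytes already allocated
        if in_use:
            raise AssertionError(f"{in_use} != 0: block overlaps live data")
        self.allocated_bitmap[addr : addr + size] = b"\x01" * size
        self.live[block_id] = (addr, size)

    def mark_freed(self, block_id: int) -> None:
        addr, size = self.live.pop(block_id)
        # bytes(size) is a run of zero bytes, which clears the region.
        self.allocated_bitmap[addr : addr + size] = bytes(size)
```

Feeding every address returned by the allocator through `mark_allocated`, and every release through `mark_freed`, turns an over-allocation into an immediate assertion failure instead of silent data corruption.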