Advanced request handling optimizations #1009

iyastreb · 2025-11-10T14:29:08Z

What?

This PR continues the request handling optimization effort started in #982.
It makes two changes:

Release all pending requests except one right after posting.
Remove the nixlUcxIntReq class and move connection management to nixlUcxBackendH.
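
The first change can be pictured with a small toy model. This is only a sketch under stated assumptions, not the code in this PR: `MockRequest`, `EpBatch`, and `postBatch` are hypothetical names, no UCX or NIXL API is called, and it assumes the request kept per endpoint batch is the most recently posted one (the description does not say which request is retained).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a transport request; not a NIXL or UCX type.
struct MockRequest {
    bool released = false;
};

// Hypothetical per-endpoint batch tracker that keeps one pending request.
class EpBatch {
public:
    // Record a freshly posted request. The previously retained request is
    // released immediately, so at most one request stays pending.
    void onPosted(MockRequest *req) {
        assert(req != nullptr);
        if (pending_ != nullptr) {
            pending_->released = true; // a real backend would free it here
            ++released_;
        }
        pending_ = req; // assumption: the latest request is the one kept
    }

    std::size_t pendingCount() const { return pending_ != nullptr ? 1 : 0; }
    std::size_t releasedCount() const { return released_; }
    const MockRequest *pending() const { return pending_; }

private:
    MockRequest *pending_ = nullptr;
    std::size_t released_ = 0;
};

// Post every mock request through one batch and return the tracker.
inline EpBatch postBatch(std::vector<MockRequest> &reqs) {
    EpBatch batch;
    for (auto &r : reqs) {
        batch.onPosted(&r);
    }
    return batch;
}
```

With five mock requests, `postBatch` leaves one pending and four released, so the bookkeeping that remains after posting is constant per batch rather than proportional to the batch size.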

Performance results

In nixlbench, the average post time for an RDMA batch of 64k messages of 512 B drops by roughly 10x (from 8764.7 us with PR1 to 895.1 us with this PR).
PR1: #982
PR2: this PR

NIXLBENCH
nixlbench --initiator_seg_type=VRAM --target_seg_type=VRAM --start_block_size=512 --max_block_size=512 --start_batch_size=64000 --max_batch_size=64000 --warmup_iter=10 --num_iter=100 --progress_threads=8 &
 
# Num_threads=8 512:64k cuda_ipc
Branch  Block Size (B)      Batch Size     B/W (GB/Sec)   Avg Lat. (us)  Avg Prep (us)  P99 Prep (us)  Avg Post (us)
main    512                 65000          0.122929       4.2            6022.0         6022.0         121784.0
PR1     512                 65000          0.135798       3.8            6433.0         6433.0         103406.5
PR2     512                 64000          0.124752       4.1            8171.0         8171.0         114922.7
 
# Num_threads=8 512:64k rdma
Branch  Block Size (B)      Batch Size     B/W (GB/Sec)   Avg Lat. (us)  Avg Prep (us)  P99 Prep (us)  Avg Post (us)
main    512                 65000          1.932672       0.3            5540.0         5540.0         13065.2
PR1     512                 65000          2.787826       0.2            5583.0         5583.0         8764.7
PR2     512                 64000          6.710469       0.1            5871.0         5871.0         895.1
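
The "10x" figure can be checked directly against the Avg Post column of the rdma table. The helper below is a hypothetical illustration and not part of nixlbench. Note that the table lists a batch size of 65000 for main and PR1 and 64000 for PR2, so the ratios are approximate.

```cpp
#include <cassert>
#include <cmath>

// Ratio of two average post times given in the same unit.
inline double speedup(double before_us, double after_us) {
    assert(after_us > 0.0);
    return before_us / after_us;
}

// Values copied from the "Avg Post (us)" column of the rdma table.
constexpr double kMainPostUs = 13065.2;
constexpr double kPr1PostUs = 8764.7;
constexpr double kPr2PostUs = 895.1;
```

`speedup(kPr1PostUs, kPr2PostUs)` comes out at about 9.8 and `speedup(kMainPostUs, kPr2PostUs)` at about 14.6, which matches the rounded claim in the description.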

SGLANG TTFT
Size  MC     main   PR1    PR2 
1     28     25     22     20
2     47     54     45     45
4     125    129    115    111 
8     243    378    346    332
16    442    699    647    522
32    831    1433   1001   904 
64    1364   2032   1853   1804
128   2583   3869   3611   3060 
256   5232   6618   6083   5501
512   10469  12805  12440  10170
1024  22521  24990  22800  20392

github-actions · 2025-11-10T14:29:17Z

👋 Hi iyastreb! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

brminich · 2025-11-18T09:23:25Z

/build

iyastreb · 2025-11-18T12:35:06Z

/build

iyastreb · 2025-11-18T14:47:33Z

/build

iyastreb added 3 commits November 10, 2025 12:54

Move connection to handle

9519c2d

Keep only 1 pending request per EP batch

abdb803

Disable lock for dedicated workers

883ca96

pull-request-size bot added the size/L label Nov 10, 2025

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 14:29 Inactive

copy-pr-bot bot had a problem deploying to SWX_AWS November 10, 2025 14:29 Failure

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 14:29 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 10, 2025 14:29 Inactive

github-actions bot added the external-contribution label Nov 10, 2025

copy-pr-bot bot temporarily deployed to GITLAB November 10, 2025 14:29 Inactive

Formatting

363f98e

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 15:30 Inactive

copy-pr-bot bot had a problem deploying to SWX_AWS November 10, 2025 15:30 Failure

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 15:30 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 10, 2025 15:30 Inactive

Fixed NPE

a5d1207

copy-pr-bot bot temporarily deployed to GITLAB November 10, 2025 15:32 Inactive

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 15:32 Inactive

copy-pr-bot bot had a problem deploying to SWX_AWS November 10, 2025 15:32 Failure

copy-pr-bot bot temporarily deployed to SWX_AWS November 10, 2025 15:32 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 10, 2025 15:35 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 11, 2025 13:09 Inactive

copy-pr-bot bot temporarily deployed to SWX_AWS November 11, 2025 13:09 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 09:23 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 09:24 Inactive

brminich approved these changes Nov 18, 2025

View reviewed changes

brminich enabled auto-merge (squash) November 18, 2025 09:24

Merge branch 'main' into ucx-drop-requests-optimization

dd36a31

copy-pr-bot bot temporarily deployed to SWX_AWS November 18, 2025 10:04 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 10:04 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 10:05 Inactive

Merge branch 'main' into ucx-drop-requests-optimization

7e16453

copy-pr-bot bot temporarily deployed to SWX_AWS November 18, 2025 10:14 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 10:14 Inactive

copy-pr-bot bot temporarily deployed to SWX_AWS November 18, 2025 10:14 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 10:17 Inactive

Merge branch 'main' into ucx-drop-requests-optimization

a8af5fb

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 12:10 Inactive

copy-pr-bot bot temporarily deployed to SWX_AWS November 18, 2025 12:10 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 18, 2025 12:11 Inactive

brminich merged commit d33d8a9 into ai-dynamo:main Nov 18, 2025
21 checks passed

iyastreb deleted the ucx-drop-requests-optimization branch November 18, 2025 16:14
