[WIP] Sudhu/megatron ifu core 0.16.0 sync #119

Draft
sudhu2k wants to merge 695 commits into rocm_dev from sudhu/Megatron-IFU-core_0.16.0_sync

Conversation

@sudhu2k (Collaborator) commented Mar 3, 2026

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Phlip79 and others added 30 commits January 6, 2026 21:51
…ice correctly for CUDA context creation / hypothetical memory allocations (NVIDIA#2710)

Co-authored-by: Deepak Narayanan <dnarayanan@nvidia.com>
…NVIDIA#2723)

Signed-off-by: John St John <jstjohn@nvidia.com>
Signed-off-by: John St. John <jstjohn@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Robin Zhang <robinz@nvidia.com>
Co-authored-by: root <root@gpu-h100-0348.cm.cluster>
Co-authored-by: root <root@gpu-h100-0193.cm.cluster>
Co-authored-by: root <root@gpu-h100-0082.cm.cluster>
Co-authored-by: root <root@gpu-h100-0495.cm.cluster>
Co-authored-by: William Dykas <wdykas@cw-pdx-cs-001-vscode-02.cm.cluster>
Co-authored-by: root <root@gpu-h100-0213.cm.cluster>
Co-authored-by: root <root@gpu-h100-0435.cm.cluster>
Co-authored-by: root <root@gpu-h100-0188.cm.cluster>
Co-authored-by: root <root@gpu-h100-0032.cm.cluster>
Co-authored-by: root <root@gpu-h100-0023.cm.cluster>
Co-authored-by: root <root@gpu-h100-0368.cm.cluster>
Co-authored-by: root <root@gpu-h100-0203.cm.cluster>
Co-authored-by: root <root@gpu-h100-0229.cm.cluster>
Co-authored-by: root <root@gpu-h100-0123.cm.cluster>
Co-authored-by: root <root@gpu-h100-0217.cm.cluster>
Co-authored-by: root <root@gpu-h100-0496.cm.cluster>
Co-authored-by: root <root@gpu-h100-0022.cm.cluster>
Co-authored-by: root <root@gpu-h100-0176.cm.cluster>
Co-authored-by: root <root@gpu-h100-0190.cm.cluster>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Robin Zhang <robinz@nvidia.com>
Signed-off-by: kunlunl <kunlunl@nvidia.com>
Signed-off-by: jianbinc <shjwudp@gmail.com>
Co-authored-by: jianbinc <shjwudp@gmail.com>
Co-authored-by: Cory Ye <44509866+cspades@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Phlip79 and others added 30 commits January 29, 2026 21:33
…g `--decoder-first-pipeline-num-layers` & `--decoder-last-pipeline-num-layers` (NVIDIA#2947)
Signed-off-by: Maanu Grover <maanug@nvidia.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: root <root@gpu-h100-0348.cm.cluster>
Co-authored-by: root <root@gpu-h100-0193.cm.cluster>
Co-authored-by: root <root@gpu-h100-0082.cm.cluster>
Co-authored-by: root <root@gpu-h100-0495.cm.cluster>
Co-authored-by: William Dykas <wdykas@cw-pdx-cs-001-vscode-02.cm.cluster>
Co-authored-by: root <root@gpu-h100-0213.cm.cluster>
Co-authored-by: root <root@gpu-h100-0435.cm.cluster>
Co-authored-by: root <root@gpu-h100-0188.cm.cluster>
Co-authored-by: root <root@gpu-h100-0032.cm.cluster>
Co-authored-by: root <root@gpu-h100-0023.cm.cluster>
Co-authored-by: root <root@gpu-h100-0368.cm.cluster>
Co-authored-by: root <root@gpu-h100-0203.cm.cluster>
Co-authored-by: root <root@gpu-h100-0229.cm.cluster>
Co-authored-by: root <root@gpu-h100-0123.cm.cluster>
Co-authored-by: root <root@gpu-h100-0217.cm.cluster>
Co-authored-by: root <root@gpu-h100-0496.cm.cluster>
Co-authored-by: root <root@gpu-h100-0261.cm.cluster>
…ernorm. (NVIDIA#2434)

Co-authored-by: Yuzhong Wang <yuzhongw@nvidia.com>
Signed-off-by: Hongbin Liu <hongbinl@nvidia.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
Co-authored-by: Youngeun Kwon <youngeunk@nvidia.com>
Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Rabeeh Mahabadi <rkarimimahab@nb-hel-cs-001-vscode-02.cm.cluster>
Co-authored-by: Sanjeev Satheesh <sasatheesh@nvidia.com>
Co-authored-by: Deepak Narayanan <dnarayanan@nvidia.com>
Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>