
[IMPORT-EXPORT] support interleaved weights for sonic-moe #381

Merged
mayank31398 merged 15 commits into main from gate
Feb 22, 2026

Conversation

@mayank31398
Collaborator

No description provided.

Signed-off-by: Mayank Mishra <mayank31398@gmail.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello @mayank31398, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates support for interleaved weights into the model conversion pipeline, enabling sonic-moe models to be imported and exported correctly when using this weight arrangement. The changes involve propagating a use_interleaved_weights flag through the import process, updating how MLP weights are configured and transformed, and refining utility functions to handle conditional interleaving. This enhancement improves the robustness and compatibility of the model conversion system for diverse model architectures.

Highlights

  • Interleaved Weights Support: Introduced support for handling interleaved weights in model import and export processes, particularly for sonic-moe models.
  • Flexible Weight Handling: Modified core utility functions interleave_up_gate_tensor_for_mlp and split_up_gate_tensor_for_mlp to conditionally interleave or split weights based on a flag and specified dimension.
  • Config and State Dict Updates: Updated model-specific configuration import functions and state dictionary transformation logic for Granite, GraniteMoEHybrid, and Llama models to correctly apply the use_interleaved_weights setting.
  • Enhanced Testing: Extended existing model conversion tests to include use_interleaved_weights as a parameterized argument, ensuring comprehensive coverage for this new functionality.
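The conditional interleave/split behavior described above can be sketched as follows. This is a minimal NumPy stand-in for the torch helpers named in this PR (`interleave_up_gate_tensor_for_mlp` and `split_up_gate_tensor_for_mlp`); the real signatures and tensor layouts may differ.

```python
import numpy as np


def interleave_up_gate(up, gate, is_interleaved=True, dim=0):
    # Combine the up and gate projection weights into one fused tensor.
    # When is_interleaved, rows alternate u0, g0, u1, g1, ... along `dim`;
    # otherwise up and gate are simply concatenated.
    if not is_interleaved:
        return np.concatenate([up, gate], axis=dim)
    stacked = np.stack([up, gate], axis=dim + 1)  # pair each up row with its gate row
    shape = list(up.shape)
    shape[dim] *= 2
    return stacked.reshape(shape)


def split_up_gate(fused, is_interleaved=True, dim=0):
    # Inverse of interleave_up_gate: recover (up, gate) from the fused tensor.
    if not is_interleaved:
        up, gate = np.split(fused, 2, axis=dim)
        return up, gate
    shape = list(fused.shape)
    shape[dim] //= 2
    shape.insert(dim + 1, 2)
    paired = fused.reshape(shape)
    up = np.take(paired, 0, axis=dim + 1)
    gate = np.take(paired, 1, axis=dim + 1)
    return up, gate


up = np.arange(6).reshape(3, 2)
gate = np.arange(6, 12).reshape(3, 2)
fused = interleave_up_gate(up, gate)  # rows: u0, g0, u1, g1, u2, g2
u2, g2 = split_up_gate(fused)
assert (u2 == up).all() and (g2 == gate).all()
```

Exporting with one layout and importing with the other is exactly the round trip the parameterized tests exercise: the two helpers must be mutual inverses for both values of the flag.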


Changelog
  • lm_engine/hf_models/model_conversion/__init__.py
    • Modified import_from_huggingface to accept and pass arbitrary keyword arguments to model-specific configuration import functions.
  • lm_engine/hf_models/model_conversion/granite.py
    • Updated _import_granite_config to accept **kwargs and set use_interleaved_weights on MLP blocks if the argument is provided.
  • lm_engine/hf_models/model_conversion/granitemoehybrid.py
    • Removed the _split_and_reorder_for_glu utility function.
    • Updated _import_granitemoehybrid_config to accept **kwargs and apply use_interleaved_weights to MLP blocks.
    • Modified _import_granitemoehybrid_state_dict and _export_granitemoehybrid_state_dict to utilize the new interleave_up_gate_tensor_for_mlp and split_up_gate_tensor_for_mlp functions with the use_interleaved_weights flag.
    • Removed checks for use_interleaved_weights and use_interleaved_weights_for_shared_experts in _export_granitemoehybrid_config.
  • lm_engine/hf_models/model_conversion/llama.py
    • Modified _import_llama_config to accept **kwargs and set use_interleaved_weights on MLP blocks.
    • Updated _import_llama_state_dict and _export_llama_state_dict to use the enhanced interleave_up_gate_tensor_for_mlp and split_up_gate_tensor_for_mlp functions with the use_interleaved_weights flag.
  • lm_engine/hf_models/modeling_utils/mlp_blocks/mlp.py
    • Enhanced interleave_up_gate_tensor_for_mlp and split_up_gate_tensor_for_mlp to accept is_interleaved and dim parameters, allowing for conditional interleaving and splitting along a specified dimension.
  • tests/hf_models/single_gpu/model_conversion_test.py
    • Extended test_llama_model_conversion, test_granite_model_conversion, and test_granitemoehybrid_model_conversion to include use_interleaved_weights as a parameterized test argument.
  • tests/hf_models/test_common.py
    • Updated model_conversion_test to pass the use_interleaved_weights parameter during the import_from_huggingface call.
Activity
  • The pull request was created by mayank31398.
  • New functionality was introduced to support interleaved weights in model conversion.
  • Core model conversion logic and related test cases were updated to reflect these changes.

Signed-off-by: Mayank Mishra <mayank31398@gmail.com>
@gemini-code-assist (Contributor) left a comment

Code Review

This pull request introduces comprehensive support for interleaved weights in the model conversion process, specifically targeting MLP blocks across Granite, GraniteMoEHybrid, and Llama models. The changes involve extending the import_from_huggingface function to accept and propagate **kwargs, which are then used to pass the use_interleaved_weights flag to model-specific configuration import functions. The core logic for handling interleaved and non-interleaved weights during tensor manipulation is encapsulated in the new interleave_up_gate_tensor_for_mlp and split_up_gate_tensor_for_mlp functions, which are robustly implemented to handle different dimensions. The removal of hardcoded use_interleaved_weights=False checks and the addition of assert len(kwargs) == 0 ensure proper argument handling and prevent silent failures. Furthermore, the updated test suite with parameterized use_interleaved_weights ensures thorough coverage of this new functionality. The code is well-structured, readable, and effectively addresses the stated objective.
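The `**kwargs` forwarding with a trailing `assert len(kwargs) == 0` that the review calls out can be reduced to the following pattern. The function and config keys here are hypothetical simplifications of the model-specific import functions; only the pattern itself is taken from the review.

```python
def import_config(raw_config: dict, **kwargs) -> dict:
    # Hypothetical reduction of the kwargs-forwarding pattern: each known
    # option is popped off, and a final assert rejects anything left over,
    # so a misspelled argument fails loudly instead of being ignored.
    config = dict(raw_config)
    use_interleaved_weights = kwargs.pop("use_interleaved_weights", None)
    if use_interleaved_weights is not None:
        for block in config.get("mlp_blocks", []):
            block["use_interleaved_weights"] = use_interleaved_weights
    assert len(kwargs) == 0, f"unexpected arguments: {sorted(kwargs)}"
    return config


cfg = import_config({"mlp_blocks": [{}, {}]}, use_interleaved_weights=True)
assert all(b["use_interleaved_weights"] for b in cfg["mlp_blocks"])
```

A caller passing an unrecognized keyword (e.g. a typo of the flag name) hits the assertion rather than silently converting the model with default settings.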

@mayank31398 mayank31398 merged commit 312a22e into main Feb 22, 2026
2 checks passed
@mayank31398 mayank31398 deleted the gate branch February 22, 2026 01:41
