
Update README with token match rate on text backbone #53

Open
sdeeptan-aws wants to merge 1 commit into aws-neuron:main from sdeeptan-aws:qwenomni7b

Conversation

@sdeeptan-aws
Contributor

Description

Updated Qwen2.5-Omni-7B contrib model README with 100% token match accuracy on text backbone. Qwen2.5-Omni is a multimodal model supporting vision, audio, and text. AutoModelForCausalLM does not work for multimodal models — the specific text backbone class must be used to load the HF reference. Some multimodal configs may be missing attributes expected by the text backbone (e.g., output_attentions) and require config patching. With the correct text backbone extraction, the model achieves 100% token match.
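
A minimal sketch of the reference-loading step described above; the class and attribute names are assumptions based on the transformers Qwen2.5-Omni implementation and may differ by transformers version:

  # Sketch only: class/attribute names are assumptions and may differ by
  # transformers version.
  import torch
  from transformers import Qwen2_5OmniThinkerForConditionalGeneration

  MODEL_ID = "Qwen/Qwen2.5-Omni-7B"

  # AutoModelForCausalLM cannot resolve omni checkpoints, so the thinker
  # (text) backbone class is loaded directly as the HF reference.
  reference = Qwen2_5OmniThinkerForConditionalGeneration.from_pretrained(
      MODEL_ID, torch_dtype=torch.bfloat16
  )
  text_backbone = reference.model  # Qwen2-based decoder-only transformer (assumed attribute)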

Model Information

Model Name: Qwen2.5-Omni-7B
Model Architecture: Multimodal (omni: vision, audio, text) model with a Qwen2-based decoder-only transformer text backbone
Purpose: Multimodal understanding and text generation

Checklist

Required Components

  • Accuracy Test (test/integration/test_model.py)
    • Validates model generation and coherence
    • Performance benchmarks (TTFT, throughput)
    • Test can compile and run the model on Neuron
  • README.md with the following sections:
    • Usage Example: Clear code example showing how to use the model
    • Compatibility Matrix: Table showing tested Neuron SDK versions and instance types
    • Example Checkpoints: Links to compatible model checkpoints
    • Testing Instructions: Command to run the test suite for the model
  • Source Code (src/)
    • Modeling code following NxD Inference patterns (unchanged in this PR)

Optional Components

  • Unit Tests (CPU or Neuron-based)

Folder Structure

/contrib/models/Qwen2.5-Omni-7B/
  README.md
  /src
    modeling_qwen2_5_omni.py
  /test
    /integration
      test_model.py

Testing

The model was compiled and tested with TP=2, batch_size=1, seq_len=128, bfloat16. Only the text backbone was validated; the vision and audio modalities have not yet been verified.

  1. Text backbone extraction: AutoModelForCausalLM fails for multimodal models — must use the specific text backbone class
  2. Config patching: Some multimodal configs are missing attributes expected by the text backbone (e.g., output_attentions) and need patching
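
A hedged sketch of the config patching in point 2, assuming the attributes are simply absent from the loaded multimodal config; the default list is illustrative, not this repo's exact patch:

  # Sketch: fill in text-only attributes the backbone expects but the
  # multimodal config may not define. The attribute list is illustrative.
  def patch_text_config(config, defaults=None):
      defaults = defaults or {"output_attentions": False, "output_hidden_states": False}
      for name, value in defaults.items():
          if not hasattr(config, name):
              setattr(config, name, value)
      return config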

Test Results:

Test            Status    Result
Smoke Test      ✅ PASS   Model loads successfully
Token Matching  ✅ PASS   100% match (text backbone)
TTFT (P50)      ✅ PASS   50.15 ms
Throughput      ✅ PASS   19.82 tok/s
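
For reference, the token-matching check above amounts to comparing greedy token IDs from the Neuron model with those from the HF reference on the same prompt. A simplified, self-contained sketch (hypothetical helper, not this repo's test code):

  def token_match_rate(neuron_ids, reference_ids):
      """Fraction of generated positions where token IDs agree."""
      n = min(len(neuron_ids), len(reference_ids))
      if n == 0:
          return 0.0
      return sum(a == b for a, b in zip(neuron_ids[:n], reference_ids[:n])) / n

  # 100% match means every generated token ID agrees with the reference.
  assert token_match_rate([1, 2, 3], [1, 2, 3]) == 1.0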

Compatibility

Tested with:

  • Instance Type(s): Trn1
  • Configuration: TP=2, batch_size=1, seq_len=128, bfloat16
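
The tested configuration can be expressed roughly as the following kwargs; field names here are illustrative and not the exact NxD Inference API:

  import torch

  # Illustrative only: mirrors the tested configuration, not a specific API.
  TEST_CONFIG = {
      "tp_degree": 2,       # tensor parallelism across 2 NeuronCores
      "batch_size": 1,
      "seq_len": 128,
      "dtype": torch.bfloat16,
  }
  # Assumed integration-test invocation (path per the folder structure above):
  #   pytest contrib/models/Qwen2.5-Omni-7B/test/integration/test_model.py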

Additional Information

  • Omni-modal: Supports vision, audio, and text — text backbone validated independently
  • AutoModelForCausalLM doesn't work: Multimodal models register with different auto classes. Use the specific text backbone class for HF reference loading.
  • Config patching may be needed: Multimodal configs can be missing text-only attributes like output_attentions

Related Issues

N/A

vLLM Integration

  • This model/feature is intended for use with vLLM
  • Documentation includes vLLM registration instructions (see the sketch below)
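
A hypothetical registration sketch; the architecture string and the Neuron model class name are placeholders, and the model README holds the actual instructions:

  from vllm import ModelRegistry
  # Hypothetical class name for the contributed NxD text-backbone implementation.
  from modeling_qwen2_5_omni import NeuronQwen25OmniTextModel

  ModelRegistry.register_model(
      "Qwen2_5OmniThinkerForConditionalGeneration",  # assumed HF architecture name
      NeuronQwen25OmniTextModel,
  )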

By submitting this PR, I confirm that:

  • I have read and followed the contributing guidelines
  • This is a community contribution and may have limited testing compared to officially-supported models
  • The code follows best practices and is well-documented
  • All required components listed above are included


@aws-yishanm left a comment


Approved because the README and test were present.
