feat: add llama model_type support by Krigsexe · Pull Request #2 · xigh/herbert-rs

Krigsexe · 2026-04-20T00:31:21Z

Summary

Add "llama" to the supported model_type values in config.rs.
Map it to the Qwen3 family, since the architecture is otherwise identical (RMSNorm, RoPE, SwiGLU, GQA, no bias, same SafeTensors layer naming).
Disable QK norms for llama models (unlike Qwen3, Llama does not use per-head QK RMS norms).
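
The mapping described above can be sketched roughly as follows. This is an illustration, not the actual diff: `ModelFamily`, `ResolvedArch`, `resolve_model_type`, and the `"qwen3"` string are hypothetical names assumed for the example, and the real code in crates/core/src/config.rs may be structured differently.

```rust
/// Hypothetical model family enum; the real one in herbert-rs may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelFamily {
    Qwen3,
}

/// Hypothetical result of resolving a `model_type` string from a model config.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResolvedArch {
    pub family: ModelFamily,
    /// Whether per-head QK RMS norms are applied in attention.
    pub use_qk_norm: bool,
}

/// Resolve `model_type` to an architecture, or `None` if it is unsupported.
pub fn resolve_model_type(model_type: &str) -> Option<ResolvedArch> {
    match model_type {
        // Assumed identifier for Qwen3 configs: QK norms stay enabled.
        "qwen3" => Some(ResolvedArch {
            family: ModelFamily::Qwen3,
            use_qk_norm: true,
        }),
        // Llama reuses the Qwen3 code path, with QK norms switched off.
        "llama" => Some(ResolvedArch {
            family: ModelFamily::Qwen3,
            use_qk_norm: false,
        }),
        _ => None,
    }
}
```

With this shape, `resolve_model_type("llama")` yields the Qwen3 family with `use_qk_norm` set to `false`, while an unrecognized value returns `None` so the caller can report an unsupported model_type.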

Motivation

Llama-architecture models such as PleIAs/Baguettotron (321M params, 80 layers, Apache 2.0) cannot currently run on herbert-rs because model_type: "llama" is not recognized. Apart from the QK norms, the underlying architecture is functionally identical to Qwen3: same layer structure, same SafeTensors naming convention (model.layers.X.self_attn.q_proj.weight), same activation function.

Changes

crates/core/src/config.rs: 3 changes, 1 file, +7/-3 lines

Testing

Tested with PleIAs/Baguettotron (Q4 backend) on Ryzen 5 3600 (AVX2):

Decode: 38.8 tok/s
Prefill: 188.9 tok/s
Model load: 0.7s

Map "llama" model_type to Qwen3 family since the architecture is identical (RMSNorm, RoPE, SwiGLU, GQA, no bias). The only difference is that Llama models do not use per-head QK norms, which is handled by checking the model_type string.

This enables running Llama-architecture models like PleIAs/Baguettotron (321M params, 80 layers) directly on herbert-rs without conversion.

Tested with Baguettotron-Q4 on Ryzen 5 3600 (AVX2): 38.8 tok/s decode, 188.9 tok/s prefill.

Krigsexe mentioned this pull request Apr 20, 2026

feat: add --raw-prompt flag to skip chat template wrapping #3

Open

Krigsexe wants to merge 1 commit into xigh:main from Krigsexe:add-llama-support
