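Hi, I'm running into a problem while trying to compile the attention projections of the Phi-mini-MoE model into a QNN AOT graph.
In the Qwen3 AOT example, the attention/MLP linear layers compile successfully through the Conv2D + LPBQ W4A16 path, for example: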
- Conv2D weight layout: [1, 1, In, Out]
- quant recipe: LPBQ / w4a16
- In Qwen3 SHA mode, q/k/v also compile after being split per head
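To make the layout above concrete, here is a minimal NumPy sketch of how a linear weight can be viewed as a 1x1 Conv2D weight in [1, 1, In, Out] layout. It assumes the source weight is stored PyTorch-style as [Out, In] and that activations are NHWC; both are assumptions for illustration, and the helper names are hypothetical rather than part of the QNN or AOT tooling.

```python
import numpy as np


def linear_weight_to_conv2d(w_out_in: np.ndarray) -> np.ndarray:
    """Reshape a Linear weight [Out, In] into a 1x1 Conv2D weight [1, 1, In, Out]."""
    out_features, in_features = w_out_in.shape
    # Transpose to [In, Out], then add the two 1x1 kernel dimensions in front.
    return w_out_in.T.reshape(1, 1, in_features, out_features)


def conv2d_1x1_nhwc(x_nhwc: np.ndarray, w_hwio: np.ndarray) -> np.ndarray:
    """Apply a 1x1 convolution, which is a matmul over the channel axis."""
    return np.einsum("nhwc,co->nhwo", x_nhwc, w_hwio[0, 0])


rng = np.random.default_rng(0)
in_features, out_features, tokens = 64, 16, 4  # small sizes, illustration only

w = rng.standard_normal((out_features, in_features))  # assumed [Out, In] layout
x = rng.standard_normal((1, 1, tokens, in_features))  # assumed NHWC, W = tokens

w_conv = linear_weight_to_conv2d(w)
y_conv = conv2d_1x1_nhwc(x, w_conv)
y_linear = x[0, 0] @ w.T  # reference: plain linear layer

print(w_conv.shape)  # (1, 1, 64, 16)
print(np.allclose(y_conv[0, 0], y_linear))
```

The sketch only covers the float layout; the LPBQ quantization parameters are not modeled here.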
On the Phi-MoE model, however, QNN prepare fails when I try to route the attention q_proj/k_proj/o_proj through a similar path.
The attention weights currently exported for Phi look like this:
- q_proj: has weight + scale1 + scale2, similar to LPBQ/int4
- k_proj: has weight + scale1 + scale2, similar to LPBQ/int4
- o_proj: has weight + scale1 + scale2, similar to LPBQ/int4
- v_proj: W8A16 style, with weight + scale + zero_point
I have tried two approaches:
- Whole-block Conv2D LPBQ
For example:
- q_proj.weight shape: [1, 1, 4096, 4096]
- k_proj.weight shape: [1, 1, 4096, 1024]
- o_proj.weight shape: [1, 1, 4096, 4096]
QNN lowering does generate Conv2d_w_blk_exp_scale, but graph prepare fails, and the log contains:
no properties registered for q::GroupedConv2d_w_scale
Selecting disabled op ... q::pack_4bit_lpbq_weights_2x
Selecting disabled op ... q::pack_4bit_lpbq_scales
Graph prepare failed with err:-1
- Following Qwen2/Qwen3 SHA, splitting q/k per head into small Conv2D ops
For example:
- q_proj_sha.0 ... q_proj_sha.31
- k_proj_sha.0 ... k_proj_sha.7
This time it fails even earlier, and the log shows:
"model.layers.3.self_attn.q_proj_sha.0" generated: could not create op
"model.layers.3.self_attn.q_proj_sha.1" generated: could not create op
...
"model.layers.3.self_attn.k_proj_sha.0" generated: could not create op
...
Received signal11 - SIGSEGV
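For reference, the per-head split in the second approach can be sketched as below. This is only an illustration under assumptions: it slices the [1, 1, In, Out] weight along the output-channel axis, with 32 query heads and 8 key heads inferred from the q_proj_sha.0 ... 31 and k_proj_sha.0 ... 7 names. The function name is hypothetical, and the scale1/scale2 tensors are left out because their layout is not known here; they would presumably need to be sliced along the same output axis.

```python
import numpy as np


def split_conv_weight_per_head(w_hwio: np.ndarray, num_heads: int) -> list:
    """Split a [1, 1, In, Out] weight into num_heads slices of [1, 1, In, Out / num_heads]."""
    out_features = w_hwio.shape[3]
    if out_features % num_heads != 0:
        raise ValueError("Out must be divisible by num_heads")
    head_dim = out_features // num_heads
    # Each head owns a contiguous block of output channels.
    return [
        w_hwio[:, :, :, h * head_dim:(h + 1) * head_dim]
        for h in range(num_heads)
    ]


rng = np.random.default_rng(0)
# Random int4-range values stored as int8, standing in for the real weights.
q_w = rng.integers(-8, 8, size=(1, 1, 4096, 4096), dtype=np.int8)
k_w = rng.integers(-8, 8, size=(1, 1, 4096, 1024), dtype=np.int8)

q_heads = split_conv_weight_per_head(q_w, num_heads=32)  # q_proj_sha.0 ... 31
k_heads = split_conv_weight_per_head(k_w, num_heads=8)   # k_proj_sha.0 ... 7

print(len(q_heads), q_heads[0].shape)  # 32 (1, 1, 4096, 128)
print(len(k_heads), k_heads[0].shape)  # 8 (1, 1, 4096, 128)
```

Concatenating the slices back along the last axis reproduces the original weight, which is a quick sanity check that the split itself is lossless before looking at the quantization parameters.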
Thanks!