In a MoE (Mixture of Experts) model, each expert inside an MoE block is typically an MLP. During the forward pass, the hidden_states tensor is usually flattened with hidden_states.view(-1, hidden_states.shape[-1]) before being dispatched to the experts. As a result, when tiled_mlp_forward_common tries to unpack the shape with bs, seqlen, hidden = x.shape, it receives a 2D tensor and raises: ValueError: not enough values to unpack (expected 3, got 2).
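A minimal sketch of the mismatch (the toy shapes and the 2D-input guard below are illustrative, not the actual implementation of tiled_mlp_forward_common):

```python
import torch

def tiled_mlp_forward_common(mlp, x):
    # The tiled forward assumes a 3D input of shape (batch, seq_len, hidden);
    # a flattened 2D tensor triggers the unpack error described above.
    bs, seqlen, hidden = x.shape
    return x  # tiling logic omitted

hidden_states = torch.randn(2, 16, 64)                   # (batch, seq_len, hidden)
flat = hidden_states.view(-1, hidden_states.shape[-1])   # (32, 64): the MoE flattening

try:
    tiled_mlp_forward_common(None, flat)
except ValueError as e:
    print(e)  # not enough values to unpack (expected 3, got 2)

# One possible guard: treat a 2D input as a single "sequence" of tokens.
x = flat.unsqueeze(0) if flat.dim() == 2 else flat        # (1, 32, 64)
bs, seqlen, hidden = x.shape                              # unpacks cleanly
```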