
Fix the attention code to allow GPT-2 weight loading #198

@ramon-astudillo

Description


Upgrade the easier-to-understand GPT-2 attention code so that it can load pre-trained GPT-2 weights directly.

I.e., avoid maintaining separate loaders/code paths for pre-trained and non-pre-trained model weights: https://github.com/LxMLS/lxmls-toolkit/blob/master/lxmls/transformers/model.py#L123
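A minimal sketch of what a unified loader could look like, assuming the usual GPT-2 checkpoint convention: the original checkpoints store the attention projections as a single fused Conv1D-style weight `c_attn` of shape `(n_embd, 3 * n_embd)`, while a rewritten, easier-to-read attention module would typically use separate `nn.Linear` q/k/v layers. The layer names (`q_proj`, `k_proj`, `v_proj`) are hypothetical and not taken from the toolkit; the point is only the split-and-transpose step that lets one loader serve both pre-trained and from-scratch models.

```python
import torch
import torch.nn as nn

n_embd = 8

# Stand-in for a pre-trained fused projection.
# Conv1D convention (as in GPT-2 checkpoints): y = x @ W + b.
c_attn_w = torch.randn(n_embd, 3 * n_embd)
c_attn_b = torch.randn(3 * n_embd)

# Separate projections in the easier-to-understand attention code
# (hypothetical names, not the toolkit's actual attribute names).
q_proj = nn.Linear(n_embd, n_embd)
k_proj = nn.Linear(n_embd, n_embd)
v_proj = nn.Linear(n_embd, n_embd)

with torch.no_grad():
    for i, proj in enumerate((q_proj, k_proj, v_proj)):
        # Slice out this projection's columns, then transpose:
        # nn.Linear computes y = x @ W.T + b, Conv1D computes y = x @ W + b.
        proj.weight.copy_(c_attn_w[:, i * n_embd:(i + 1) * n_embd].T)
        proj.bias.copy_(c_attn_b[i * n_embd:(i + 1) * n_embd])

# Check: the split layers reproduce the fused projection exactly.
x = torch.randn(2, n_embd)
fused = x @ c_attn_w + c_attn_b
split = torch.cat([q_proj(x), k_proj(x), v_proj(x)], dim=-1)
assert torch.allclose(fused, split, atol=1e-5)
```

With this mapping in place, loading a pre-trained checkpoint and initializing from scratch can share one code path: the only pre-trained-specific step is the slice/transpose of the fused tensors.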
