Conversation

@Xia-Weiwen (Collaborator)

This brings about a 1% end-to-end (E2E) improvement when running int8 ViT on 4 cores.

@pytorch-bot bot commented Oct 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3230

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 64eae18 with merge base f3fc5e7:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label on Oct 23, 2025.
@Xia-Weiwen added the topic: not user facing label on Oct 23, 2025.
- auto tmp2 = tmp1.round();
- auto tmp3 = tmp2 + vec_beta1;
+ auto tmp1 = at::vec::fmadd(tmp0, vec_sum_scale, vec_beta1);
+ auto tmp3 = tmp1.round();
Collaborator
Could we also apply the optimization to the below masked vectorization part?

Collaborator Author

Updated. Thanks.

- auto tmp6 = tmp5.round();
- auto tmp7 = tmp6 + vec_beta2;
+ auto tmp5 = at::vec::fmadd(tmp4, vec_alpha, vec_beta2);
+ auto tmp7 = tmp5.round();
Collaborator

Ditto.

Collaborator Author

Updated. Thanks.
