@copybara-service
Contributor

Upgrade handling of partial loads/stores of AVX-sized vectors using AVX-512 instructions

The ultimate goal of this series of changes is to implement transpose kernels with efficient partial loads/stores; those kernels currently always use memcpy to handle partial loads/stores.
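For readers unfamiliar with the idea, here is a minimal sketch of what a partial load/store of an AVX-sized (256-bit) vector can look like with AVX-512 mask registers, next to the memcpy baseline. It is an illustration under assumptions, not the code in this change: the function names `partial_copy_memcpy`, `partial_copy_masked`, and `partial_copy` are hypothetical, the element type is assumed to be `float`, and the runtime dispatch relies on the GCC/Clang builtin `__builtin_cpu_supports`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Baseline (hypothetical name): move the first n floats with memcpy. */
static void partial_copy_memcpy(float* dst, const float* src, size_t n) {
  memcpy(dst, src, n * sizeof(float));
}

#if HAVE_X86_DISPATCH
/* Masked variant (hypothetical name): AVX512F + AVX512VL allow a mask
 * register to select which lanes of a 256-bit vector are read and written.
 * Lanes whose mask bit is 0 are not accessed, so the load and store touch
 * only the first n floats. */
__attribute__((target("avx512f,avx512vl")))
static void partial_copy_masked(float* dst, const float* src, size_t n) {
  /* Low n bits set; n == 8 yields 0xFF, i.e. a full vector. */
  const __mmask8 mask = (__mmask8)((1u << n) - 1u);
  const __m256 v = _mm256_maskz_loadu_ps(mask, src); /* masked-off lanes = 0 */
  _mm256_mask_storeu_ps(dst, mask, v);               /* masked-off lanes untouched */
}
#endif

/* Hypothetical dispatcher: use the masked path when the CPU supports it,
 * otherwise fall back to memcpy. n is the number of valid floats (0..8). */
static void partial_copy(float* dst, const float* src, size_t n) {
  assert(n <= 8);
#if HAVE_X86_DISPATCH
  if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl")) {
    partial_copy_masked(dst, src, n);
    return;
  }
#endif
  partial_copy_memcpy(dst, src, n);
}
```

In a transpose kernel the interesting part is that the masked load leaves the data in a vector register, ready for shuffles, rather than staging it through a temporary buffer; the copy above only demonstrates the load/store pair itself.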

PiperOrigin-RevId: 845642304