Slightly improve 4-bit mat mul performance by better engaging the ALU pipes through a split uint-to-float conversion. (#15447)
Summary:
This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the `uint` to `float` conversion operation.
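The diff itself is not reproduced here, so the following is only a hypothetical C++ sketch of one way such a split could look, not the actual change. It assumes eight 4-bit weights packed into one `uint32_t`, and the names `dot8_fused` and `dot8_split` are invented for illustration. The fused form extracts, converts, and multiplies each nibble in one expression; the split form does the integer unpacking, the `uint` to `float` conversion, and the multiply-accumulate as separate passes, which may give a compiler or GPU scheduler more independent instructions to spread across integer and floating-point pipes.

```cpp
#include <cassert>
#include <cstdint>

// Fused form: each 4-bit weight is extracted and converted to float
// inside the same expression that feeds the multiply-accumulate.
inline float dot8_fused(std::uint32_t packed, const float* x) {
    assert(x != nullptr);
    float acc = 0.0f;
    for (int i = 0; i < 8; ++i) {
        acc += static_cast<float>((packed >> (4 * i)) & 0xFu) * x[i];
    }
    return acc;
}

// Split form (assumed interpretation of "splitting" the conversion):
// integer unpack, uint-to-float conversion, and multiply-accumulate
// are separate passes over the eight weights.
inline float dot8_split(std::uint32_t packed, const float* x) {
    assert(x != nullptr);

    std::uint32_t nibbles[8];
    for (int i = 0; i < 8; ++i) {
        nibbles[i] = (packed >> (4 * i)) & 0xFu;  // integer work only
    }

    float weights[8];
    for (int i = 0; i < 8; ++i) {
        weights[i] = static_cast<float>(nibbles[i]);  // conversion only
    }

    float acc = 0.0f;
    for (int i = 0; i < 8; ++i) {
        acc += weights[i] * x[i];  // floating-point work only
    }
    return acc;
}
```

Both functions accumulate in the same order, so they compute the same dot product; any speed difference depends on the target hardware and compiler and would need to be measured.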
Differential Revision: D85779855