[SPIR-V] Fix precision for dot2add
#7861
I'm not convinced this is the correct interpretation of the description of dot2add.
A 2-dimensional floating point dot-product of half2 vectors with add. Multiplies the elements of the two half-precision float input vectors together and sums the results into the 32-bit float accumulator. This instructions operates within a single 32-bit wide SIMD lane. The inputs are 16-bit quantities packed into the same lane.
This instructions operates within a single 32-bit wide SIMD lane. The inputs are 16-bit quantities packed into the same lane.
This suggests that the multiplications should be done as 16-bit operations.
sums the results into the 32-bit float accumulator.
This suggests that the sum is done as a 32-bit operation.
The correct code might be:
%2 = OpFMul %vhalf_n %input_a %input_b
%3 = OpFConvert %vfloat_n %2
// for i = 0 to n-1
%ex_i = OpCompositeExtract %float %3 <i>
%sum_1 = OpFAdd %float %ex_0 %ex_1
// for i = 2 to n-1
%sum_i = OpFAdd %float %sum_<i-1> %ex_i
Repeat the instructions; do not put a loop in the SPIR-V.
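To make the "repeat, do not loop" point concrete, here is a minimal Python sketch that writes out the unrolled instruction sequence for a given component count. It is illustrative only and is not the DXC implementation: the function name `emit_unrolled_dot`, the operand ids `%a` and `%b`, and the type ids such as `%v2half` are all hypothetical, and the output is plain text, not a complete or validated SPIR-V module.

```python
def emit_unrolled_dot(n: int, lhs: str = "%a", rhs: str = "%b") -> list[str]:
    """Emit SPIR-V-like text: multiply in half, convert, then sum in float.

    All ids and type names are placeholders chosen for this sketch.
    """
    if n < 2:
        raise ValueError("expected a vector with at least 2 components")

    lines = [
        # Component-wise multiply at half precision.
        f"%mul = OpFMul %v{n}half {lhs} {rhs}",
        # Widen the products to float before any addition happens.
        f"%cvt = OpFConvert %v{n}float %mul",
    ]

    # One explicit extract per component; the loop runs in the generator,
    # so the emitted text itself contains no loop.
    for i in range(n):
        lines.append(f"%ex_{i} = OpCompositeExtract %float %cvt {i}")

    # Chain of float adds: sum_1 = ex_0 + ex_1, then sum_i = sum_(i-1) + ex_i.
    lines.append("%sum_1 = OpFAdd %float %ex_0 %ex_1")
    for i in range(2, n):
        lines.append(f"%sum_{i} = OpFAdd %float %sum_{i - 1} %ex_{i}")

    return lines


if __name__ == "__main__":
    print("\n".join(emit_unrolled_dot(2)))
```

For the `half2` case relevant to dot2add, `emit_unrolled_dot(2)` yields five lines: one multiply, one convert, two extracts, and a single `OpFAdd`.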
You are right, thanks for the detailed explanation!
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
Fixes #7695 (part of offload test suite)
The HLSL spec indicates that the elements are multiplied with half precision but the summation results in a float. OpDot requires the ResultType to be the same as the vector's ComponentType, so this opcode cannot be used.
The fix is to untangle OpDot: multiply the half2 vectors and convert the products to float before summing the elements.
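The numeric effect of this ordering can be sketched in Python. This is a rough emulation under stated assumptions, not the compiler's code: the helpers `to_half`, `to_float32`, `dot2add_half_mul`, and `dot2add_float_mul` are hypothetical names, precision is emulated by rounding Python doubles through the `struct` module's `e` (binary16) and `f` (binary32) formats, and adding the accumulator after the two products is assumed for illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (OverflowError if out of range)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float32(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot2add_half_mul(a, b, acc):
    """Multiply in half precision, then convert the products and sum in float."""
    products = [to_half(to_half(x) * to_half(y)) for x, y in zip(a, b)]
    total = to_float32(products[0] + products[1])
    return to_float32(total + to_float32(acc))


def dot2add_float_mul(a, b, acc):
    """For comparison only: widen the inputs first and multiply in float."""
    products = [to_float32(to_half(x) * to_half(y)) for x, y in zip(a, b)]
    total = to_float32(products[0] + products[1])
    return to_float32(total + to_float32(acc))


x = 1.0 + 2.0 ** -10  # exactly representable in half precision
print(dot2add_half_mul((x, 0.0), (x, 0.0), 0.0))   # 1.001953125
print(dot2add_float_mul((x, 0.0), (x, 0.0), 0.0))  # slightly larger: the 2**-20 term survives
```

The two variants differ because the exact product `1 + 2**-9 + 2**-20` does not fit in half precision, so the half multiply rounds the last term away before the conversion to float.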