@blissb-positron
Contributor

I found a value that I believe is rounded incorrectly when rounding to BFloat16 (format_info_bfloat16). This happens with the RoundMode.TowardZero rounding mode.

Minimal Example:

import gfloat
from gfloat.formats import format_info_bfloat16

a = 6.6461399789245764e+35
b = 6.620178494631905e+35

rounded_a = gfloat.round_float(format_info_bfloat16, a, gfloat.RoundMode.TowardZero)
rounded_b = gfloat.round_float(format_info_bfloat16, b, gfloat.RoundMode.TowardZero)

print("a:                  :", a)
print("b:                  :", b)
print("rounded_a           :", rounded_a)
print("rounded_b           :", rounded_b)

print("b == rounded_b      :", b == rounded_b) # True only if `b` is in the space of BFloat16

print("(rounded_a < b < a) :", rounded_a < b and b < a) # Rounding skipped `b`

Output Before Fix:

a:                  : 6.6461399789245764e+35
b:                  : 6.620178494631905e+35
rounded_a           : 6.594217010339231e+35
rounded_b           : 6.620178494631905e+35
b == rounded_b      : True
(rounded_a < b < a) : True

Output After Fix:

a:                  : 6.6461399789245764e+35
b:                  : 6.620178494631905e+35
rounded_a           : 6.620178494631905e+35
rounded_b           : 6.620178494631905e+35
b == rounded_b      : True
(rounded_a < b < a) : False

a is the value that gets rounded incorrectly. b is a value that is exactly representable in BFloat16.
a is larger than b, but rounding a toward zero yields a value that is less than b.
In other words, the rounding skipped a representable value.
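
As an independent cross-check of the expected result, truncation toward zero can be reproduced with the standard library alone. This is a minimal sketch, not gfloat's implementation: the helper name `truncate_to_precision` is hypothetical, it assumes a positive, finite value in the normal range of the format, and it relies on BFloat16 having 8 bits of precision (7 stored mantissa bits plus the implicit leading bit).

```python
import math


def truncate_to_precision(x: float, precision: int = 8) -> float:
    """Round positive, finite x toward zero to `precision` significant bits.

    Hypothetical helper for illustration only. It ignores subnormals,
    overflow, and the exponent range of the target format.
    """
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= m < 1
    kept = math.trunc(math.ldexp(m, precision))  # keep the top `precision` bits
    return math.ldexp(float(kept), e - precision)


a = 6.6461399789245764e+35
b = 6.620178494631905e+35

print("truncate(a) == b :", truncate_to_precision(a) == b)
print("truncate(b) == b :", truncate_to_precision(b) == b)
```

Both comparisons should print True if b is the BFloat16 value immediately below a, which matches the "Output After Fix" listing above.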

I dug a little and believe that the following is the root cause:

expval = int(np.floor(np.log2(vpos)))

np.log2(vpos) appears to round its result up (at least on my platform). It returns an integral value (still of float type) even though the input is not a power of 2.

Calculation:

import numpy as np

vpos = 6.6461399789245764e+35 # = a

print("vpos            :", vpos)
print("log2(vpos)      :", np.log2(vpos))
print("floor(...)      :", np.floor(np.log2(vpos)))
print("int(floor(...)) :", int(np.floor(np.log2(vpos))))
print("int(vpos)       :", int(vpos))
print("2**119          :", 2**119)
print("int(vpos) < 2**119 :", int(vpos) < 2**119)

Output:

vpos            : 6.6461399789245764e+35
log2(vpos)      : 119.0
floor(...)      : 119.0
int(floor(...)) : 119
int(vpos)       : 664613997892457641303998350787346432
2**119          : 664613997892457936451903530140172288
int(vpos) < 2**119 : True

Because the code floors log2(vpos), I believe it relies on vpos being greater than or equal to 2**(floor(log2(vpos))).
However, since log2(vpos) rounds up to exactly 119.0, floor cannot bring it back down to the correct value, 118.

With the exponent one too large, what would be the least significant bit (LSB) of the mantissa is rounded off.
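
One way to avoid depending on how `log2` rounds is to read the exponent directly from the floating-point representation. The sketch below uses `math.frexp` from the standard library. The helper name `floor_log2` is hypothetical, and this is not necessarily the change made in this pull request.

```python
import math


def floor_log2(vpos: float) -> int:
    """Return floor(log2(vpos)) exactly for positive, finite vpos."""
    if not (vpos > 0 and math.isfinite(vpos)):
        raise ValueError("vpos must be positive and finite")
    # frexp returns (m, e) with vpos == m * 2**e and 0.5 <= m < 1,
    # so 2**(e - 1) <= vpos < 2**e holds exactly, with no rounding step.
    _, e = math.frexp(vpos)
    return e - 1


vpos = 6.6461399789245764e+35  # the value `a` from the report
expval = floor_log2(vpos)

print("expval                         :", expval)  # 118
print("2**expval <= vpos < 2**(expval+1) :", 2.0**expval <= vpos < 2.0**(expval + 1))  # True
```

For array inputs, `numpy.frexp` provides the same decomposition elementwise, so a similar approach may apply to the array rounding path.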

I have not used the array rounding, but I expect that the following has the same issue:

expval = to_int(xp.floor(xp.log2(absv_masked)))

@awf
Collaborator

awf commented Aug 9, 2025

Fabulous, thanks. I'll fix the CI and merge.

@awf awf merged commit ce896ef into graphcore-research:main Aug 9, 2025
1 check failed
@awf awf mentioned this pull request Aug 9, 2025
awf added a commit to awf/gc-gfloat that referenced this pull request Aug 20, 2025
@awf
Collaborator

awf commented Aug 21, 2025

Array rounding fixed in #51
