Part of WS1 — Full Batch-Invariant Forward Chain (epic: #)
Why
"Aligned" is only meaningful relative to a pinned set of dtypes. RL training runs in BF16, so BF16 invariance is mandatory; FP32 is the reference for tolerance. Without pinning this, different ops could be validated under different dtypes and the chain-level guarantee would be meaningless. This issue locks the dtype axis of the #108 contract.
Scope
Pin the dtype set every WS1 op validates against, and block finalization of the #108 contract until this axis is resolved.
Declare the tested dtype set: BF16 is mandatory (the RL training dtype); FP32 is the reference path for computing tolerances; FP16 is optional if supported; FP8 is explicitly out of scope this month.
Specify the accumulation policy (FP32 accumulation for BF16 inputs) as part of the contract so that ops do not each choose differently, and state explicitly whether TF32 is enabled or disabled.
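One way the pinned dtype axis could be written down and checked mechanically is sketched below. This is only an illustration under assumptions: `DtypeContract`, `contract_violations`, and the string dtype names are hypothetical and not part of the #108 contract. The TF32 setting is left as `None` until decided, and the check reports that as a violation.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class DtypeContract:
    """Hypothetical encoding of the dtype axis described in Scope."""
    mandatory: FrozenSet[str] = frozenset({"bf16"})    # RL training dtype
    reference: str = "fp32"                            # tolerance reference path
    optional: FrozenSet[str] = frozenset({"fp16"})     # only if supported
    out_of_scope: FrozenSet[str] = frozenset({"fp8"})  # excluded this month
    # Required accumulation dtype per input dtype (FP32 accumulation for BF16).
    accumulation: Dict[str, str] = field(default_factory=lambda: {"bf16": "fp32"})
    # None means "not yet decided"; the contract needs an explicit True/False.
    allow_tf32: Optional[bool] = None


def contract_violations(
    contract: DtypeContract,
    op_name: str,
    tested_dtypes: Iterable[str],
    accum_dtype_by_input: Dict[str, str],
) -> List[str]:
    """Return human-readable problems; an empty list means the op conforms."""
    problems: List[str] = []
    if contract.allow_tf32 is None:
        problems.append("contract: TF32 must be explicitly enabled or disabled")
    tested = set(tested_dtypes)
    for dtype in sorted(contract.mandatory - tested):
        problems.append(f"{op_name}: missing mandatory dtype {dtype}")
    if contract.reference not in tested:
        problems.append(f"{op_name}: missing reference dtype {contract.reference}")
    for dtype in sorted(tested & contract.out_of_scope):
        problems.append(f"{op_name}: {dtype} is out of scope")
    for input_dtype, required in contract.accumulation.items():
        actual = accum_dtype_by_input.get(input_dtype)
        if input_dtype in tested and actual != required:
            problems.append(
                f"{op_name}: {input_dtype} inputs must accumulate in "
                f"{required}, got {actual}"
            )
    return problems
```

For example, with `DtypeContract(allow_tf32=False)`, an op tested under `["bf16", "fp32"]` with `{"bf16": "fp32"}` accumulation yields no violations, while an op tested only under `["fp32", "fp8"]` is flagged both for missing BF16 and for using FP8.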
Initial per-op recommendations (to be ratified in the contract):
Out of scope
Acceptance criteria
Notes
Planned PRs