-
Notifications
You must be signed in to change notification settings - Fork 7
Add atomic operations support for Complex numbers #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add atomic operations support for Complex numbers #61
Conversation
Implements atomic operations for Complex{Float32} and Complex{Float64}
by reinterpreting them as UInt64/UInt128 and using integer atomics.
- Uses CAS loops for modify! operations on Complex types
- Adds tests for all atomic operations with complex numbers
- Maintains full compatibility with existing functionality
Fixes JuliaConcurrent#37
Generated with GitHub Copilot
00e703d to
9abbe99
Compare
| # Note: This is NOT fully atomic (components updated separately) | ||
| # but works for both ComplexF32 and ComplexF64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof this is a no-go in my opinion. You will thus easily get torn writes.
I think this would need to use 128-byte atomics
| # Complex atomic operations - separate atomics on real and imaginary parts | ||
| # This works for operations that decompose component-wise (+, -, right) | ||
| # Note: This provides per-component atomicity, not full Complex atomicity | ||
| # (other threads may observe intermediate states, but final result is correct) | ||
| @inline function _cuda_atomic_modify!(ptr::Core.LLVMPtr{Complex{T},A}, op::OP, x::Complex{T}) where {T<:Union{Float32,Float64},A,OP} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, you are not gurantueed that a user is only using one kind of atomic operation on a memory location.
(e.g. someone doing a mul for good measure).
|
How does C++ implement them (if at all)? I think we need to guarantee full atomicity, and thus use a compare and swap loop on the byte value. |
|
I'm not an expert of atomic operations. I just need it for my case, and with the help of Copilot just wrote this that make sense to me. I have tested with the code of my first comment, and it seems to work. Do you think that that test is not enough? |
|
No I don't think the test is sufficient, and you are right for some algorithms you might not care about "full atomicity" and partial atomicity might be sufficient. For me the crux is that sofar all operations we currently have in Atomix promise full atomicity. For a made up example, imagine a algorithm where even odd lane performs an atomic addition and every even lane performs an atomic multiplication. That's a weird thing to do, and I don't know of any place this comes up, but that is besides the point. With this interface we are trying to implement a general set of operations, with consistent semantics, and I much prefer an error than to chase down memory semantics bugs later. |
Fixes #37
This PR adds support for atomic operations on
Complex{Float32}andComplex{Float64}arrays across CPU and all GPU backends.Implementation
UnsafeAtomics) and all GPU backends: CUDA, Metal, oneAPI, and OpenCL+and-operations use native atomic add/sub instructionsswap,replace) use atomic CAS on individual componentsTesting
ComplexF32andComplexF64Generated with GitHub Copilot.