Add atomic operations support for Complex numbers #61

albertomercurio · 2025-11-11T22:52:48Z

Fixes #37

This PR adds support for atomic operations on Complex{Float32} and Complex{Float64} arrays across CPU and all GPU backends.

Implementation

Component-wise atomic operations: Complex numbers are atomically updated by performing separate atomic operations on their real and imaginary components
Supported on CPU (via UnsafeAtomics) and all GPU backends: CUDA, Metal, oneAPI, and OpenCL
Optimized paths for + and - operations use native atomic add/sub instructions
Other operations (like swap, replace) use atomic CAS on individual components
Multiple dispatch cleanly handles Complex vs non-Complex types
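For readers unfamiliar with the component-wise approach, here is a minimal CPU-only sketch of the idea, assuming plain `Base.Threads.Atomic` cells rather than the UnsafeAtomics-based code in this PR. `SplitComplex`, `split_add!`, and `snapshot` are hypothetical names used only for illustration.

```julia
using Base.Threads: Atomic, atomic_add!, @threads

# Hypothetical container (not an Atomix type): one atomic cell per component.
struct SplitComplex{T<:Union{Float32,Float64}}
    re::Atomic{T}
    im::Atomic{T}
end

SplitComplex(z::Complex{T}) where {T<:Union{Float32,Float64}} =
    SplitComplex{T}(Atomic{T}(real(z)), Atomic{T}(imag(z)))

# Two independent atomic adds. Each component update is atomic, but another
# thread can run between them and observe a half-updated value.
function split_add!(acc::SplitComplex{T}, z::Complex{T}) where {T}
    atomic_add!(acc.re, real(z))
    atomic_add!(acc.im, imag(z))
    return nothing
end

# Reading is also two separate loads, so it is only a consistent snapshot
# once all writers have finished.
snapshot(acc::SplitComplex) = Complex(acc.re[], acc.im[])

acc = SplitComplex(0.0 + 0.0im)
@threads for i in 1:1000
    split_add!(acc, 1.0 + 2.0im)
end
total = snapshot(acc)   # every add lands, regardless of interleaving
```

Because addition decomposes per component, the final sum does not depend on how the threads interleave. The same is not true for operations such as multiplication, where each output component depends on both input components.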

Testing

Added 13 CPU tests covering all atomic operations (get, set, modify, swap, replace) for both ComplexF32 and ComplexF64
Added 3 GPU tests per backend (CUDA, Metal, oneAPI, OpenCL) testing CAS, modify, and sugar syntax
All existing tests continue to pass
Tested with KernelAbstractions.jl:

using CUDA
using KernelAbstractions
import Atomix

T = ComplexF64

x = CUDA.rand(T, 100) .+ 0.5f0
res = CUDA.zeros(T, 1)

@kernel cpu=false inbounds=true function my_kernel(res, @Const(x))
    i = @index(Global)
    Atomix.@atomic res[1] += x[i]
end

kernel = my_kernel(KernelAbstractions.get_backend(x))
kernel(res, x; ndrange = length(x))

Array(res)[1]  # Matches sum(x)
sum(x)

Generated with GitHub Copilot.

Implements atomic operations for Complex{Float32} and Complex{Float64} by reinterpreting them as UInt64/UInt128 and using integer atomics.

- Uses CAS loops for modify! operations on Complex types
- Adds tests for all atomic operations with complex numbers
- Maintains full compatibility with existing functionality

Fixes JuliaConcurrent#37

Generated with GitHub Copilot
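The reinterpret-and-CAS idea in the commit message above could look roughly like the following for `ComplexF32`. This is a sketch, not the code from the commit: the bit layout and the names `pack`, `unpack`, and `packed_modify!` are assumptions, and it uses `Base.Threads.Atomic{UInt64}` on the CPU only.

```julia
using Base.Threads: Atomic, atomic_cas!, @threads

# Illustrative bit layout (an assumption): real part in the low 32 bits,
# imaginary part in the high 32 bits.
pack(z::ComplexF32) =
    UInt64(reinterpret(UInt32, real(z))) | (UInt64(reinterpret(UInt32, imag(z))) << 32)

unpack(u::UInt64) =
    ComplexF32(reinterpret(Float32, u % UInt32), reinterpret(Float32, (u >> 32) % UInt32))

# Hypothetical helper: apply `op` to the whole complex value with a CAS loop.
function packed_modify!(cell::Atomic{UInt64}, op, x::ComplexF32)
    old = cell[]
    while true
        new = pack(ComplexF32(op(unpack(old), x)))
        seen = atomic_cas!(cell, old, new)   # returns the value found in memory
        seen == old && return unpack(old) => unpack(new)
        old = seen                           # lost the race: retry with the fresh value
    end
end

cell = Atomic{UInt64}(pack(ComplexF32(0, 0)))
@threads for i in 1:1000
    packed_modify!(cell, +, ComplexF32(1, 2))
end
result = unpack(cell[])
```

Since both components live in one 64-bit word, a single compare-and-swap updates them together, so `op` can be any function of the full complex value, not only `+` or `-`.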


vchuravy · 2025-11-16T18:17:26Z

ext/AtomixCUDAExt.jl

+# Note: This is NOT fully atomic (components updated separately)
+# but works for both ComplexF32 and ComplexF64


Oof this is a no-go in my opinion. You will thus easily get torn writes.

I think this would need to use 128-bit atomics

vchuravy · 2025-11-16T18:18:55Z

ext/AtomixCUDAExt.jl

+# Complex atomic operations - separate atomics on real and imaginary parts
+# This works for operations that decompose component-wise (+, -, right)
+# Note: This provides per-component atomicity, not full Complex atomicity
+# (other threads may observe intermediate states, but final result is correct)
+@inline function _cuda_atomic_modify!(ptr::Core.LLVMPtr{Complex{T},A}, op::OP, x::Complex{T}) where {T<:Union{Float32,Float64},A,OP}


Same here, you are not guaranteed that a user is only using one kind of atomic operation on a memory location.

(e.g. someone doing a mul for good measure).

vchuravy · 2025-11-16T18:19:53Z

How does C++ implement them (if at all)?

I think we need to guarantee full atomicity, and thus use a compare and swap loop on the byte value.
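As a rough CPU-side illustration of a compare-and-swap loop over the whole value, the sketch below uses Julia's built-in `@atomic` fields (Julia 1.7 or newer). `AtomicCell` and `cas_modify!` are hypothetical names, not Atomix API, and how Julia implements a 16-byte atomic field (a wide hardware instruction or an internal lock) may depend on the platform.

```julia
# Requires Julia 1.7 or newer for @atomic fields.
mutable struct AtomicCell{T}
    @atomic value::T
end

# Hypothetical helper: retry until the full value is swapped in one step.
function cas_modify!(cell::AtomicCell{T}, op, x::T) where {T}
    old = @atomic cell.value
    while true
        new = op(old, x)
        # Swaps only if the stored value is still identical to `old`.
        res = @atomicreplace cell.value old => new
        res.success && return old => new
        old = res.old   # another thread won the race: retry from its value
    end
end

cell = AtomicCell(ComplexF64(1, 1))
Threads.@threads for i in 1:1000
    if isodd(i)
        cas_modify!(cell, +, ComplexF64(1, 0))
    else
        cas_modify!(cell, -, ComplexF64(1, 0))
    end
end
final = @atomic cell.value
```

Here no thread can ever observe a value whose real part comes from one update and whose imaginary part comes from another, because real and imaginary parts are always replaced together.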

albertomercurio · 2025-11-16T18:53:25Z

I'm not an expert in atomic operations. I just need this for my use case, and with the help of Copilot I wrote something that makes sense to me.

I have tested it with the code from my first comment, and it seems to work. Do you think that test is not enough?

vchuravy · 2025-11-16T20:51:25Z

No, I don't think the test is sufficient. You are right that for some algorithms you might not care about "full atomicity", and partial atomicity might be sufficient.

For me the crux is that so far all operations we currently have in Atomix promise full atomicity.

For a made-up example, imagine an algorithm where every odd lane performs an atomic addition and every even lane performs an atomic multiplication.

That's a weird thing to do, and I don't know of any place where this comes up, but that is beside the point. With this interface we are trying to implement a general set of operations with consistent semantics, and I would much rather get an error than chase down memory semantics bugs later.
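To make the add-versus-multiply concern concrete, the following sketch simulates one possible interleaving sequentially, with no real threads. It assumes the addition is split into two component updates while the multiplication reads and writes the whole value at once. All values are made up for illustration.

```julia
z0     = 1.0 + 1.0im
addend = 1.0 + 1.0im
factor = 0.0 + 1.0im          # multiplication by i

# The two outcomes a fully atomic interface allows.
add_then_mul = (z0 + addend) * factor
mul_then_add = (z0 * factor) + addend

# One possible interleaving when the add is split into two component updates
# and the multiply runs in between them.
step1 = Complex(real(z0) + real(addend), imag(z0))        # adder updates the real part
step2 = step1 * factor                                    # multiplier sees a half-updated value
torn  = Complex(real(step2), imag(step2) + imag(addend))  # adder updates the imaginary part

torn_is_valid = torn == add_then_mul || torn == mul_then_add
```

With these inputs `torn_is_valid` is `false`: the interleaved result matches neither serial order, which is the kind of silent semantic bug that full atomicity rules out.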

albertomercurio force-pushed the add-complex-support branch from 00e703d to 9abbe99 on November 15, 2025 13:13

albertomercurio added 3 commits November 15, 2025 16:04

Add support to ComplexF32 for CUDA

59a7d44

Handle the real and imaginary parts separately

9dc7ce9

Add complex number support for atomic operations in Metal, OpenCL, an…

21df681

…d oneAPI extensions

vchuravy reviewed Nov 16, 2025
