Add support for C26 atomic reductions (without compiler mappings)#985

Open
ThomasHaas wants to merge 9 commits into development from atomic-modify-write

@ThomasHaas
Collaborator

@ThomasHaas ThomasHaas commented Feb 12, 2026

  • I added a header c26.h with the new atomic reduction operations of C26.
  • I implemented all of them, including min and max (only the signed versions for now).
  • There is also some support for C litmus style versions. @hernan-poncedeleon added them and I don't know how well they work right now.

These atomics generate an RMW pair of events, just like a standard fetch_op atomic, but add the Noreturn tag to both of them (the naming follows LKMM's non-returning atomics).
There is no compilation scheme to hardware targets yet, so code has to be verified with --target=c11 (default).

What still needs to be done is to relax the memory models of interest: right now atomic_op and atomic_fetch_op provide the same synchronization semantics. EDIT: Although the memory models should probably be adapted, the fact that we currently model the load part of atomic_store_op as a plain load (not even relaxed) makes it weaker than an atomic_fetch_op in terms of ordering.
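As a rough illustration of the event shape described above, here is a minimal, self-contained sketch. The `Event` record, the string tags, and `lowerReduction` are hypothetical stand-ins and not the tool's actual classes; the sketch only mirrors the idea that the load carries no memory order while both memory events carry the atomic and Noreturn tags.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins for the tool's event classes.
final class ReductionSketch {

    // An event is just a kind plus a set of tags in this sketch.
    record Event(String kind, Set<String> tags) {}

    // Lowers a non-returning reduction such as atomic_store_add(&x, v, mo)
    // into load -> local computation -> store.
    // Assumption: storeMo is a memory-order name such as "RLX", "REL", or "SC".
    static List<Event> lowerReduction(String storeMo) {
        // The load is atomic and non-returning, but has no memory order.
        Event load = new Event("RMWLoad", Set.of("ATOMIC", "NORETURN"));
        // The local event computes the new value; it is not a memory event.
        Event local = new Event("Local", Set.of());
        // The store carries the memory order given by the programmer.
        Event store = new Event("RMWStore", Set.of("ATOMIC", "NORETURN", storeMo));
        return List.of(load, local, store);
    }

    public static void main(String[] args) {
        lowerReduction("REL").forEach(System.out::println);
    }
}
```

With this shape, a memory model can recognize reductions via the Noreturn tag and decide explicitly how much ordering they provide.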

@hernanponcedeleon
Owner

@graymalkin this branch should have everything you need to play around with the model

@ThomasHaas
Collaborator Author

FYI, atomic_store_min/max are always the signed versions for now.

@graymalkin

Thanks, I'll check it out!

@ThomasHaas ThomasHaas changed the title [DRAFT] Add support for C26 atomic reductions Add support for C26 atomic reductions (without compilation) Feb 18, 2026
@ThomasHaas ThomasHaas changed the title Add support for C26 atomic reductions (without compilation) Add support for C26 atomic reductions (without compiler mappings) Feb 18, 2026
@hernanponcedeleon
Owner

Code-wise I think this one is ready to merge. I will wait a few days to see if @graymalkin or @gonzalobg have comments about the memory model part (especially if it makes sense to mark the read part of the reduction as atomic) or @mmalcomson reports any issues when trying the code.

@ThomasHaas
Collaborator Author

With #986 merged, we could in principle add compiler mappings for atomic reductions to armv8. At least the obvious ones like store_add(... RLX) -> STADD and store_add(... REL) -> STADDL. For SC, it would not be so clear.
I cannot imagine that any real C memory model would require the mapping to be stronger than that.
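The two obvious mappings mentioned above could be sketched as follows. `storeAddMnemonic` and the string-based memory orders are hypothetical and only illustrate the proposal; SC is deliberately left unmapped because the right choice is not settled in this discussion.

```java
import java.util.Optional;

// Hypothetical helper, not part of this PR.
final class Armv8ReductionMapping {

    // Returns the armv8 mnemonic proposed for store_add with the given order,
    // or an empty Optional if no mapping has been proposed.
    static Optional<String> storeAddMnemonic(String mo) {
        return switch (mo) {
            case "RLX" -> Optional.of("STADD");   // relaxed reduction
            case "REL" -> Optional.of("STADDL");  // release reduction
            default -> Optional.empty();          // SC and others: open question
        };
    }

    public static void main(String[] args) {
        System.out.println(storeAddMnemonic("RLX"));
        System.out.println(storeAddMnemonic("SC"));
    }
}
```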

@hernanponcedeleon hernanponcedeleon force-pushed the atomic-modify-write branch 2 times, most recently from 7be51db to 1ca78bc Compare February 26, 2026 07:35
Local localOp = newLocal(dummyReg, expressions.makeIntBinary(dummyReg, e.getOperator(), e.getOperand()));
RMWStore store = newRMWStoreWithMo(load, address, dummyReg, Tag.C11.storeMO(mo));

load.addTags(C11.ATOMIC, Tag.C11.NORETURN); // Note that the load has no mo, but is still atomic!
Owner

For consistency with visitAtomicFetchOp I would rather use

Load load = newRMWLoadWithMo(dummyReg, address, Tag.C11.loadMO(mo));

and rather than getting the expected ordering guarantees "by chance" as it currently happens for rc11,
let the model explicitly state if NORETURN events should provide order or not.

It also feels strange to have an atomic event with no memory order.

Collaborator Author

I don't like these consistency arguments... those are different operations. Tag.C11.loadMO(mo) will just be RLX or SC because you cannot specify ACQ/ACQ_REL in the first place.
I think the only really sensible options are: the load has no mo, simply because it shouldn't exist in the first place, or the load has the same mo/tags as the store and the WMM removes the tags.
Anything in between seems arbitrary to me.
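To make the "just RLX or SC" point concrete, here is a small sketch. It assumes that a load-side split behaves like the usual projection of a memory order onto its load part (SC stays SC, ACQ and ACQ_REL become ACQ, everything else becomes RLX); the real Tag.C11.loadMO may differ, and `loadMo` below is a hypothetical stand-in.

```java
// Hypothetical stand-in for a load-side memory-order split.
final class LoadMoSketch {

    // Projects a memory order onto the order used for the load part.
    static String loadMo(String mo) {
        return switch (mo) {
            case "SC" -> "SC";
            case "ACQ", "ACQ_REL" -> "ACQ";
            default -> "RLX"; // RLX and REL both end up relaxed on the load side
        };
    }

    public static void main(String[] args) {
        // Assumption: reductions are store-like and only accept these orders.
        for (String mo : new String[] {"RLX", "REL", "SC"}) {
            System.out.println(mo + " -> " + loadMo(mo));
        }
    }
}
```

Under these assumptions, the load of a reduction would be tagged RLX for both relaxed and release reductions and SC only for SC reductions, which is the asymmetry being questioned.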

Owner

The current solution of hardcoding the atomic tag seems equally arbitrary.

I guess what you are proposing is to completely get rid of Tag.C11.loadMO/storeMO and simply use the mo from the parsing. This would require the memory model to do some "cleanup" as lkmm does, but then we can get rid of these loadMO/storeMO as we already did for lkmm in #893.

Collaborator Author

> The current solution of hardcoding the atomic tag seems equally arbitrary.

The atomic tag is not arbitrary, because the whole operation is an atomic one, even by name atomic_store_XYZ.
And if you look at what our compiler does:

boolean canRace = mo == null || mo.value().equals(C11.NONATOMIC);
e.addTags(canRace ? C11.NONATOMIC : C11.ATOMIC);

then every event must be tagged either way, and NONATOMIC is certainly more wrong than ATOMIC.

> I guess what you are proposing is to completely get rid of Tag.C11.loadMO/storeMO and simply use the mo from the parsing. This would require the memory model to do some "cleanup" as lkmm does, but then we can get rid of these loadMO/storeMO as we already did for lkmm in #893.

I proposed exactly that in #984 or rather suggested it as one possible way to go forward. I think rc11.cat might already adhere to that. That being said, for now, I just took the most natural solution given the current hardcoded one:

  • A load must be generated for data-flow modelling (no way around this)
  • The load must be ignored in data races. Marking it as atomic is natural as it is part of an atomic operation independent of its memory ordering.
  • The load should not provide any orderings -> both plain (no mo) and RLX seem reasonable. Plain is closer to capturing the idea of "the load should not exist" whereas RLX is closer to capturing the idea of "the load exists but it should not give orderings", which is (funnily enough) too much ordering :)

At the end of the day, I'm not the one who writes the C memory models and sets the expectation of what is assumed to happen implicitly and what is assumed to be done in the model.

@ThomasHaas ThomasHaas force-pushed the atomic-modify-write branch from db502d6 to 2a5fa5f Compare March 24, 2026 09:12
};
}

public static String intToMo(int i) {
Owner

Why do we need this back?

Collaborator Author

@ThomasHaas ThomasHaas Mar 24, 2026

It is used in our Intrinsics to get the correct memory ordering for the new atomic reductions from our custom c26 header. I think once LLVM supports those instructions natively, we won't need this anymore.

EDIT: I could move the mapping code into Intrinsics if you prefer.
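For readers unfamiliar with why such a mapping is needed: a custom header typically passes the memory order to the verifier as a plain integer. The sketch below is not the PR's implementation of intToMo. It assumes the integers follow the order in which C11 declares the memory_order enumerators (relaxed first, seq_cst last, numbered from 0), and the returned strings are hypothetical names.

```java
// Hypothetical sketch of an int-to-memory-order mapping.
final class MemoryOrderSketch {

    // Assumption: values 0..5 follow the declaration order of C11's memory_order.
    static String intToMo(int i) {
        return switch (i) {
            case 0 -> "relaxed";
            case 1 -> "consume";
            case 2 -> "acquire";
            case 3 -> "release";
            case 4 -> "acq_rel";
            case 5 -> "seq_cst";
            default -> throw new IllegalArgumentException("Unknown memory order: " + i);
        };
    }

    public static void main(String[] args) {
        System.out.println(intToMo(3));
    }
}
```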

Owner

> I could move the mapping code into Intrinsics if you prefer.

That might be better. Also, please add a TODO so we remember to get rid of this once LLVM supports the instructions.

Collaborator Author

Once LLVM supports those instructions, we will get parser issues anyhow :). The code needs to change, so you cannot forget it really.


4 participants