Skip to content

Conversation

@tmc
Copy link
Contributor

@tmc tmc commented Oct 30, 2025

What issue is this addressing?

Fixes ARM64 struct argument corruption when field alignment causes register boundary crossing.

What type of issue is this addressing?

bug

What this PR does | solves

Fixes #359

This fixes a critical bug in ARM64 struct argument passing where field values were being corrupted due to improper state management during register flushes.

Problem

When passing structs with multiple small integer fields (e.g., FourInt32s{1, 2, 3, 4}), field values could be corrupted when struct packing crossed register boundaries. For example, the value 4 was being corrupted to 6.

Root Cause

In struct_arm64.go, when field alignment caused shift >= 64, the code flushed the current register but failed to reset the accumulator (val) and classification (class) state. This caused subsequent fields to OR with stale data from previous fields.

Solution

Added two lines in struct_arm64.go:120-121 to reset state after register flush:

val = 0           // Reset register value
class = _NO_CLASS // Reset classification

Testing

  • Added FourInt32s test case that reproduces the bug
  • All tests pass on Darwin ARM64, Darwin AMD64, Linux ARM64, and Linux AMD64

Adds test coverage for struct arguments containing four int32 fields to
validate proper struct packing and argument marshalling. The test ensures
that all four int32 values are correctly passed to the C function and can
be retrieved and formatted as expected.

This test case helps verify the Darwin ARM64 struct packing improvements
handle multiple small integer fields within a single struct correctly.
@TotallyGamerJet
Copy link
Collaborator

Let’s separate the fix from the adding Linux struct support. That way it’s better for cherry-picking

Fix a critical bug in ARM64 struct argument packing where the register
value (val) and class were not being reset after flushing due to
alignment requirements.

When packing struct fields into registers, if a field's alignment
causes shift >= 64, the current register is flushed. However, the
code was not resetting 'val' and 'class' after the flush, causing
subsequent fields to be ORed with stale data from previous fields.

Example bug with FourInt32s{1, 2, 3, 4}:
- Fields 0,1 packed: val = 0x0000000200000001
- Flush at field 2 due to shift >= 64
- BUG: val still contains 0x0000000200000001
- Field 2 packs: val |= 3 becomes 0x0000000200000003 (should be 0x03)
- Field 3 packs: val |= (4<<32) becomes 0x0000000400000003
- Result: field 3 = 6 instead of 4 (bit 1 from field 1 leaked)

This fix ensures val and class are properly reset after each flush,
preventing data corruption between register boundaries.

Fixes the FourInt32s test case that was failing with f3=6 instead of f3=4.
@tmc tmc changed the title Enable linux structs, fix arm64 corruption Fix arm64 argument corruption on alignment flush Oct 30, 2025
@tmc tmc force-pushed the enable-linux-structs-fix-arm64-corruption branch from d42a173 to e116003 Compare October 30, 2025 20:38
@tmc
Copy link
Contributor Author

tmc commented Oct 30, 2025

Let’s separate the fix from the adding Linux struct support. That way it’s better for cherry-picking

done, opened #361 separately.

#include <stdlib.h>

char* FourInt32s(struct FourInt32s s) {
char* result = malloc(64);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care to free this on the go side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like for tests this isn't really needed (and keeps the go side simple).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to be OK to not care about memory leaks in the tests, but as we have already done by passing buffers from the Go side in other tests, what about following the same way for consistency?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't care then it doesn't matter to me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we could just or all the numbers instead and use arguments that all have a different bit so 1,2,4,8 then no string needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, getting results by integer pointers is better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did it this way because it's easier to extend to a variety of types across the call boundary using the same pattern.

@hajimehoshi
Copy link
Member

Write an issue number this PR works on.

};

#include <stdio.h>
#include <stdlib.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include should be at the top

#include <stdlib.h>

char* FourInt32s(struct FourInt32s s) {
char* result = malloc(64);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to be OK to not care about memory leaks in the tests, but as we have already done by passing buffers from the Go side in other tests, what about following the same way for consistency?

@tmc
Copy link
Contributor Author

tmc commented Oct 31, 2025

could we go the other way and use this (IMO) simpler pattern in the other places?

@hajimehoshi
Copy link
Member

could we go the other way and use this (IMO) simpler pattern in the other places?

Are you talking about memory allocation? Please reply this to the comment...

(IMO we should use just pointer values for integers btw. If strings were involed, we could consider a simpler way)

@tmc
Copy link
Contributor Author

tmc commented Oct 31, 2025

could we go the other way and use this (IMO) simpler pattern in the other places?

Are you talking about memory allocation? Please reply this to the comment...

github ui isn't/wasn't allowing an in-line response for some reason, this was referring to the buffer return suggestion.

(IMO we should use just pointer values for integers btw.

I'm open to whatever, was just thinking ahead about how to have common test approaches for a variety of argument structures passed across the boundary, I figured comparing string reprs between the two would provide an easy to implement and easy to compare pattern.

@tmc tmc force-pushed the enable-linux-structs-fix-arm64-corruption branch from 1f29d17 to a859792 Compare October 31, 2025 16:08
Simplify the FourInt32s test by changing it from string formatting to
simple arithmetic sum, making it easier to verify correctness and debug
potential issues.

Update Array4Chars test to use simpler positive values (1,2,4,8) instead
of mixed positive/negative values, improving test clarity while maintaining
the same validation coverage.
Update FourInt32s test case to use a mix of positive and negative values
(100, -50, 25, -75) instead of simple sequential positive values. This
provides better test coverage by exercising the struct packing logic with
a wider range of integer values, including negative numbers, while
maintaining the same arithmetic validation approach.

The enhanced test values help ensure the ARM64 register corruption fix
works correctly across different value ranges and sign combinations.
Copy link
Member

@hajimehoshi hajimehoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TotallyGamerJet Please take a look

Copy link
Collaborator

@TotallyGamerJet TotallyGamerJet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks

@hajimehoshi hajimehoshi merged commit 25e25ad into ebitengine:main Oct 31, 2025
41 checks passed
@tmc tmc deleted the enable-linux-structs-fix-arm64-corruption branch October 31, 2025 18:10
hajimehoshi pushed a commit that referenced this pull request Oct 31, 2025
Fix a critical bug in ARM64 struct argument packing where the register
value (val) and class were not being reset after flushing due to
alignment requirements.

When packing struct fields into registers, if a field's alignment
causes shift >= 64, the current register is flushed. However, the
code was not resetting 'val' and 'class' after the flush, causing
subsequent fields to be ORed with stale data from previous fields.

Example bug with FourInt32s{1, 2, 3, 4}:
- Fields 0,1 packed: val = 0x0000000200000001
- Flush at field 2 due to shift >= 64
- BUG: val still contains 0x0000000200000001
- Field 2 packs: val |= 3 becomes 0x0000000200000003 (should be 0x03)
- Field 3 packs: val |= (4<<32) becomes 0x0000000400000003
- Result: field 3 = 6 instead of 4 (bit 1 from field 1 leaked)

This fix ensures val and class are properly reset after each flush,
preventing data corruption between register boundaries.

Closes #359
tmc added a commit to tmc/purego that referenced this pull request Nov 3, 2025
)

Fix a critical bug in ARM64 struct argument packing where the register
value (val) and class were not being reset after flushing due to
alignment requirements.

When packing struct fields into registers, if a field's alignment
causes shift >= 64, the current register is flushed. However, the
code was not resetting 'val' and 'class' after the flush, causing
subsequent fields to be ORed with stale data from previous fields.

Example bug with FourInt32s{1, 2, 3, 4}:
- Fields 0,1 packed: val = 0x0000000200000001
- Flush at field 2 due to shift >= 64
- BUG: val still contains 0x0000000200000001
- Field 2 packs: val |= 3 becomes 0x0000000200000003 (should be 0x03)
- Field 3 packs: val |= (4<<32) becomes 0x0000000400000003
- Result: field 3 = 6 instead of 4 (bit 1 from field 1 leaked)

This fix ensures val and class are properly reset after each flush,
preventing data corruption between register boundaries.

Closes ebitengine#359
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

darwin/arm64 struct corruption for homogeneous int32 structs

3 participants