Skip to content

Conversation

@jakobnissen
Copy link

@jakobnissen jakobnissen commented Oct 6, 2025

Fix one use of GC.@preserve and add TODOs where an invalid pointer escapes GC.@preserve.

@quinnj : Retaining a pointer beyond a GC.@preserve is undefined behaviour in Julia, and I've seen at least one case where it causes a real miscompilation.
I've opted not to fix the use of parsestring in this PR since that's a larger refactoring that you're best equipped to do. But I've tried to find all the places in the codebase where it's used.

Fix one use of GC.@preserve and add TODOs where an invalid pointer escapes
GC.@preserve.
@codecov
Copy link

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.86%. Comparing base (3f7d15f) to head (ea0c4d3).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
src/lazy.jl 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #391      +/-   ##
==========================================
- Coverage   89.89%   89.86%   -0.03%     
==========================================
  Files           7        7              
  Lines        1316     1322       +6     
==========================================
+ Hits         1183     1188       +5     
- Misses        133      134       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@quinnj
Copy link
Member

quinnj commented Oct 6, 2025

Thanks for the PR! I'll take a look.

return if codeunit(y) == UInt8
x.len == sizeof(y) || return false
ref = Base.cconvert(Ptr{UInt8}, y)
cmp = GC.@preserve ref ccall(:memcmp, Cint, (Ptr{UInt8}, Ptr{UInt8}, Csize_t), x.ptr, Base.unsafe_convert(Ptr{UInt8}, ref), x.len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual preserve here seems to be exactly equivalent to simply passing y to ccall directly.

@quinnj
Copy link
Member

quinnj commented Oct 7, 2025

In each case of an added #TODO, we're specifically taking pointer(buf, pos) which is stored in a PtrString for a very short amount of time, i.e. a PtrString never outlives the scope in which it was created. In each of these cases, the LazyValue is an argument to the function containing the scope where the PtrString lives, which is an immutable struct with a buf::Vector{UInt8} field. It's my understanding that function arguments are GC-preserved/rooted, and thus ensure the LazyValue and it's buf field will survive until the end of the function body, which in turn would ensure the scope of the PtrString usage is also valid.

In perhaps simpler terms, very early in our parsing entrypoints (JSON.lazy, JSON.parse, etc) we create a JSON.LazyValue with an immutable buf field, which is guaranteed to be valid until the end of the entrypoint call. This ensuring that buf is never modified and always live == PtrString usage is safe, as long as its usage is indeed always temporary to its scope (we convert it earlier than we necessarily need to in some places to ensure it never "escapes").

@yuyichao
Copy link
Contributor

yuyichao commented Oct 7, 2025

It's my understanding that function arguments are GC-preserved/rooted

no this is not the case. (And especially so due to inlining)

@yuyichao
Copy link
Contributor

yuyichao commented Oct 7, 2025

In perhaps simpler terms, very early in our parsing entrypoints (JSON.lazy, JSON.parse, etc) we create a JSON.LazyValue with an immutable buf field, which is guaranteed to be valid until the end of the entrypoint call. This ensuring that buf is never modified and always live == PtrString usage is safe, as long as its usage is indeed always temporary to its scope (we convert it earlier than we necessarily need to in some places to ensure it never "escapes").

otoh, assuming these are true, it is perfectly valid to ensure the liveness of the object you’ve taken pointer from at the top level and rely on that for all the lower level users, if it is more convenient to do so. However, the only way to guarantee liveness locally outside of a ccall is by using gc preserve.

@quinnj
Copy link
Member

quinnj commented Oct 7, 2025

In perhaps simpler terms, very early in our parsing entrypoints (JSON.lazy, JSON.parse, etc) we create a JSON.LazyValue with an immutable buf field, which is guaranteed to be valid until the end of the entrypoint call. This ensuring that buf is never modified and always live == PtrString usage is safe, as long as its usage is indeed always temporary to its scope (we convert it earlier than we necessarily need to in some places to ensure it never "escapes").

otoh, assuming these are true, it is perfectly valid to ensure the liveness of the object you’ve taken pointer from at the top level and rely on that for all the lower level users, if it is more convenient to do so. However, the only way to guarantee liveness locally outside of a ccall is by using gc preserve.

Ok, so you're saying if we do a GC.@preserve buf from our entrypoint functions, that would be a straightfoward way to guarantee liveness for the lifetime of the call. That sounds reasonable to me.

@quinnj
Copy link
Member

quinnj commented Oct 10, 2025

More complete implementation in #395

@quinnj quinnj closed this Oct 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants