-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8366990: C2: Compilation hits the memory limit when verifying loop opts in Split-If code #27731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
👋 Welcome back bmaillard! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@benoitmaillard The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice summary and solution! I have a few comments but otherwise, the fix looks good to me.
I guess it's a discussion for another time if we also want to improve the verification time somehow. But that should not block this PR.
test/hotspot/jtreg/compiler/loopopts/TestVerifyLoopOptimizationsHitsMemLimit.java
Outdated
Show resolved
Hide resolved
| * -XX:CompileCommand=compileonly,compiler.loopopts.TestVerifyLoopOptimizationsHitsMemLimit::test | ||
| * -XX:-TieredCompilation -Xcomp -XX:CompileCommand=dontinline,*::* | ||
| * -XX:+StressLoopPeeling -XX:PerMethodTrapLimit=0 -XX:+VerifyLoopOptimizations | ||
| * -XX:StressSeed=1870557292 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to remove the stress seed since it might not trigger anymore in later builds. Usually, we add a run with a fixed stress seed and one without but since this test requires to do just some verification work, I would suggest to not add two runs but only one without fixed seed.
Another question: How close are we to hit the default the memory limit with this test? With your fix it probably consumes not much memory anymore. I therefore suggest to add MemLimit as additional flag with a much smaller value to be sure that your fix works as expected (you might need to check how low we can choose the limit to not run into problems in higher tiers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to reduce the test further using a memory limit of 100M (approximately 10 times less than the default) and a shorter timeout with creduce. Compilation of the new test method with a fast debug build now takes an average of 1.22 s over 100 runs according to -XX:+CITime.
With the decrease compilation time I think it now reasonable to have two runs (one with the stress seed, one without). Let me know if you think otherwise!
test/hotspot/jtreg/compiler/loopopts/TestVerifyLoopOptimizationsHitsMemLimit.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Christian Hagedorn <[email protected]>
Co-authored-by: Damon Fenacci <[email protected]>
Co-authored-by: Christian Hagedorn <[email protected]>
|
I have made the following (significant) changes that are ready for review:
|
This PR prevents the C2 compiler from hitting memory limits during compilation when using
-XX:+StressLoopPeelingand-XX:+VerifyLoopOptimizationsin certain edge cases. The fix addresses an issue where theciEnvarena grows uncontrollably due to the high number of verification passes, a complex IR graph, and repeated field accesses leading to unnecessary memory allocations.Analysis
This issue was initially detected with the fuzzer. The original test from the fuzzer was reduced
and added to this PR as a regression test.
The test contains a switch inside a loop, and stressing the loop peeling results in
a fairly complex graph. The split-if optimization is applied agressively, and we
run a verification pass at every progress made.
We end up with a relatively high number of verification passes, with each pass being
fairly expensive because of the size of the graph.
Each verification pass requires building a new
IdealLoopTree. This is quite slow(which is unfortunately hard to mitigate), and also causes inefficient memory usage
on the
ciEnvarena.The inefficient usages are caused by the
ciInstanceKlass::get_field_by_offsetmethod.At every call, we have
ciEnvarena to store the returnedciFieldciFieldresults in a call tociObjectFactory::get_symbol, which:ciSymbolon theciEnvarena at every call (when not found invmSymbols)_symbolsarrayThe
ciEnvobjects returned byciInstanceKlass::get_field_by_offsetare only used once, tocheck if the
BasicTypeof a static field is a reference type.In
ciObjectFactory, the_symbolsarray ends up containg a large number of duplicates for certain symbols(up to several millions), which hints at the fact that
ciObjectFactory::get_symbolshould not be calledrepeatedly as it is done here.
The stack trace of how we get to the
ciInstanceKlass::get_field_by_offsetis shown below:Because the
ciEnvarena is not fred up between verification passes, it quickly fills up and hitsthe memory limit after about 30s of execution in this case.
Proposed fix
As explained in the previous section, the only point of the
ciInstanceKlass::get_field_by_offsetcall is to obtain the
BasicTypeof the field. By inspecting carefully what this method does,we notice that the field descriptor
fdalready contains the type information we need.We do not actually need all the information embedded in the
ciFieldobject.Hence we can simply create a more specialized version of
ciInstanceKlass::get_field_type_by_offsetthat directly returns the
BasicTypewithout creating theciField. This happens toavoid the three memory allocations mentioned before.
After this change, the memory usage of the
ciEnvarena stays constant across verificationpasses.
Testing
Thank you for reviewing!
Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27731/head:pull/27731$ git checkout pull/27731Update a local copy of the PR:
$ git checkout pull/27731$ git pull https://git.openjdk.org/jdk.git pull/27731/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27731View PR using the GUI difftool:
$ git pr show -t 27731Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27731.diff
Using Webrev
Link to Webrev Comment