-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8370176: Mixed mode jhsdb jstack cannot unwind call stack with -Xcomp #27885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into |
|
@YaSuenag This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 6 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
/issue JDK-8370176 |
|
@YaSuenag The primary solved issue for a PR is set through the PR title. Since the current title does not contain an issue reference, it will now be updated. |
Webrevs
|
| * @test | ||
| * @bug 8370176 | ||
| * @requires vm.hasSA | ||
| * @requires os.family == "linux" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do Windows and OSX have a similar problem that should be fixed also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This problem is in mixed mode (PStack) only, thus we need to skip OSX because you mentioned mixed mode is not supported on OSX.
In Windows, I'm not sure, but I guess we need to consider UNWIND_INFO to unwind call frames correctly like DWARF in Linux, however it hasn't done yet. Thus we can think mixed mode is not supported in Windows too, so I didn't add Windows here.
https://learn.microsoft.com/cpp/build/exception-handling-x64
Actually I could not see all of stacks as following in mixed mode. It works in normal mode (without --mixed) of course. (I tested it on Windows 11 x64, upstream JDK built by VS 2022)
----------------- 13 -----------------
"Reference Handler" #15 daemon prio=10 tid=0x00000207280b9f70 nid=12684 waiting on condition [0x000000aaf6aff000]
java.lang.Thread.State: RUNNABLE
JavaThread state: _thread_blocked
0x00007fffa6b45844 ntdll!NtWaitForAlertByThreadId + 0x14
0x00000000ffffffff ????????
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind you that this new test seems to fail even on linux systems without pstack. This is happening on both of my AMD64 machine running Debian 12 and ARM64 machine running Ubuntu 22.04.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share .jtr file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. This is what I got on my amd64 machine:
$ make test TEST="serviceability/sa/TestJhsdbJstackMixedWithXComp.java"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this PR to check glibc version in TestJhsdbJstackMixedWithXComp.java added by this PR. It skips the test on Ubuntu 22.04, OTOH it works on Fedora 43. It is expected.
I attempted to add this check to SATestUtils at first, but it seems to be difficult because native access have to be allowed all of SATestUtils users - the impact is too significant.
I will file another issue to apply this check to other tests of jhsdb jstack --mixed user after this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated testcase again. It is more scalable for other mixed jstack tests. It works fine on both Fedora 43 and Ubuntu 22.04 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still fails on other distros like Debian 13 on AMD64. It has glibc version 2.41. Attached please fine the JTR file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RealFYang Thanks a lot for sharing JTR file!
I could find out the problem, and I fixed. We need to handle RSP more carefully when parsing DWARF. This PR works fine on Fedora 43 and Ubuntu 22. I believe it works on your Debian/Ubuntu.
But I think the failure what you saw on AArch64 is caused by problem(s) on stack unwinder for Linux AArch64, it is different from AMD64. Currently DWARF is supported on Linux AMD64 only, other platforms would attempt to unwind relies on base pointer. It is traditional, but it might not work on DWARF based binaries. I saw DWARF is contained in AArch64 binary in Fedora Rawhide for AArch64 at least. So the test added by this PR might not work on other platforms includes AArch64, RISC-V. Thus I removed them from @requires in the test. I think we can enable them if the unwinder (e.g. LinuxAARCH64CFrame.java) supports DWARF, but it is not a scope of this bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the latest version now works on my Debian 13 AMD64 machine. Thanks for finding it out. It's good to know the difference.
plummercj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. Thanks for fixing this.
|
@YaSuenag : |
|
@RealFYang Thanks a lot for sharing a patch for RISC-V! Merged to this PR. |
|
@plummercj Thanks a lot for your review! I'm trying to fix mixed mode on Windows. I think we can unwind native stacks with this change, but it is not enough for Java frames - I think we can see all of them if we modify after this PR. |
|
@plummercj @RealFYang |
Hi, Sorry, as a co-author I don't think I can review this. BTW: Do you mind listing me as a co-author? Thanks. |
| System.out.println(stdout); | ||
| System.err.println(out.getStderr()); | ||
|
|
||
| out.stderrShouldBeEmptyIgnoreDeprecatedWarnings(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might want you to change this to stderrShouldBeEmptyIgnoreVMWarnings(). We have issues with our internal testing when using -XX:+UseLargePages that sometimes results in a VM warning on stderr that causes this (and some other tests) to fail. We are still deciding on the best approach to fixing this, but I think the solution is most likely to be switching to stderrShouldBeEmptyIgnoreDeprecatedWarnings(). Hopefully I'll know within the next day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will update with the right way. Let me know what should I do when you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the test to use stderrShouldBeEmptyIgnoreVMWarnings(). @plummercj let me know if I should make more changes.
|
/contributor add @RealFYang |
|
@YaSuenag |
| return (nextCFA != null) && | ||
| !nextCFA.lessThan(context.getRegisterAsAddress(AMD64ThreadContext.RSP)); | ||
| private boolean isValidFrame(Address nextCFA, boolean isNative) { | ||
| // CFA should not be null even if it is Java frame. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment means a Java interpreter frame need not have a different CFA/frame pointer?
"native" means we found DWARF. Non-native means Java interpreted or Java compiled.
Java frames just need a non-null CFA.
// CFA should never be null.
// nextCFA must be greater than current CFA, if frame is native.
// Java interpreter frames can share the CFA (frame pointer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. I updated the comment. Thanks!
|
@YaSuenag this pull request can not be integrated into git checkout jhsdb-jstack-bp
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
|
I resolved merge confliction, and it passed all of serviceability/sa tests on Linux AMD64. |
kevinjwalls
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, spent quite a while going over this yesterday and think it's good.
Yes was waiting for the conflict with 8369994, that looks like it merged in OK. 8-)
|
@plummercj Can you approve again this PR? Thanks! |
plummercj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
|
Thanks a lot everyone involved in this PR! /integrate |
|
Going to push as commit 045018d.
Your commit was automatically rebased without conflicts. |
jhsdb jstack --mixedwould not work when attaches to the process runs with-Xcomp.It has been reported by @pchilano in #27728. You can reproduce the problem with Test.java (attached JBS). You can see following stack.
Thread.sleepNanos0is the bottom stack, but actually it has more call frames. You can see them with-XX:+PreserveFramePointer.Java frame might be use the register for frame pointer (
RBPin AMD64) as general purpose register, so SA cannot rely it in stack unwinding.hs_err log has mixed stack trace as "Native frames", it would be unwinded by
NativeStackPrinterin HotSpot, and it works as mixed mode.NativeStackPrinterusesframe::next_frame()to find sender frame regardless whether Java frame or C frame, and it leverages sender FP/PC to create sender frame. On the other hand, SA separates CFrame and VFrame to unwind in mixed mode jstack, so sender FP/PC would not propagate to CFrame, thus the frame located at bottom of Java frame might not be shown.It is difficult to unify unwinder in
PStackin SA, so it would be reasonable to propagate sender FP/PC to the sender of CFrame.Progress
Issue
Reviewers
Contributors
<[email protected]>Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27885/head:pull/27885$ git checkout pull/27885Update a local copy of the PR:
$ git checkout pull/27885$ git pull https://git.openjdk.org/jdk.git pull/27885/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27885View PR using the GUI difftool:
$ git pr show -t 27885Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27885.diff
Using Webrev
Link to Webrev Comment