@YaSuenag
Member

@YaSuenag YaSuenag commented Oct 19, 2025

jhsdb jstack --mixed does not work when it attaches to a process running with -Xcomp.

It was reported by @pchilano in #27728. You can reproduce the problem with Test.java (attached to the JBS issue). You will see the following stack:

----------------- 646689 -----------------
"Thread-0" #24 prio=5 tid=0x00007f1cec18c890 nid=646689 waiting on condition [0x00007f1cd0158000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   JavaThread state: _thread_blocked
0x00007f1cf3b7f462 __syscall_cancel_arch + 0x32
0x00007f1cf3b7375c __internal_syscall_cancel + 0x5c
0x00007f1cf3b766a8 ___pthread_cond_timedwait + 0x178
0x00007f1cf270e1e9 PlatformEvent::park_nanos(long) + 0x119
0x00007f1cf2005f4c JavaThread::sleep_nanos(long) + 0xfc
0x00007f1cf218789f JVM_SleepNanos + 0x28f
0x00007f1cdb95f299 java.lang.Thread.sleepNanos0(long) + 0x99 (Native method)

Thread.sleepNanos0 is shown as the bottom frame, but the stack actually has more call frames. You can see them with -XX:+PreserveFramePointer:

----------------- 646841 -----------------
"Thread-0" #24 prio=5 tid=0x00007f4a0018c9e0 nid=646841 waiting on condition [0x00007f49e4fd7000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
   JavaThread state: _thread_blocked
0x00007f4a0aa29462 __syscall_cancel_arch + 0x32
0x00007f4a0aa1d75c __internal_syscall_cancel + 0x5c
0x00007f4a0aa206a8 ___pthread_cond_timedwait + 0x178
0x00007f4a0970e1e9 PlatformEvent::park_nanos(long) + 0x119
0x00007f4a09005f4c JavaThread::sleep_nanos(long) + 0xfc
0x00007f4a0918789f JVM_SleepNanos + 0x28f
0x00007f49ef961099 java.lang.Thread.sleepNanos0(long) + 0x99 (Native method)
0x00007f49e7f477b4 * java.lang.Thread.sleepNanos(long) bci:33 line:509 (Compiled frame)
0x00007f49e7f41a64 * java.lang.Thread.sleep(long) bci:25 line:540 (Compiled frame)
0x00007f49e7f4037c * Test.run() bci:3 line:6 (Compiled frame)
0x00007f49ef943328 * java.lang.Thread.runWith(java.lang.Object, java.lang.Runnable) bci:5 line:1487 (Compiled frame)
                        * java.lang.Thread.run() bci:19 line:1474 (Compiled frame)
0x00007f49ef3385fd <StubRoutines (initial stubs)>
0x00007f4a08fc247e JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*) + 0x4ce
0x00007f4a08fc2bb3 JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, JavaThread*) + 0x2d3
0x00007f4a08fc31bb JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*) + 0xab
0x00007f4a09185590 thread_entry(JavaThread*, JavaThread*) + 0xd0
0x00007f4a09004206 JavaThread::thread_main_inner() + 0x256
0x00007f4a09c66747 Thread::call_run() + 0xb7
0x00007f4a096fccc8 thread_native_entry(Thread*) + 0x128
0x00007f4a0aa20f54 start_thread + 0x2e4
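
Test.java itself is attached to the JBS issue and is not reproduced in this conversation. As a rough stand-in, assuming from the traces above that it only needs a thread parked in Thread.sleep, something like the following should produce a similar stack. The class name `XcompSleepRepro`, the default duration, and the `start` helper are hypothetical.

```java
// Hypothetical stand-in for the attached Test.java: parks one thread in Thread.sleep.
class XcompSleepRepro implements Runnable {
    private final long millis;

    XcompSleepRepro(long millis) {
        this.millis = millis;
    }

    @Override
    public void run() {
        try {
            // The thread stays here so a debugger can attach and walk its stack.
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts the sleeping thread and returns it so callers can join it.
    static Thread start(long millis) {
        Thread t = new Thread(new XcompSleepRepro(millis));
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // Default of ten minutes is arbitrary; pass a shorter value as the first argument.
        long millis = args.length > 0 ? Long.parseLong(args[0]) : 600_000L;
        System.out.println("pid " + ProcessHandle.current().pid());
        start(millis).join();
    }
}
```

Run it with `java -Xcomp XcompSleepRepro`, then attach with `jhsdb jstack --mixed --pid <pid>` using the printed pid.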

A Java frame may use the frame pointer register (RBP on AMD64) as a general-purpose register, so SA cannot rely on it for stack unwinding.

The hs_err log has a mixed stack trace under "Native frames". It is unwound by NativeStackPrinter in HotSpot, and it works in mixed mode. NativeStackPrinter uses frame::next_frame() to find the sender frame regardless of whether it is a Java frame or a C frame, and it uses the sender FP/PC to create the sender frame. SA, on the other hand, separates CFrame and VFrame when unwinding in mixed mode jstack, so the sender FP/PC is not propagated to CFrame, and the frames located at the bottom of the Java frames might not be shown.

It is difficult to unify the unwinders in PStack in SA, so it is reasonable to propagate the sender FP/PC to the sender of the CFrame.
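
To illustrate the idea of propagating the sender FP/PC, here is a toy model. It is not SA code: `Frame`, `walk`, the addresses, and the `rbpRegister` value are all made up for illustration, and the real CFrame/VFrame classes are considerably more involved. The only point it tries to show is that continuing from the RBP register after a Java frame can lose the rest of the stack, while continuing from the sender FP that the Java-aware walker already computed does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only; names and addresses are hypothetical and unrelated to SA's classes.
class SenderPropagationSketch {
    static final class Frame {
        final String name;
        final boolean isJava;
        final long senderFp; // frame pointer of the caller, as a correct walker would compute it

        Frame(String name, boolean isJava, long senderFp) {
            this.name = name;
            this.isJava = isJava;
            this.senderFp = senderFp;
        }
    }

    // Walks frames keyed by frame pointer. When stepping from a Java frame to a native
    // sender, either reuse the computed sender FP (propagateSender) or fall back to the
    // RBP register, which compiled Java code may have used as a general-purpose register.
    static List<String> walk(Map<Long, Frame> stack, long topFp, long rbpRegister,
                             boolean propagateSender) {
        List<String> names = new ArrayList<>();
        long fp = topFp;
        while (stack.containsKey(fp)) {
            Frame f = stack.get(fp);
            names.add(f.name);
            Frame sender = stack.get(f.senderFp);
            boolean crossingToNative = f.isJava && sender != null && !sender.isJava;
            fp = (crossingToNative && !propagateSender) ? rbpRegister : f.senderFp;
        }
        return names;
    }

    // Builds a five-frame stack loosely shaped like the traces in this PR.
    static List<String> demo(boolean propagateSender) {
        Map<Long, Frame> stack = Map.of(
            0x10L, new Frame("JVM_SleepNanos", false, 0x20L),
            0x20L, new Frame("Thread.sleep", true, 0x30L),
            0x30L, new Frame("Test.run", true, 0x40L),
            0x40L, new Frame("JavaCalls::call_helper", false, 0x50L),
            0x50L, new Frame("start_thread", false, 0L));
        long clobberedRbp = 0xdeadL; // not a valid frame pointer in this model
        return walk(stack, 0x10L, clobberedRbp, propagateSender);
    }
}
```

In this model `demo(false)` stops after `Test.run`, and `demo(true)` continues through `start_thread`.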


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8370176: Mixed mode jhsdb jstack cannot unwind call stack with -Xcomp (Bug - P4)

Reviewers

Contributors

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27885/head:pull/27885
$ git checkout pull/27885

Update a local copy of the PR:
$ git checkout pull/27885
$ git pull https://git.openjdk.org/jdk.git pull/27885/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27885

View PR using the GUI difftool:
$ git pr show -t 27885

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27885.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 19, 2025

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 19, 2025

@YaSuenag This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8370176: Mixed mode jhsdb jstack cannot unwind call stack with -Xcomp

Co-authored-by: Fei Yang <[email protected]>
Reviewed-by: cjplummer, kevinw

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Oct 19, 2025

@YaSuenag The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@YaSuenag
Member Author

/issue JDK-8370176

@openjdk openjdk bot changed the title Mixed mode jhsdb jstack cannot unwind call stack with -Xcomp 8370176: Mixed mode jhsdb jstack cannot unwind call stack with -Xcomp Oct 19, 2025
@openjdk

openjdk bot commented Oct 19, 2025

@YaSuenag The primary solved issue for a PR is set through the PR title. Since the current title does not contain an issue reference, it will now be updated.

@YaSuenag YaSuenag marked this pull request as ready for review October 19, 2025 06:59
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 19, 2025
@mlbridge

mlbridge bot commented Oct 19, 2025

* @test
* @bug 8370176
* @requires vm.hasSA
* @requires os.family == "linux"

Contributor

Do Windows and OSX have a similar problem that should be fixed also?

Member Author

@YaSuenag YaSuenag Oct 21, 2025

This problem is in mixed mode (PStack) only, so we need to skip OSX because, as you mentioned, mixed mode is not supported on OSX.

On Windows, I'm not sure, but I guess we need to consider UNWIND_INFO to unwind call frames correctly, like DWARF on Linux. That has not been done yet, so we can consider mixed mode unsupported on Windows too, which is why I didn't add Windows here.
https://learn.microsoft.com/cpp/build/exception-handling-x64

Actually, I could not see all of the stack frames in mixed mode, as shown below. It works in normal mode (without --mixed), of course. (I tested it on Windows 11 x64 with upstream JDK built by VS 2022.)

----------------- 13 -----------------
"Reference Handler" #15 daemon prio=10 tid=0x00000207280b9f70 nid=12684 waiting on condition [0x000000aaf6aff000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_blocked
0x00007fffa6b45844      ntdll!NtWaitForAlertByThreadId + 0x14
0x00000000ffffffff              ????????

Member

@RealFYang RealFYang Oct 23, 2025

Mind you that this new test seems to fail even on linux systems without pstack. This is happening on both of my AMD64 machine running Debian 12 and ARM64 machine running Ubuntu 22.04.4.

Member Author

Can you share .jtr file?

Member

@RealFYang RealFYang Oct 23, 2025

Sure. This is what I got on my amd64 machine:

$ make test TEST="serviceability/sa/TestJhsdbJstackMixedWithXComp.java"

TestJhsdbJstackMixedWithXComp.jtr.txt

Member Author

I updated this PR to check the glibc version in TestJhsdbJstackMixedWithXComp.java, which is added by this PR. It skips the test on Ubuntu 22.04; OTOH it works on Fedora 43. This is expected.
I attempted to add this check to SATestUtils at first, but that seems difficult because native access would have to be allowed for all SATestUtils users, and the impact is too significant.

I will file another issue to apply this check to the other tests that use jhsdb jstack --mixed after this PR.
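
As a rough illustration of the comparison half of such a check, here is a small helper that decides whether a glibc version string meets a minimum. This is only a sketch: the class and method names are hypothetical, the minimum version is a parameter because the threshold used by the test is not stated in this thread, and obtaining the version string (the part that needs native access) is left out.

```java
// Hypothetical helper; not taken from TestJhsdbJstackMixedWithXComp.java.
class GlibcVersionSketch {
    // Returns true when a "major.minor" version string is at least minMajor.minMinor.
    static boolean isAtLeast(String version, int minMajor, int minMinor) {
        String[] parts = version.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        // Treat a missing minor component as 0.
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major > minMajor || (major == minMajor && minor >= minMinor);
    }
}
```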

Member Author

I updated the testcase again. It is more scalable for other mixed jstack tests. It works fine on both Fedora 43 and Ubuntu 22.04.

Member

Still fails on other distros like Debian 13 on AMD64. It has glibc version 2.41. Attached please find the JTR file.

TestJhsdbJstackMixedWithXComp.jtr.txt

Member Author

@RealFYang Thanks a lot for sharing JTR file!

I found the problem and fixed it. We need to handle RSP more carefully when parsing DWARF. This PR works fine on Fedora 43 and Ubuntu 22. I believe it works on your Debian/Ubuntu.

But I think the failure you saw on AArch64 is caused by problem(s) in the stack unwinder for Linux AArch64, which is different from AMD64. Currently DWARF is supported on Linux AMD64 only; other platforms attempt to unwind relying on the base pointer. That is the traditional approach, but it might not work on DWARF-based binaries. I saw that DWARF is contained in the AArch64 binary in Fedora Rawhide for AArch64, at least. So the test added by this PR might not work on other platforms, including AArch64 and RISC-V. Thus I removed them from @requires in the test. I think we can enable them once the unwinder (e.g. LinuxAARCH64CFrame.java) supports DWARF, but that is out of scope for this bug.

Member

Yes, the latest version now works on my Debian 13 AMD64 machine. Thanks for finding it out. It's good to know the difference.

Contributor

@plummercj plummercj left a comment

Changes look good. Thanks for fixing this.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 22, 2025
@RealFYang
Member

@YaSuenag :
Hi, I tried the test on linux-riscv64 and it seems this platform has the same issue.
Would you mind adding this add-on fix for this platform please? Thanks.
8370176-riscv64.diff.txt

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Oct 23, 2025
@YaSuenag
Member Author

@RealFYang Thanks a lot for sharing a patch for RISC-V! Merged to this PR.

@YaSuenag
Member Author

@plummercj Thanks a lot for your review!

I'm trying to fix mixed mode on Windows. I think we can unwind native stacks with this change, but it is not enough for Java frames. I think we can see all of them with further changes after this PR.

----------------- 13 -----------------
"Reference Handler" #15 daemon prio=10 tid=0x000001a8df240270 nid=4800 waiting on condition [0x0000002207fff000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_blocked
0x00007fff725a5844      ntdll!NtWaitForAlertByThreadId + 0x14
0x00007fff7244c30b      ntdll!RtlSleepConditionVariableCS + 0x14b
0x00007fff6f6ca688      KERNELBASE!SleepConditionVariableCS + 0x38
0x00007ffed580c92d      jvm!PlatformMonitor::wait + 0x3d
0x00007ffed5796f3f      jvm!Monitor::wait + 0x15f
0x00007ffed5448310      jvm!JVM_WaitForReferencePendingList + 0xb0
0x000001a8ce87fa18      <interpreter> native method entry point (kind = native)
0x000001a8df240710              ????????
0x0000002207fffa38              ????????
0x000001a8df240270              ????????
0x000001a8e000bd98              ????????
0x000001a8ce87f39b      <interpreter> native method entry point (kind = native)
0xfffffffffffffff7              ????????
0x000000003e871a28              ????????
0x0000000000000003              ????????
0x000000003e2054f8              ????????

@YaSuenag
Member Author

@plummercj @RealFYang
Could you review/approve this PR again?

@RealFYang
Member

> @plummercj @RealFYang Could you review/approve this PR again?

Hi, Sorry, as a co-author I don't think I can review this. BTW: Do you mind listing me as a co-author? Thanks.

System.out.println(stdout);
System.err.println(out.getStderr());

out.stderrShouldBeEmptyIgnoreDeprecatedWarnings();

Contributor

I might want you to change this to stderrShouldBeEmptyIgnoreVMWarnings(). We have issues with our internal testing when using -XX:+UseLargePages that sometimes results in a VM warning on stderr that causes this (and some other tests) to fail. We are still deciding on the best approach to fixing this, but I think the solution is most likely to be switching to stderrShouldBeEmptyIgnoreVMWarnings(). Hopefully I'll know within the next day.

Member Author

OK, I will update it the right way. Let me know what I should do once you know.

Member Author

I updated the test to use stderrShouldBeEmptyIgnoreVMWarnings(). @plummercj let me know if I should make more changes.
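
The difference between the two checks discussed in this thread can be sketched as follows. This is a standalone approximation, not the jdk.test.lib OutputAnalyzer implementation: the class name, the `isEmptyIgnoring` helper, and both regular expressions are assumptions modeled on the method names. The idea is that ignoring only deprecation warnings still fails on any other VM warning, while ignoring all VM warnings does not.

```java
import java.util.regex.Pattern;

// Standalone approximation; patterns and names are assumptions, not the test library's code.
class StderrFilterSketch {
    // Assumed shape of any HotSpot warning line.
    static final Pattern ANY_VM_WARNING = Pattern.compile(".* VM warning:.*");
    // Assumed shape of a deprecation warning line, a subset of the above.
    static final Pattern DEPRECATED_WARNING = Pattern.compile(".* VM warning:.* deprecated.*");

    // True when every non-blank stderr line matches the ignorable pattern.
    static boolean isEmptyIgnoring(String stderr, Pattern ignorable) {
        return stderr.lines()
                     .filter(line -> !line.isBlank())
                     .allMatch(line -> ignorable.matcher(line).matches());
    }
}
```

With this model, a stderr that contains only a large-pages warning passes `isEmptyIgnoring(stderr, ANY_VM_WARNING)` but fails `isEmptyIgnoring(stderr, DEPRECATED_WARNING)`, which mirrors the failure mode described above.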

@YaSuenag
Member Author

/contributor add @RealFYang

@openjdk

openjdk bot commented Oct 29, 2025

@YaSuenag
Contributor Fei Yang <[email protected]> successfully added.

return (nextCFA != null) &&
!nextCFA.lessThan(context.getRegisterAsAddress(AMD64ThreadContext.RSP));
private boolean isValidFrame(Address nextCFA, boolean isNative) {
// CFA should not be null even if it is Java frame.

Contributor

This comment means a Java interpreter frame need not have a different CFA/frame pointer?

"native" means we found DWARF. Non-native means Java interpreted or Java compiled.
Java frames just need a non-null CFA.

// CFA should never be null.
// nextCFA must be greater than current CFA, if frame is native.
// Java interpreter frames can share the CFA (frame pointer).

Member Author

Exactly. I updated the comment. Thanks!
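
For readers following this thread, the rule agreed on above can be written as a small standalone check. This is a sketch using plain `long` addresses rather than SA's `Address` type; the class name and the `currentCfa` parameter are assumptions, and it is not the code from this PR.

```java
// Sketch of the rule from the review comment; not the PR's implementation.
class CfaCheckSketch {
    // nextCfa is null when no CFA could be computed for the sender.
    static boolean isValidFrame(Long nextCfa, long currentCfa, boolean isNative) {
        // CFA should never be null, even for Java frames.
        if (nextCfa == null) {
            return false;
        }
        // For native frames, the sender's CFA must be strictly above the current one.
        if (isNative) {
            return Long.compareUnsigned(nextCfa, currentCfa) > 0;
        }
        // Java frames only need a non-null CFA; interpreter frames may share it.
        return true;
    }
}
```

Addresses are compared as unsigned values because stack addresses are not meaningfully signed.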

@openjdk

openjdk bot commented Nov 1, 2025

@YaSuenag this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout jhsdb-jstack-bp
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 1, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 1, 2025
@YaSuenag
Member Author

YaSuenag commented Nov 1, 2025

I resolved the merge conflict, and it passed all serviceability/sa tests on Linux AMD64.

Contributor

@kevinjwalls kevinjwalls left a comment

Thanks, spent quite a while going over this yesterday and think it's good.
Yes was waiting for the conflict with 8369994, that looks like it merged in OK. 8-)

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 1, 2025
@YaSuenag
Member Author

YaSuenag commented Nov 2, 2025

@plummercj Can you approve this PR again? Thanks!

Contributor

@plummercj plummercj left a comment

Looks good.

@YaSuenag
Member Author

YaSuenag commented Nov 3, 2025

Thanks a lot to everyone involved in this PR!

/integrate

@openjdk

openjdk bot commented Nov 3, 2025

Going to push as commit 045018d.
Since your change was applied there have been 20 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 3, 2025
@openjdk openjdk bot closed this Nov 3, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 3, 2025
@openjdk

openjdk bot commented Nov 3, 2025

@YaSuenag Pushed as commit 045018d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@YaSuenag YaSuenag deleted the jhsdb-jstack-bp branch November 3, 2025 14:26
@YaSuenag YaSuenag restored the jhsdb-jstack-bp branch November 5, 2025 08:11
Labels

integrated Pull request has been integrated serviceability [email protected]

4 participants