I am currently trying to profile the MPI communication of Trixi.jl, which uses T8code.jl, the Julia bindings to the t8code meshing library. Using `nsys`, I noticed that MPI calls made by t8code are not captured (the same happens with Extrae.jl).
@vchuravy kindly pointed me to #444.
At the moment I am not able to solve this issue on my own, but maybe you have some ideas about what I could try?
Here is how the issue can be reproduced: start with a Julia project, add the MPI and T8code packages, get https://github.com/DLR-AMR/T8code.jl/blob/main/examples/t8_step2_uniform_forest.jl, and run `nsys`:

```
mpiexecjl -n 2 nsys profile --trace=mpi --mpi-impl=mpich julia --project=. t8_step2_uniform_forest.jl
```
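Note that `--mpi-impl=mpich` has to match the MPI implementation MPI.jl actually binds to. A quick way to check this, assuming a recent MPI.jl version that provides `MPI.versioninfo()`:

```julia
# Print which MPI binary/ABI MPI.jl is configured to use (via MPIPreferences);
# the reported implementation should match the --mpi-impl flag passed to nsys.
using MPI
MPI.versioninfo()
```

With the default jll-provided binaries on Linux, MPI.jl uses MPICH, which is consistent with `--mpi-impl=mpich` above.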
The trace contains only `MPI_Init_thread` and `MPI_Finalize`, which are the calls made through MPI.jl (https://github.com/DLR-AMR/T8code.jl/blob/bd7525f9022bc9dbb6fa1f963acfc0fae3a3813e/examples/t8_step2_uniform_forest.jl#L108).
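One thing that might be worth checking is whether the t8code shared library pulls in its own copy of `libmpi`, separate from the one MPI.jl uses; an interposition-based tracer could then miss the calls made from inside t8code. A minimal diagnostic sketch (the `"mpi"` substring filter is just a heuristic):

```julia
# List all shared libraries loaded into the Julia process and keep those
# whose path mentions "mpi"; more than one distinct libmpi would suggest
# that t8code and MPI.jl communicate through different MPI libraries.
using Libdl, MPI, T8code
filter(contains("mpi"), Libdl.dllist())
```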
The same example is available as C++ code as well: https://github.com/DLR-AMR/t8code/blob/main/tutorials/general/t8_step2_uniform_forest.cxx

Running `nsys` on the compiled binary,

```
mpiexec -n 2 nsys profile --trace=mpi --mpi-impl=openmpi ./t8_step2_uniform_forest
```

I additionally see some `MPI_Allreduce` and `MPI_Allgather` calls in the trace.
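As a further data point, one could add a direct MPI.jl collective to the Julia script as a control. A minimal sketch:

```julia
# Control experiment: an MPI_Allreduce issued through MPI.jl itself.
# If this call appears in the nsys trace while t8code's collectives still
# do not, the tracer intercepts the Julia-side libmpi just fine and the
# problem is limited to calls made from within t8code.
using MPI
MPI.Init()
x = MPI.Allreduce(1, +, MPI.COMM_WORLD)
MPI.Finalize()
```

If this `MPI_Allreduce` shows up in the trace, that would narrow the issue down to calls originating inside the t8code library rather than to MPI tracing of Julia processes in general.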