Fuseable transactions still need to be fused.
source/SAMRAI/tbox/ExecutionPolicy.h
Outdated
using ReductionPolicy = RAJA::cuda_reduce;

using WorkGroupPolicy = RAJA::WorkGroupPolicy<
   RAJA::cuda_work_async<1024>,
Can you provide a way to set the workgroup block size in case people run into cuda linking issues?
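One possible shape for this is a compile-time override, sketched below. The macro name `SAMRAI_WORKGROUP_BLOCK_SIZE` is hypothetical and not an existing SAMRAI option, and `cuda_work_async_stub` is a stand-in for `RAJA::cuda_work_async` so the sketch compiles without RAJA. The upper bound of 1024 reflects CUDA's current limit on threads per block.

```cpp
#include <cstddef>

// Hypothetical macro, not an existing SAMRAI option. A build could pass
// -DSAMRAI_WORKGROUP_BLOCK_SIZE=256 to override the default of 1024.
#ifndef SAMRAI_WORKGROUP_BLOCK_SIZE
#define SAMRAI_WORKGROUP_BLOCK_SIZE 1024
#endif

static_assert(SAMRAI_WORKGROUP_BLOCK_SIZE > 0 &&
              SAMRAI_WORKGROUP_BLOCK_SIZE <= 1024,
              "workgroup block size must be in the range 1..1024");

// Stand-in for RAJA::cuda_work_async<N>, used only so this sketch
// compiles without RAJA or CUDA.
template <std::size_t BlockSize>
struct cuda_work_async_stub {
   static constexpr std::size_t block_size = BlockSize;
};

// In ExecutionPolicy.h the stub would be replaced by the real RAJA policy.
using WorkGroupExecPolicy =
   cuda_work_async_stub<SAMRAI_WORKGROUP_BLOCK_SIZE>;

// Helper that reports the block size chosen at compile time.
constexpr std::size_t workgroupBlockSize()
{
   return WorkGroupExecPolicy::block_size;
}
```

With no override the policy keeps the 1024 used in the diff above. A smaller value could be chosen at configure time if the larger block size causes problems.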
source/SAMRAI/tbox/Schedule.C
Outdated
}
d_recv_fuser->launch();
#if defined(HAVE_RAJA)
parallel_synchronize();
Is this synchronization between fused and non-fused necessary?
If it is necessary, what if d_recv_sets_fuseable[sender] is empty?
It shouldn't be needed if it's empty and the above loop is a no-op. I think also the launch() call is unnecessary in that case.
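A minimal sketch of that guard follows. `MockFuser` and `SyncCounter` are hypothetical stand-ins for `tbox::KernelFuser` and `parallel_synchronize()`, and the real class may not expose an `empty()` query. The idea is only that both the launch and the synchronization are skipped when nothing was enqueued.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for tbox::KernelFuser; the real interface may differ.
class MockFuser {
public:
   void enqueue(std::function<void()> kernel)
   {
      d_kernels.push_back(std::move(kernel));
   }

   bool empty() const { return d_kernels.empty(); }

   // Runs every enqueued kernel, then clears the queue.
   void launch()
   {
      ++d_launch_count;
      for (auto& kernel : d_kernels) {
         kernel();
      }
      d_kernels.clear();
   }

   int launchCount() const { return d_launch_count; }

private:
   std::vector<std::function<void()>> d_kernels;
   int d_launch_count = 0;
};

// Stand-in for parallel_synchronize(); it only counts how often it is called.
struct SyncCounter {
   int count = 0;
   void synchronize() { ++count; }
};

// Launch and synchronize only if something was enqueued.
// Returns true when a launch actually happened.
inline bool launchIfNeeded(MockFuser& fuser, SyncCounter& sync)
{
   if (fuser.empty()) {
      return false;  // no fuseable transactions: skip launch and sync
   }
   fuser.launch();
   sync.synchronize();
   return true;
}
```

Whether the synchronization is needed at all when the fuser is non-empty is the open question in this thread. The sketch only covers the empty case.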
source/SAMRAI/hier/PatchData.h
Outdated
const PatchData& src,
const BoxOverlap& overlap) = 0;

virtual void
copy(
   const PatchData& src,
   const BoxOverlap& overlap,
   tbox::KernelFuser& fuser);
As it stands, I'll have to implement both if I want to use fusion. I guess I can have a single implementation for both that takes a fuser pointer and uses it or not under an abstraction layer, to keep things single-source. I'll need some macros to maintain support for older versions of SAMRAI, but that was pretty much inevitable.
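The single-source idea might look roughly like the sketch below. Everything in it is hypothetical: `FuserStub` stands in for `tbox::KernelFuser`, `copyData` and `copyImpl` stand in for a user's `copy` overrides, and plain `std::vector` replaces the patch data and overlap arguments. Both overloads forward to one implementation that enqueues the kernel when a fuser is given and runs it immediately otherwise.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for tbox::KernelFuser; the real interface may differ.
class FuserStub {
public:
   void enqueue(std::function<void()> kernel)
   {
      d_kernels.push_back(std::move(kernel));
   }

   // Runs every enqueued kernel, then clears the queue.
   void launch()
   {
      for (auto& kernel : d_kernels) {
         kernel();
      }
      d_kernels.clear();
   }

   std::size_t pending() const { return d_kernels.size(); }

private:
   std::vector<std::function<void()>> d_kernels;
};

// Single implementation shared by both overloads. A null fuser means
// "run now"; otherwise the work is deferred until the fuser launches.
// The caller must keep src and dst alive until that launch.
inline void copyImpl(const std::vector<double>& src,
                     std::vector<double>& dst,
                     FuserStub* fuser)
{
   const std::vector<double>* s = &src;
   std::vector<double>* d = &dst;
   auto kernel = [s, d]() { *d = *s; };
   if (fuser != nullptr) {
      fuser->enqueue(kernel);
   } else {
      kernel();
   }
}

// Non-fused overload.
inline void copyData(const std::vector<double>& src, std::vector<double>& dst)
{
   copyImpl(src, dst, nullptr);
}

// Fused overload.
inline void copyData(const std::vector<double>& src,
                     std::vector<double>& dst,
                     FuserStub& fuser)
{
   copyImpl(src, dst, &fuser);
}
```

In application code, the fused overload could be wrapped in `#ifdef HAVE_KERNEL_FUSER` so the same source still builds against SAMRAI versions that lack the fuser.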
source/SAMRAI/hier/PatchData.h
Outdated
packStream(
   tbox::MessageStream& stream,
   const BoxOverlap& overlap,
   tbox::KernelFuser& fuser);
Yes, looks like this should be const.
Could you add a feature availability macro in your config file so it's easy to check if the fuser exists?
in pdat classes that take KernelFuser.
config/SAMRAI_config.h.cmake.in
Outdated
#ifdef HAVE_RAJA
#define HAVE_KERNEL_FUSER
source/SAMRAI/tbox/Schedule.C
Outdated
}
TBOX_ASSERT(mi->first == send_coms[icom].getPeerRank());
#if defined(HAVE_RAJA)
parallel_synchronize();
and add needed guards for non-cuda builds
have been launched
StagedKernelFusers
launched kernels.