AoS / SoA Copy Benchmarks, main branch (2024.09.27.) #297
Open
krasznaa wants to merge 4 commits into acts-project:main
So that it would be easier to set up the CUDA, HIP and SYCL tests as a next step.
Simply copying the current CUDA benchmark code, with all its imperfections.


While working on acts-project/traccc#712 yesterday, I was surprised to see how expensive it apparently is to copy a few megabytes of cell information from one host location to another. What I saw in Nsight Systems was that doing a `vecmem::edm::host` -> `vecmem::edm::buffer` host-to-host copy was very comparable in cost to then doing a `vecmem::edm::buffer` -> `vecmem::edm::buffer` host-to-device copy.

As a reminder, the host-to-host step would seem to be useful for copying the entire payload of an SoA container in one step, instead of copying its payload column-by-column.
But as it turns out, the overhead of copying a cell collection in 5 steps instead of one (a traccc cell has only 5 variables) is negligible compared to how long it takes to copy a few megabytes from one place to another in host memory. 😕
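To make the two copy strategies concrete, here is a minimal sketch in plain C++. It does not use the vecmem API. The `SoaCells` type, its all-`float` columns and the two copy helpers are hypothetical stand-ins, assuming a container whose 5 columns sit back to back in one allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical 5-variable cell layout. Using float for every column is a
// simplifying assumption, not the actual traccc cell definition.
constexpr std::size_t kColumns = 5;

// SoA container whose columns all live in one contiguous allocation, so it
// can be copied either column-by-column or in a single step.
struct SoaCells {
    std::size_t n = 0;
    std::vector<float> payload;  // kColumns * n values, one column after another

    explicit SoaCells(std::size_t size) : n(size), payload(kColumns * size) {}

    float* column(std::size_t c) { return payload.data() + c * n; }
    const float* column(std::size_t c) const { return payload.data() + c * n; }
};

// Copy the container with one memcpy call per column (kColumns calls).
inline void copy_by_column(const SoaCells& src, SoaCells& dst) {
    assert(src.n == dst.n);
    if (src.n == 0) return;
    for (std::size_t c = 0; c < kColumns; ++c) {
        std::memcpy(dst.column(c), src.column(c), src.n * sizeof(float));
    }
}

// Copy the whole payload with a single memcpy call.
inline void copy_in_one_step(const SoaCells& src, SoaCells& dst) {
    assert(src.n == dst.n);
    if (src.n == 0) return;
    std::memcpy(dst.payload.data(), src.payload.data(),
                src.payload.size() * sizeof(float));
}
```

Both helpers move the same number of bytes. The only difference is the number of calls, which is the overhead that turned out to be negligible here.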
So in this PR I want to see exactly how copying the same sort of EDM compares between its AoS and SoA forms. Right now, with only the host copies existing, I get:
Many aspects of these results I believe I understand. But I'm really not sure why the copy speed drops as it does for large sizes. 😕
In any case, I plan to continue the investigation...
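
For anyone wanting to reproduce the comparison outside of this PR, a rough sketch of a host-only AoS vs. SoA copy timing could look like the following. This is not the benchmark code in this PR. It uses plain `std::vector` and `std::chrono` instead of vecmem, and the `CellAoS` / `CellsSoA` types, their field names and the helper functions are all hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical 5-variable element; field names and types are illustrative.
struct CellAoS {
    unsigned int channel0;
    unsigned int channel1;
    float activation;
    float time;
    unsigned int module_index;
};

// The same 5 variables stored as separate columns.
struct CellsSoA {
    std::vector<unsigned int> channel0, channel1;
    std::vector<float> activation, time;
    std::vector<unsigned int> module_index;

    explicit CellsSoA(std::size_t n)
        : channel0(n), channel1(n), activation(n), time(n), module_index(n) {}
};

// Run a callable `repetitions` times and return the average seconds per call.
template <typename F>
double seconds_per_call(F&& f, int repetitions) {
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) {
        f();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repetitions;
}

// Average time to copy n AoS elements from one host vector to another.
inline double time_aos_copy(std::size_t n, int repetitions) {
    std::vector<CellAoS> src(n), dst(n);
    return seconds_per_call([&] { dst = src; }, repetitions);
}

// Average time to copy n SoA elements, which copies the 5 columns in turn.
inline double time_soa_copy(std::size_t n, int repetitions) {
    CellsSoA src(n), dst(n);
    return seconds_per_call([&] { dst = src; }, repetitions);
}
```

Calling `time_aos_copy(n, reps)` and `time_soa_copy(n, reps)` for `n` spanning several orders of magnitude, and dividing the bytes copied by the returned time, gives a copy speed per container size. A real measurement would also need to keep the compiler from optimizing the copies away, which this sketch does not attempt.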