Note: this appears to be slower than the default ordering. It will remain a draft PR until we figure out why.
The reason it was slower is because the

So, using this command as an example:

This is the size and modes we get for spatial scaling to 4 nodes on Perlmutter. The ordering parameter tells it to do x, y, z first, then repartition (R2), and then do time. But for this ordering:
During my earlier testing, I wasn't thinking of R1 and R4 at all, and R2 and R3 are small compared to those. The ordering parameters work, but the input partitioning needs to be adjusted at the same time.
Allow specifying the order that the FFTs will be performed in. This allows optimization based on which data dimensions will be more restricted than others.
Add command line parameters to bench.py to allow specifying which FFT dimensions should be performed before, and after, the distdl repartition.

For example, consider processing all of the spatial dimensions before the repartition, and processing the time dimension after.
To do this, pass one of the following to bench.py:
- --fft-order-before 2 3 4
- --fft-order-after 5
- --fft-order-before 2 3 4 --fft-order-after 5

For 4-dimensional data, these all mean the same thing. (If one side is omitted, it will be calculated as the complement of the other.)
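The complement rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in bench.py: the helper names `resolve_fft_order` and `build_parser` are hypothetical, the set of FFT dimensions (2 through 5) is assumed from the examples in this description, and the ordering of a computed complement (ascending) is also an assumption.

```python
import argparse

# Assumed: for 4-dimensional data the FFT dims are 2..5 (dims 0 and 1 are
# taken to be batch and channel). This is inferred from the examples above.
ALL_FFT_DIMS = (2, 3, 4, 5)


def resolve_fft_order(before=None, after=None, all_dims=ALL_FFT_DIMS):
    """Return (before, after) lists of FFT dims around the repartition.

    Hypothetical helper: if only one side is given, the other side is the
    complement of it; if neither is given, fall back to the old default.
    """
    if before is None and after is None:
        # Default described in the PR: dims 4 and 5, repartition, then 2 and 3.
        return [4, 5], [2, 3]
    if before is None:
        after = list(after)
        before = [d for d in all_dims if d not in after]
    elif after is None:
        before = list(before)
        after = [d for d in all_dims if d not in before]
    else:
        before, after = list(before), list(after)
    # Every FFT dim must appear exactly once across the two groups.
    if sorted(before + after) != sorted(all_dims):
        raise ValueError(
            f"before={before} and after={after} must cover {list(all_dims)} exactly once"
        )
    return before, after


def build_parser():
    """Hypothetical parser exposing only the two ordering flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--fft-order-before", type=int, nargs="+", default=None)
    parser.add_argument("--fft-order-after", type=int, nargs="+", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_fft_order(args.fft_order_before, args.fft_order_after))
```

Under these assumptions, passing only `--fft-order-before 2 3 4`, only `--fft-order-after 5`, or both resolves to the same pair of groups, and passing neither yields the previous default.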
If you don't specify either of these parameters, you get the same default as before: do 4 and 5, then repartition, then do 2 and 3. Since the actual FFTs count downward, this default causes the time dimension to be processed first. If the time dimension has more modes per data element than the other dimensions, then this will be inefficient.
The fft dimension settings are added to the benchmark output filenames, so multiple experiments can be run and the results will be kept separate. (These filenames are getting pretty long, though!)
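As an illustration of keeping results from different orderings separate, a filename fragment could be derived from the two dimension lists. The naming scheme actually used by bench.py is not shown in this PR description, so the function name `fft_order_tag` and the format below are purely hypothetical.

```python
def fft_order_tag(before, after):
    """Build a filename fragment encoding the FFT ordering.

    Hypothetical format: the real benchmark filenames may differ.
    Dims are joined with '-' so multi-digit dims stay unambiguous.
    """
    before_part = "-".join(str(d) for d in before)
    after_part = "-".join(str(d) for d in after)
    return f"fft-before-{before_part}_fft-after-{after_part}"


if __name__ == "__main__":
    # Spatial dims before the repartition, time dim after.
    print(fft_order_tag([2, 3, 4], [5]))
```

Encoding both groups, rather than only the one passed on the command line, means that equivalent invocations map to the same output file.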
TODO:
- gen_scripts.py job sizes as needed
- xyz processing but some ranks are unneeded for the time/weight part