H2yaml hip #487

Merged
TApplencourt merged 5 commits into devel from h2yaml_hip on Apr 6, 2026
@TApplencourt
Collaborator

No description provided.

@TApplencourt
Collaborator Author

Checked with collen and hiplz

THAPI: Trace location: /home/applenco/thapi-traces/thapi_aggreg--2026-04-02T20:04:20+00:00
BACKEND_HIP,BACKEND_ZE | 1 Hostnames | 1 Processes | 1 Threads |

                           Name |     Time | Time(%) | Calls |  Average |      Min |      Max |
       __hipUnregisterFatBinary | 100.53ms |  48.46% |     1 | 100.53ms | 100.53ms | 100.53ms |
         __hipRegisterFatBinary |  99.44ms |  47.94% |     1 |  99.44ms |  99.44ms |  99.44ms |
                hipLaunchKernel |   2.80ms |   1.35% |     1 |   2.80ms |   2.80ms |   2.80ms |
                      hipMemcpy |   1.64ms |   0.79% |     3 | 547.44us | 254.69us |   1.05ms |
  zeCommandListAppendMemoryCopy |   1.08ms |   0.52% |     4 | 269.04us |  21.76us | 734.88us |
         zeEventHostSynchronize | 428.63us |   0.21% |     7 |  61.23us |    276ns | 150.52us |
           zeCommandQueueCreate | 251.22us |   0.12% |     1 | 251.22us | 251.22us | 251.22us |
                 zeModuleCreate | 241.10us |   0.12% |     1 | 241.10us | 241.10us | 241.10us |
                      zeMemFree | 155.59us |   0.08% |     5 |  31.12us |  19.55us |  44.30us |
zeCommandListAppendLaunchKernel | 115.94us |   0.06% |     6 |  19.32us |   7.38us |  75.47us |
                      hipMalloc | 100.29us |   0.05% |     2 |  50.15us |  27.42us |  72.88us |
    zeContextMakeMemoryResident |  89.52us |   0.04% |     5 |  17.90us |   9.63us |  36.66us |
               zeMemAllocShared |  85.85us |   0.04% |     1 |  85.85us |  85.85us |  85.85us |
              zeEventPoolCreate |  83.61us |   0.04% |     1 |  83.61us |  83.61us |  83.61us |
             zeEventPoolDestroy |  80.18us |   0.04% |     1 |  80.18us |  80.18us |  80.18us |
                        hipFree |  69.24us |   0.03% |     2 |  34.62us |  23.14us |  46.10us |
   zeCommandListCreateImmediate |  52.84us |   0.03% |     1 |  52.84us |  52.84us |  52.84us |
               zeMemAllocDevice |  47.52us |   0.02% |     5 |   9.50us |   4.36us |  16.54us |
                zeModuleDestroy |  40.27us |   0.02% |     1 |  40.27us |  40.27us |  40.27us |
                 zeEventDestroy |  18.93us |   0.01% |     6 |   3.15us |    686ns |  14.60us |
               zeEventHostReset |  13.52us |   0.01% |    10 |   1.35us |    224ns |   3.64us |
                 zeKernelCreate |  11.38us |   0.01% |     5 |   2.28us |   1.09us |   5.99us |
                  zeEventCreate |  11.23us |   0.01% |     6 |   1.87us |    372ns |   6.55us |
           zeCommandListDestroy |   9.33us |   0.00% |     1 |   9.33us |   9.33us |   9.33us |
                    zeDeviceGet |   9.21us |   0.00% |     2 |   4.61us |   1.86us |   7.35us |
     __hipPushCallConfiguration |   6.82us |   0.00% |     1 |   6.82us |   6.82us |   6.82us |
      zeCommandQueueSynchronize |   5.68us |   0.00% |    10 | 568.10ns |    211ns |   1.96us |
          __hipRegisterFunction |   4.42us |   0.00% |     1 |   4.42us |   4.42us |   4.42us |
                zeKernelDestroy |   3.75us |   0.00% |     5 | 750.60ns |    300ns |   2.09us |
              zeContextCreateEx |   3.17us |   0.00% |     1 |   3.17us |   3.17us |   3.17us |
               __hipRegisterVar |   3.17us |   0.00% |     2 |   1.58us |    601ns |   2.56us |
      zeKernelSetIndirectAccess |   2.87us |   0.00% |     6 | 477.50ns |    135ns |   1.58us |
           zeKernelSetGroupSize |   1.94us |   0.00% |     6 | 323.17ns |    159ns |    710ns |
               zeContextDestroy |   1.41us |   0.00% |     1 |   1.41us |   1.41us |   1.41us |
         zeModuleGetKernelNames |   1.31us |   0.00% |     2 | 653.00ns |    260ns |   1.05us |
      zeModuleBuildLogGetString |    965ns |   0.00% |     2 | 482.50ns |    317ns |    648ns |
          zeCommandQueueDestroy |    919ns |   0.00% |     1 | 919.00ns |    919ns |    919ns |
                    zeDriverGet |    855ns |   0.00% |     2 | 427.50ns |    261ns |    594ns |
                         zeInit |    802ns |   0.00% |     1 | 802.00ns |    802ns |    802ns |
      __hipPopCallConfiguration |    514ns |   0.00% |     1 | 514.00ns |    514ns |    514ns |
                          Total | 207.44ms | 100.00% |   121 |

Device profiling | 1 Hostnames | 1 Processes | 1 Threads | 1 Devices | 1 Subdevices |

                                 Name |     Time | Time(%) | Calls |  Average |      Min |      Max |
   zeCommandListAppendMemoryCopy(M2D) | 256.48us |  66.89% |     2 | 128.24us | 111.20us | 145.28us |
   zeCommandListAppendMemoryCopy(D2M) |  92.56us |  24.14% |     2 |  46.28us |   5.68us |  86.88us |
__chip_var_bind___chipspv_device_heap |   9.12us |   2.38% |     2 |   4.56us |   4.32us |   4.80us |
                                saxpy |   8.48us |   2.21% |     1 |   8.48us |   8.48us |   8.48us |
__chip_var_init___chipspv_device_heap |   6.72us |   1.75% |     1 |   6.72us |   6.72us |   6.72us |
             __chip_reset_non_symbols |   5.44us |   1.42% |     1 |   5.44us |   5.44us |   5.44us |
__chip_var_info___chipspv_device_heap |   4.64us |   1.21% |     1 |   4.64us |   4.64us |   4.64us |
                                Total | 383.44us | 100.00% |    10 |

Explicit memory traffic (BACKEND_ZE) | 1 Hostnames | 1 Processes | 1 Threads |

                              Name |    Byte | Byte(%) | Calls | Average |    Min |    Max |
       zeContextMakeMemoryResident |  8.39MB |  40.00% |     5 |  1.68MB |     8B | 4.19MB |
zeCommandListAppendMemoryCopy(M2D) |  8.39MB |  40.00% |     2 |  4.19MB | 4.19MB | 4.19MB |
zeCommandListAppendMemoryCopy(D2M) |  4.19MB |  20.00% |     2 |  2.10MB |    24B | 4.19MB |
                             Total | 20.97MB | 100.00% |     9 |
Looks good.
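
The summary columns above can be cross-checked by hand: Average should equal Time divided by Calls, and Time(%) should equal Time divided by the Total row, both up to the rounding used in the printout. A minimal Python sketch of that check follows. The helper names `to_ns`, `average_matches`, and `share_percent` are hypothetical and not part of THAPI, and the 1% tolerance is an assumption chosen to absorb the three-significant-digit rounding of the printed values.

```python
import re

# Multipliers to nanoseconds for the unit suffixes seen in the summary table.
_UNITS_NS = {"ns": 1.0, "us": 1e3, "ms": 1e6, "s": 1e9}


def to_ns(text):
    """Convert a duration string such as '547.44us' to nanoseconds."""
    m = re.fullmatch(r"\s*([0-9.]+)\s*(ns|us|ms|s)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(m.group(1)) * _UNITS_NS[m.group(2)]


def average_matches(total, calls, average, rel_tol=0.01):
    """Return True if total / calls agrees with the printed average.

    rel_tol is an assumed tolerance covering the rounding of printed values.
    """
    expected = to_ns(total) / calls
    printed = to_ns(average)
    return abs(expected - printed) <= rel_tol * printed


def share_percent(time, total):
    """Return time as a percentage of total, rounded to two decimals."""
    return round(100.0 * to_ns(time) / to_ns(total), 2)


# Values copied from the hipMemcpy and __hipUnregisterFatBinary rows above.
print(average_matches("1.64ms", 3, "547.44us"))
print(share_percent("100.53ms", "207.44ms"))
```

The same helpers apply to any other row by passing that row's Time, Calls, and Average strings.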

@TApplencourt merged commit fcf39b8 into devel on Apr 6, 2026
26 of 27 checks passed
@TApplencourt deleted the h2yaml_hip branch on April 6, 2026 20:23