Build error ROCm GPU #276

# Prerequisites

Before submitting your issue, please ensure the following:

- [x] I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versions.
- [x] I have carefully read and followed the instructions in the [README.md](https://github.com/SJTU-IPADS/PowerInfer/blob/main/README.md).
- [x] I [searched using keywords relevant to my issue](https://docs.github.com/en/issues/tracking-your-work-with-issues/filtering-and-searching-issues-and-pull-requests) to make sure that I am creating a new issue that is not already open (or closed).

# Expected Behavior
I expect the project to build without type errors.

# Current Behavior
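The build fails.

Please provide a detailed written description of what PowerInfer did, instead.

Compiling `ggml-cuda.cu` for gfx1151 stops with three "no matching function" errors, one each for `hipblasGemmEx`, `hipblasGemmBatchedEx`, and `hipblasGemmStridedBatchedEx`. In every case clang reports no known conversion from `hipDataType` to `hipblasComputeType_t` for one argument of the call.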

# Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

* Physical (or virtual) hardware you are using, e.g. for Linux:

`$ lscpu`
Architecture:                x86_64
  CPU op-mode(s):            32-bit, 64-bit
  Address sizes:             48 bits physical, 48 bits virtual
  Byte Order:                Little Endian
CPU(s):                      32
  On-line CPU(s) list:       0-31
Vendor ID:                   AuthenticAMD
  BIOS Vendor ID:            Advanced Micro Devices, Inc.
  Model name:                AMD RYZEN AI MAX+ 395 w/ Radeon 8060S
    BIOS Model name:         AMD RYZEN AI MAX+ 395 w/ Radeon 8060S           Unknown CPU @ 3.0GHz
    BIOS CPU family:         107
    CPU family:              26
    Model:                   112
    Thread(s) per core:      2
    Core(s) per socket:      16
    Socket(s):               1
    Stepping:                0
    Frequency boost:         enabled
    CPU(s) scaling MHz:      72%
    CPU max MHz:             5187.5000
    CPU min MHz:             625.0000


* Operating System, e.g. for Linux:

`$ uname -a`
Linux gtkAMDStrix 6.17.0-19-generic #19~24.04.2-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar  6 23:08:46 UTC 2 x86_64 x86_64 x86_64 GNU/Linux


* SDK version, e.g. for Linux:

```
$ python3 --version
Python 3.12.3

$ make --version
GNU Make 4.3
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ g++ --version
g++ (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ldd --version
ldd (Ubuntu GLIBC 2.39-0ubuntu8.7) 2.39
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
```

# Failure Information (for bugs)

Please help provide information about the failure / bug.

# Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

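1. Create and activate a virtual environment: `python3 -m venv powerinfer && source powerinfer/bin/activate`
2. Update the checkout: `git pull`
3. Put the configure and build commands in a `make.sh` script. It follows the README commands, with the `1100` placeholder replaced by `gfx1151`, the architecture name reported by `rocminfo`. The full script is shown by `cat make.sh` in the failure logs below:
   - `CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -S . -B build -DLLAMA_HIPBLAS=on -DAMDGPU_TARGETS=gfx1151`
   - `cmake --build build --config Release`
4. Run the script: `source make.sh`
5. The build fails with type errors in `ggml-cuda.cu`.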

# Failure Logs

~/PowerInfer# git status
On branch main
Your branch is up to date with 'origin/main'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        make.sh

nothing added to commit but untracked files present (use "git add" to track)
(powerinfer) root@gtkAMDStrix:~/PowerInfer# cat make.sh
# Replace '1100' to your card architecture name, you can get it by rocminfo
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake -S . -B build -DLLAMA_HIPBLAS=on -DAMDGPU_TARGETS=gfx1151
cmake --build build --config Release
(powerinfer) root@gtkAMDStrix:~/PowerInfer# rocminfo | grep gfx
  Name:                    gfx1151
      Name:                    amdgcn-amd-amdhsa--gfx1151
      Name:                    amdgcn-amd-amdhsa--gfx11-generic
(powerinfer) root@gtkAMDStrix:~/PowerInfer# source make.sh
-- The C compiler identification is Clang 20.0.0
-- The CXX compiler identification is Clang 20.0.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Warning (dev) at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:70 (message):
  AMDGPU_TARGETS is deprecated.  Please use GPU_TARGETS instead.
Call Stack (most recent call first):
  /opt/rocm/lib/cmake/hip/hip-config.cmake:138 (include)
  CMakeLists.txt:361 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
-- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
-- HIP and hipBLAS found
AMD LLD 20.0.0 (/longer_pathname_so_that_rpms_can_support_packaging_the_debug_info_for_all_os_profiles/src/llvm-project/llvm 27682a16360e33e37c4f3cc6adf9a620733f8fe1) (compatible with GNU linkers)
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (0.8s)
-- Generating done (0.1s)
-- Build files have been written to: /root/PowerInfer/build
[  1%] Building CXX object CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
/root/PowerInfer/ggml-cuda.cu:7718:13: error: no matching function for call to 'hipblasGemmEx'
 7718 |             cublasGemmEx(g_cublas_handles[id], CUBLAS_OP_T, CUBLAS_OP_N,
      |             ^~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:31:22: note: expanded from macro 'cublasGemmEx'
   31 | #define cublasGemmEx hipblasGemmEx
      |                      ^~~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:222:32: note: expanded from macro 'CUBLAS_CHECK'
  222 |         cublasStatus_t err_ = (err);                                                    \
      |                                ^~~
/opt/rocm/include/hipblas/hipblas.h:23710:32: note: candidate function not viable: no known conversion from 'hipDataType' to 'hipblasComputeType_t' for 18th argument
 23710 | HIPBLAS_EXPORT hipblasStatus_t hipblasGemmEx(hipblasHandle_t      handle,
       |                                ^
/root/PowerInfer/ggml-cuda.cu:8655:9: error: no matching function for call to 'hipblasGemmStridedBatchedEx'
 8655 |         cublasGemmStridedBatchedEx(g_cublas_handles[id], CUBLAS_OP_T, CUBLAS_OP_N,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:33:36: note: expanded from macro 'cublasGemmStridedBatchedEx'
   33 | #define cublasGemmStridedBatchedEx hipblasGemmStridedBatchedEx
      |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:222:32: note: expanded from macro 'CUBLAS_CHECK'
  222 |         cublasStatus_t err_ = (err);                                                    \
      |                                ^~~
/opt/rocm/include/hipblas/hipblas.h:24072:32: note: candidate function not viable: no known conversion from 'hipDataType' to 'hipblasComputeType_t' for 22nd argument
 24072 | HIPBLAS_EXPORT hipblasStatus_t hipblasGemmStridedBatchedEx(hipblasHandle_t      handle,
       |                                ^
/root/PowerInfer/ggml-cuda.cu:8689:9: error: no matching function for call to 'hipblasGemmBatchedEx'
 8689 |         cublasGemmBatchedEx(g_cublas_handles[id], CUBLAS_OP_T, CUBLAS_OP_N,
      |         ^~~~~~~~~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:32:29: note: expanded from macro 'cublasGemmBatchedEx'
   32 | #define cublasGemmBatchedEx hipblasGemmBatchedEx
      |                             ^~~~~~~~~~~~~~~~~~~~
/root/PowerInfer/ggml-cuda.cu:222:32: note: expanded from macro 'CUBLAS_CHECK'
  222 |         cublasStatus_t err_ = (err);                                                    \
      |                                ^~~
/opt/rocm/include/hipblas/hipblas.h:23881:32: note: candidate function not viable: no known conversion from 'hipDataType' to 'hipblasComputeType_t' for 19th argument
 23881 | HIPBLAS_EXPORT hipblasStatus_t hipblasGemmBatchedEx(hipblasHandle_t      handle,
       |                                ^
3 errors generated when compiling for gfx1151.
gmake[2]: *** [CMakeFiles/ggml-rocm.dir/build.make:76: CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:626: CMakeFiles/ggml-rocm.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
(powerinfer) root@gtkAMDStrix:~/PowerInfer# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.4 LTS"
(powerinfer) root@gtkAMDStrix:~/PowerInfer# neofetch
            .-/+oossssoo+/-.               root@gtkAMDStrix
        `:+ssssssssssssssssss+:`           ----------------
      -+ssssssssssssssssssyyssss+-         OS: Ubuntu 24.04.4 LTS x86_64
    .ossssssssssssssssssdMMMNysssso.       Host: NucBox_EVO-X2 Version 1.0
   /ssssssssssshdmmNNmmyNMMMMhssssss/      Kernel: 6.17.0-19-generic
  +ssssssssshmydMMMMMMMNddddyssssssss+     Uptime: 4 days, 23 hours, 34 mins
 /sssssssshNMMMyhhyyyyhmNMMMNhssssssss/    Packages: 2236 (dpkg), 12 (snap)
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   Shell: bash 5.2.21
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   Theme: Adwaita [GTK3]
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   Icons: Adwaita [GTK3]
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   CPU: AMD RYZEN AI MAX+ 395 w/ Radeon 8060S (32) @ 5.187GHz
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   GPU: AMD ATI c5:00.0 Device 1586
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   Memory: 26598MiB / 127435MiB
 /sssssssshNMMMyhhyyyyhdNMMMNhssssssss/
  +sssssssssdmydMMMMMMMMddddyssssssss+
   /ssssssssssshdmNNNNmyNMMMMhssssss/
    .ossssssssssssssssssdMMMNysssso.
      -+sssssssssssssssssyyyssss+-
        `:+ssssssssssssssssss+:`
            .-/+oossssoo+/-.

(powerinfer) root@gtkAMDStrix:~/PowerInfer# /opt/rocm/llvm/bin/clang++ --version
AMD clang version 20.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-7.1.1 25444 27682a16360e33e37c4f3cc6adf9a620733f8fe1)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-7.1.1/lib/llvm/bin
Configuration file: /opt/rocm-7.1.1/lib/llvm/bin/clang++.cfg
(powerinfer) root@gtkAMDStrix:~/PowerInfer# /opt/rocm/llvm/bin/clang --version
AMD clang version 20.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-7.1.1 25444 27682a16360e33e37c4f3cc6adf9a620733f8fe1)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-7.1.1/lib/llvm/bin
Configuration file: /opt/rocm-7.1.1/lib/llvm/bin/clang.cfg
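
The three errors above share one shape: the call site passes a `hipDataType` value where the installed `hipblas.h` declares a `hipblasComputeType_t` parameter, and C++ does not implicitly convert between two distinct enum types. The sketch below reproduces that shape with stand-in types only, so it compiles without ROCm. Everything prefixed `Fake`, `FAKE_`, `fake_`, or `SHIM_` is hypothetical and is not part of hipBLAS or PowerInfer. It also shows one possible direction for a fix, a version-guarded macro mapping. Whether `ggml-cuda.cu` actually routes its compute-type constants through such macros, and which real enumerators (presumably the `HIPBLAS_COMPUTE_*` family) would be the right targets, are assumptions that have not been checked against this ROCm 7.1.1 install.

```cpp
// Stand-in types only: they mimic the shape of the API described in the
// compiler notes above. They are NOT the real hipBLAS declarations.
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for hipDataType and hipblasComputeType_t.
enum FakeDataType { FAKE_R_16F, FAKE_R_32F };
enum FakeComputeType { FAKE_COMPUTE_16F, FAKE_COMPUTE_32F };

// Hypothetical switch: 1 models a library whose GEMM takes a dedicated
// compute-type enum, 0 models an older one that takes a data-type enum.
#ifndef FAKE_BLAS_HAS_COMPUTE_TYPE
#define FAKE_BLAS_HAS_COMPUTE_TYPE 1
#endif

#if FAKE_BLAS_HAS_COMPUTE_TYPE
using fake_compute_arg_t = FakeComputeType;
#define SHIM_COMPUTE_16F FAKE_COMPUTE_16F
#define SHIM_COMPUTE_32F FAKE_COMPUTE_32F
#else
using fake_compute_arg_t = FakeDataType;
#define SHIM_COMPUTE_16F FAKE_R_16F
#define SHIM_COMPUTE_32F FAKE_R_32F
#endif

// Stand-in for a GEMM entry point; the last parameter plays the role of the
// argument that clang rejects in the log. Returns 0 to model "success".
inline int fake_gemm_ex(FakeDataType a_type, FakeDataType b_type,
                        FakeDataType c_type, fake_compute_arg_t compute_type) {
    (void)a_type; (void)b_type; (void)c_type; (void)compute_type;
    return 0;
}

// Two distinct enum types never convert implicitly in C++. This is the same
// class of mismatch as "no known conversion from 'hipDataType' to
// 'hipblasComputeType_t'".
static_assert(!std::is_convertible<FakeDataType, FakeComputeType>::value,
              "a data-type enum must not convert to a compute-type enum");

// The call site names the shim macro rather than a concrete enumerator, so
// the same source line compiles against either flavour of the library.
inline int run_fp16_gemm_sketch() {
    // With FAKE_BLAS_HAS_COMPUTE_TYPE == 1, passing FAKE_R_16F as the last
    // argument here would fail to compile, mirroring the errors above.
    return fake_gemm_ex(FAKE_R_16F, FAKE_R_16F, FAKE_R_16F, SHIM_COMPUTE_16F);
}
```

Building the sketch with `-DFAKE_BLAS_HAS_COMPUTE_TYPE=0` selects the older-style mapping, and the same call site still compiles. A real fix would need the equivalent guard keyed on whatever version macro the hipBLAS headers actually expose.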


