
feat: add QMD_GPU env var to override GPU backend detection #272

Open

huzaifahkhojani-cyber wants to merge 1 commit into tobi:main from huzaifahkhojani-cyber:fix/qmd-gpu-env-override

Conversation

@huzaifahkhojani-cyber

Problem

On AMD systems with ROCm installed, node-llama-cpp's getLlamaGpuTypes() reports CUDA as available (via ROCm's CUDA compatibility layer). QMD then attempts a CUDA build, which fails because no actual CUDA Toolkit is installed, and falls back to CPU mode — which triggers a Bun segfault when loading embedding/reranking models into RAM.

This affects AMD GPU systems (tested: Radeon 780M / gfx1103) that have ROCm + Vulkan but no NVIDIA hardware.

Fix

Add a QMD_GPU environment variable that overrides the auto-detection:

export QMD_GPU=vulkan  # Force Vulkan (works great on AMD)
export QMD_GPU=cuda    # Force CUDA
export QMD_GPU=false   # Disable GPU entirely

When unset, behaviour is unchanged (auto-detect: CUDA > Metal > Vulkan > CPU).

Changes

  • src/llm.ts: Read QMD_GPU env var before falling through to auto-detection
  • README.md: Document the new env var under Model Configuration
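The override logic in src/llm.ts can be sketched roughly as below. This is an illustrative sketch only — the function name `resolveGpuBackend`, the `GpuBackend` type, and the exact return values are hypothetical and may not match the actual code in the PR:

```typescript
// Hypothetical sketch of the QMD_GPU override described above.
// "auto" stands in for falling through to the existing auto-detection
// (CUDA > Metal > Vulkan > CPU); names here are illustrative.
type GpuBackend = "cuda" | "metal" | "vulkan" | false | "auto";

function resolveGpuBackend(): GpuBackend {
  const override = process.env.QMD_GPU?.toLowerCase();
  if (override === "false") {
    return false; // QMD_GPU=false: disable GPU entirely
  }
  if (override === "cuda" || override === "metal" || override === "vulkan") {
    return override; // explicit backend: skip auto-detection
  }
  return "auto"; // unset (or unrecognized): behaviour unchanged
}
```

The key design point is that the env var is consulted *before* calling into node-llama-cpp's detection, so a misreporting compatibility layer (like ROCm's CUDA shim) never gets a chance to steer the build.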

Testing

Tested on:

  • AMD Ryzen 7 255 + Radeon 780M (gfx1103)
  • ROCm 1.14 + Vulkan (mesa-vulkan-drivers)
  • Bun 1.3.10, Ubuntu 25.10
  • QMD_GPU=vulkan qmd query "test" → works, GPU-accelerated, no crash

On AMD ROCm systems, CUDA is reported as available by node-llama-cpp's
detection even when no CUDA Toolkit is installed. This causes a failed
build attempt followed by a CPU fallback, which segfaults in Bun when
loading large models.

This adds a QMD_GPU environment variable that allows users to force a
specific GPU backend (cuda, metal, vulkan) or disable GPU entirely
(false), bypassing auto-detection when it gets it wrong.

Example: QMD_GPU=vulkan qmd query 'my search'

Tested on AMD Ryzen 7 255 with Radeon 780M (gfx1103) + ROCm + Vulkan.
