fix(llama): respect NODE_LLAMA_CPP_GPU override#250

Closed
Kaylebor wants to merge 1 commit into tobi:main from Kaylebor:main

@Kaylebor

Problem

On some systems with AMD GPUs (and potentially others), getLlamaGpuTypes() returns an empty array even when Vulkan is available. This causes QMD to fall back to CPU-only mode, which is significantly slower for embedding operations.

Root Cause

The detection logic itself looks correct (see src/bindings/utils/getLlamaGpuTypes.ts), but it appears to fail on some Linux/Vulkan configurations (observed on an AMD Radeon RX 7900 XTX). The exact root cause is unclear; it could be a path resolution issue or a problem with async handling.

Solution

Check for NODE_LLAMA_CPP_GPU environment variable before calling the potentially broken auto-detection. This allows users to explicitly specify their preferred GPU backend (cuda, metal, or vulkan).

Why this variable? Because it is what node-llama-cpp already uses: src/config.ts#L56

Why this approach:

  • Non-breaking change: If the env var is not set, behavior remains identical to before
  • Follows conventions: Uses the same naming pattern as other NODE_LLAMA_CPP_* variables used by the library
  • User control: Allows overriding broken auto-detection without waiting for upstream fixes
  • Clean workaround: Doesn't require modifying node-llama-cpp internals
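The override check described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff from this PR: the helper names `parseGpuOverride` and `resolveGpuBackend` are hypothetical, the `detect` callback stands in for the library's auto-detection (for example a wrapper around `getLlamaGpuTypes()`), and picking the first detected backend is an assumption about the fallback behavior.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR diff.
type GpuBackend = "cuda" | "metal" | "vulkan";

const KNOWN_BACKENDS: readonly GpuBackend[] = ["cuda", "metal", "vulkan"];

// Returns the backend named by the override value, or undefined when the
// value is unset or is not one of the known backends.
function parseGpuOverride(value: string | undefined): GpuBackend | undefined {
    const normalized = value?.trim().toLowerCase();
    return KNOWN_BACKENDS.find((backend) => backend === normalized);
}

// Prefers an explicit NODE_LLAMA_CPP_GPU override; only when none is given
// does it call the (possibly unreliable) auto-detection callback.
async function resolveGpuBackend(
    env: Record<string, string | undefined>,
    detect: () => Promise<readonly GpuBackend[]>,
): Promise<GpuBackend | false> {
    const override = parseGpuOverride(env["NODE_LLAMA_CPP_GPU"]);
    if (override !== undefined) {
        return override;
    }

    // Assumption: take the first detected backend, or fall back to CPU (false).
    const detected = await detect();
    return detected[0] ?? false;
}
```

In this sketch an unrecognized override value is ignored and detection runs as usual, which keeps the no-env-var behavior unchanged. The result would presumably be passed on as the `gpu` option when creating the llama instance, with `process.env` supplied as `env`.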

Usage:

# Force Vulkan on AMD GPUs
export NODE_LLAMA_CPP_GPU=vulkan
qmd status  # Now shows GPU: vulkan with offloading

Testing:

  • Tested on AMD Radeon RX 7900 XTX (24GB VRAM) with CachyOS (Arch-based)
  • Before: GPU: none (running on CPU)
  • After: GPU: vulkan (offloading: yes) with 37GB VRAM detected

References:

  • node-llama-cpp/src/config.ts:56 - env var definition
  • node-llama-cpp/src/bindings/utils/getLlamaGpuTypes.ts - detection issue

@Kaylebor
Author

Maybe non-critical, but on my particular setup qmd completely fails to detect my fully functional AMD GPU (yes, I do have Vulkan installed properly; in fact, with the fix above it works well).

2 participants