fix(llama): respect NODE_LLAMA_CPP_GPU override by Kaylebor · Pull Request #250 · tobi/qmd

Kaylebor · 2026-02-23T15:03:55Z

Problem

On some systems with AMD GPUs (and potentially others), getLlamaGpuTypes() returns an empty array even when Vulkan is available. This causes QMD to fall back to CPU-only mode, which is significantly slower for embedding operations.

Root Cause

While node-llama-cpp's detection logic looks correct (see src/bindings/utils/getLlamaGpuTypes.ts), it appears to fail on some Linux/Vulkan configurations (tested on AMD 7900 XTX). The exact root cause is unclear; it could be a path resolution issue or an async handling problem.

Solution

Check for the NODE_LLAMA_CPP_GPU environment variable before calling the potentially broken auto-detection. This lets users explicitly specify their preferred GPU backend (cuda, metal, or vulkan).

Why this variable? Because it is what node-llama-cpp already uses: src/config.ts#L56
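
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: parseGpuOverride and resolveGpuBackend are hypothetical names, the detection step is passed in as a function so the snippet runs without node-llama-cpp installed, and treating an unrecognized value as "not set" is an assumption.

```typescript
// Hypothetical helpers for illustration; names are not taken from the QMD source.
type GpuBackend = "cuda" | "metal" | "vulkan";

const KNOWN_BACKENDS: readonly GpuBackend[] = ["cuda", "metal", "vulkan"];

// Normalize the raw env value and accept it only if it names a known backend.
function parseGpuOverride(raw: string | undefined): GpuBackend | undefined {
  const value = raw?.trim().toLowerCase();
  return KNOWN_BACKENDS.find((backend) => backend === value);
}

// Prefer the explicit override; run auto-detection only when no valid override exists.
// `detect` stands in for the library's GPU detection call (assumed to return a list).
async function resolveGpuBackend(
  env: Record<string, string | undefined>,
  detect: () => Promise<GpuBackend[]>,
): Promise<GpuBackend | false> {
  const override = parseGpuOverride(env.NODE_LLAMA_CPP_GPU);
  if (override !== undefined) return override;

  const detected = await detect();
  // An empty detection result means CPU-only, represented here as `false`.
  return detected[0] ?? false;
}
```

In a real integration, `env` would be `process.env` and `detect` would wrap the library's detection function. With the variable unset, the result depends only on detection, which matches the non-breaking behavior claimed below.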

Why this approach:

Non-breaking change: If the env var is not set, behavior remains identical to before
Follows conventions: Uses the same naming pattern as other NODE_LLAMA_CPP_* variables used by the library
User control: Allows overriding broken auto-detection without waiting for upstream fixes
Clean workaround: Doesn't require modifying node-llama-cpp internals

Usage:

Force Vulkan on AMD GPUs

export NODE_LLAMA_CPP_GPU=vulkan
qmd status  # Now shows GPU: vulkan with offloading

Testing:

Tested on AMD Radeon RX 7900 XTX (24GB VRAM) with CachyOS (Arch-based)
Before: GPU: none (running on CPU)
After: GPU: vulkan (offloading: yes) with 37GB VRAM detected

References:

node-llama-cpp/src/config.ts:56 - env var definition
node-llama-cpp/src/bindings/utils/getLlamaGpuTypes.ts - detection issue

Kaylebor · 2026-02-25T14:54:28Z

Maybe non-critical, but on my particular setup qmd completely fails to detect my fully functional AMD GPU (yes, I do have Vulkan installed properly; in fact, with the fix above it works well).

fix(llama): respect NODE_LLAMA_CPP_GPU override

dd2371b

giladgd mentioned this pull request Mar 7, 2026

feat: use build: "autoAttempt" on getLlama #310

Merged

tobi closed this Mar 7, 2026

Kaylebor wants to merge 1 commit into tobi:main from Kaylebor:main
