feat(llm): add QMD_FORCE_CUDA env var to disable Vulkan offloading #338

Open
JasonOA888 wants to merge 1 commit into tobi:main from JasonOA888:feat/force-cuda-env

@JasonOA888

Problem

On Windows VMs with para-virtualized GPUs (e.g., ExHyperV RTX 4090), QMD may use Vulkan offloading instead of pure CUDA mode even when CUDA is working correctly:

$ qmd status
GPU: vulkan (offloading: yes)

Solution

Add QMD_FORCE_CUDA environment variable to force CUDA and disable Vulkan:

export QMD_FORCE_CUDA=1
qmd query "test"

This sets gpu: "cuda" in getLlama() options, bypassing the auto-detection that might choose Vulkan.
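
The mapping from the environment variable to the `gpu` option might look roughly like the sketch below. This is an illustration, not the code from this PR: the helper name `resolveGpuOption` is hypothetical, and treating `true` as equivalent to `1` is an assumption (the description only shows `QMD_FORCE_CUDA=1`).

```typescript
// Hypothetical helper, not taken from the PR diff.
// "auto" stands for leaving backend auto-detection enabled.
type GpuOption = "auto" | "cuda";

function resolveGpuOption(env: Record<string, string | undefined>): GpuOption {
  const raw = (env["QMD_FORCE_CUDA"] ?? "").trim().toLowerCase();
  // Assumption: "1" and "true" enable the override; the PR only documents "1".
  const forceCuda = raw === "1" || raw === "true";
  // When forced, request CUDA explicitly so auto-detection cannot pick Vulkan.
  return forceCuda ? "cuda" : "auto";
}

// The result would then be handed to the loader, along the lines of:
//   getLlama({ gpu: resolveGpuOption(process.env) })
```

Keeping the decision in a small pure function makes the override easy to test without a GPU present.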

Related

Fixes #278

On Windows VMs with para-virtualized GPUs, QMD may use Vulkan
offloading instead of pure CUDA mode even when CUDA is available.

This adds QMD_FORCE_CUDA env var to force CUDA and disable Vulkan:

  export QMD_FORCE_CUDA=1
  qmd query "test"

Fixes tobi#278
Development

Successfully merging this pull request may close these issues.

Feature Request: Add --force-cuda parameter to disable Vulkan offloading
