
feat: use build: "autoAttempt" on getLlama#310

Merged
tobi merged 2 commits into tobi:main from giladgd:nodeLlamaCppUseBuildAutoAttempt
Mar 7, 2026

Conversation

@giladgd (Contributor) commented Mar 6, 2026

build: "autoAttempt" works similarly to qmd's current behavior but is more efficient and faster, and it allows using the NODE_LLAMA_CPP_GPU env var to customize the default GPU that getLlama uses when gpu is unset.

It should fix most of the issues users experience with CUDA not loading, since node-llama-cpp's internal backend resolution flow is more thorough and handles many edge cases.
It should also fix issues where CUDA is used even though the system doesn't support it (the current code doesn't pass the required parameter to getLlamaGpuTypes, which is undefined behavior).
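A minimal sketch of what this change amounts to on the qmd side, assuming node-llama-cpp's getLlama API as described in this PR (the exact call site in qmd may differ):

```typescript
// Sketch: let node-llama-cpp resolve the backend itself instead of
// probing GPU types manually in qmd.
import {getLlama} from "node-llama-cpp";

const llama = await getLlama({
    // Try each supported backend (e.g. CUDA, Vulkan, Metal, CPU) using
    // prebuilt binaries first, and only attempt a local build as a fallback.
    build: "autoAttempt"
});

// Running with NODE_LLAMA_CPP_GPU=vulkan (for example) would override the
// default backend selection without any code changes.
console.log(llama.gpu); // the backend that was actually loaded
```

This keeps backend selection inside node-llama-cpp, where the edge cases (unsupported CUDA, glibc detection, etc.) are already handled.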

Fixes #307
Fixes #287
Fixes #268
Resolves #293
Resolves #266
Resolves #250
Resolves #246
Resolves #243
Resolves #240
Resolves #237
Partially remedies #222
Fixes #213
Fixes #194
Fixes #185
Fixes #87 (the new node-llama-cpp version properly detects glibc on nixos now)
Fixes #59 (the new node-llama-cpp version contains a workaround for this bug in Bun)

@tobi tobi merged commit f75c668 into tobi:main Mar 7, 2026
@tobi (Owner) commented Mar 7, 2026

Thank you so much! Really appreciate it, @giladgd
