Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
I'd love to have a tool to list and delete cached models (those fetched automatically when using the `-hf` option). This would be akin to the `ls` and `rm` commands in Ollama.
Motivation
A lot of people (myself included) use the `-hf` option to automatically fetch models from Hugging Face. This places models in a model cache directory, which can grow rather large over time. Each model typically has at least three associated files (manifest, GGUF, and etag), and sometimes five (adding an mmproj GGUF and its associated etag). Manually managing files in the cache is cumbersome. It would be nice to have an elegant way to see which models are cached and how much space they take, plus a single command to delete all cached files associated with a model.
Possible Implementation
I built a Python script to do this for my own convenience: https://gist.github.com/sultanqasim/5b6d9654236e18dea4896d3c9ce2dc1b
The output of the script looks like this:
```
$ ./llama-cache ls
Name                                                     Size (GB)  Modified
--------------------------------------------------------------------------------
ibm-granite/granite-4.0-h-small-GGUF:Q4_K_M                   18.1  2025-10-02 13:34
unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:Q4_K_XL              16.5  2025-08-07 00:32
unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF:IQ4_XS       12.7  2025-07-24 12:10
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL             16.5  2025-10-01 17:19
unsloth/Qwen3-14B-GGUF:Q4_K_XL                                 8.5  2025-08-01 23:03

$ ./llama-cache rm unsloth/Qwen3-14B-GGUF:Q4_K_XL
Deleted: /Users/sultan/Library/Caches/llama.cpp/unsloth_Qwen3-14B-GGUF_Qwen3-14B-UD-Q4_K_XL.gguf
Deleted: /Users/sultan/Library/Caches/llama.cpp/unsloth_Qwen3-14B-GGUF_Qwen3-14B-UD-Q4_K_XL.gguf.json
Deleted: /Users/sultan/Library/Caches/llama.cpp/manifest=unsloth_Qwen3-14B-GGUF=Q4_K_XL.json
```
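The `rm` side can be sketched along the same lines. The filename mangling below (a `/` in the repo name becoming `_` on disk, and manifests named `manifest=<repo>=<quant>.json`) is inferred from the sample output above and should be treated as an assumption, not a documented contract:

```python
from pathlib import Path

def remove_cached_model(cache_dir: Path, repo: str, quant: str) -> list[Path]:
    """Delete all cache files belonging to repo:quant.

    Assumed (not guaranteed) layout, inferred from observed filenames:
      <repo_key>_*<quant>*.gguf        - model file
      <repo_key>_*<quant>*.gguf.json   - etag sidecar
      manifest=<repo_key>=<quant>.json - manifest
    where repo_key is the repo name with '/' replaced by '_'.
    """
    repo_key = repo.replace("/", "_")
    deleted = []
    # Snapshot the listing first, since we delete while iterating.
    for f in list(cache_dir.iterdir()):
        name = f.name
        if (name.startswith(repo_key) and quant in name) or \
           name == f"manifest={repo_key}={quant}.json":
            f.unlink()
            deleted.append(f)
    return deleted
```

For example, `remove_cached_model(cache_dir, "unsloth/Qwen3-14B-GGUF", "Q4_K_XL")` would remove the three files shown in the sample output while leaving other cached models untouched.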