Releases · CodeLinaro/llama.cpp

04 Dec 00:00

cc98896

b4255

vulkan: optimize and reenable split_k (#10637)

Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to fill the shaders.
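For readers unfamiliar with the term, the split_k idea can be illustrated on the CPU. The following is a minimal sketch, not the Vulkan shader code: the function name `matmul_split_k` and its parameters are hypothetical, and `splits` is assumed to be at least 1. The reduction dimension K is cut into slices, each slice produces a partial result (the work a separate workgroup would do), and a second pass sums the partials, which is the role a reduce step such as mul_mat_split_k_reduce plays.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU illustration of split-k matrix multiplication.
// Computes C (M x N) = A (M x K) * B (K x N), all row-major.
// Assumes splits >= 1.
std::vector<float> matmul_split_k(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t M, std::size_t N,
                                  std::size_t K, std::size_t splits) {
    // Stage 1: every slice of K writes its own partial M x N result.
    std::vector<float> partial(splits * M * N, 0.0f);
    const std::size_t chunk = (K + splits - 1) / splits; // ceil(K / splits)
    for (std::size_t s = 0; s < splits; ++s) {
        const std::size_t k0 = s * chunk;
        const std::size_t k1 = std::min(K, k0 + chunk);
        for (std::size_t i = 0; i < M; ++i) {
            for (std::size_t j = 0; j < N; ++j) {
                float acc = 0.0f;
                for (std::size_t k = k0; k < k1; ++k) {
                    acc += A[i * K + k] * B[k * N + j];
                }
                partial[(s * M + i) * N + j] = acc;
            }
        }
    }

    // Stage 2: the reduce pass sums the partial results element-wise.
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t s = 0; s < splits; ++s) {
        for (std::size_t idx = 0; idx < M * N; ++idx) {
            C[idx] += partial[s * M * N + idx];
        }
    }
    return C;
}
```

Splitting K only pays off when M x N alone yields too few independent work items to keep the device busy, since the reduce pass adds extra memory traffic.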

03 Dec 00:51

github-actions

b4242

642330a

llama : add enum for built-in chat templates (#10623)

* llama : add enum for supported chat templates

* use "built-in" instead of "supported"

* arg: print list of built-in templates

* fix test

* update server README

30 Nov 00:13

github-actions

b4226

7cc2d2c

ggml : move AMX to the CPU backend (#10570)

* ggml : move AMX to the CPU backend

---------

Co-authored-by: Georgi Gerganov <[email protected]>

29 Nov 19:42

github-actions

b4224

3a8e9af

imatrix : support combine-only (#10492)

* imatrix-combine-only idea

* ensured that behavior is consistent with the log

28 Nov 20:28

github-actions

b4215

dc22344

ggml : remove redundant copyright notice + update authors

28 Nov 00:22

github-actions

b4202

9f91251

common : fix duplicated file name with hf_repo and hf_file (#10550)

26 Nov 23:28

github-actions

b4191

c9b00a7

ci : fix cuda releases (#10532)

26 Nov 04:04

github-actions

b4174

0eb4e12

vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)

vulkan-shaders-gen was not parsing the --no-clean argument correctly.
The previous code only handled arguments that take a value, and
--no-clean takes none, so it was mishandled. This commit makes the
parser handle arguments that don't have values.
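The distinction between options that take a value and value-less flags can be sketched as follows. This is an illustrative parser, not the actual vulkan-shaders-gen code: `ParsedArgs`, `parse_args`, and the `value_less` set are hypothetical names, and it assumes every option starts with `--`.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: options with a value and bare flags are kept apart.
struct ParsedArgs {
    std::map<std::string, std::string> values; // e.g. "--some-option" -> "value"
    std::set<std::string> flags;               // e.g. "--no-clean"
};

// Hypothetical parser. `value_less` lists the options that take no value,
// so the token after them is not consumed as their argument.
ParsedArgs parse_args(const std::vector<std::string>& args,
                      const std::set<std::string>& value_less) {
    ParsedArgs out;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg.rfind("--", 0) != 0) {
            continue; // not an option: ignore stray tokens in this sketch
        }
        if (value_less.count(arg) != 0) {
            out.flags.insert(arg);          // flag: nothing to consume
        } else if (i + 1 < args.size()) {
            out.values[arg] = args[i + 1];  // option: consume the next token
            ++i;
        }
    }
    return out;
}
```

A parser that always consumes the following token as a value would either drop a trailing flag or swallow the next option, which is the class of bug the commit message describes.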

25 Nov 23:57

github-actions

b4173

0cc6375

Introduce llama-run (#10291)

It's like simple-chat but it uses smart pointers to avoid manual
memory cleanups. Fewer memory leaks in the code now. Avoid printing
multiple dots. Split code into smaller functions. Uses no exception
handling.

Signed-off-by: Eric Curtin <[email protected]>
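The smart-pointer approach mentioned above can be sketched in general terms. This is not the llama-run source: `demo_handle`, `demo_handle_create`, and `demo_handle_free` are hypothetical stand-ins for a C-style create/free API, wrapped in a `std::unique_ptr` with a custom deleter so cleanup happens on every return path. Errors are reported through return codes, in line with the "no exception handling" note.

```cpp
#include <memory>

// Hypothetical C-style resource standing in for a library handle.
struct demo_handle {
    int id;
};

// Counts live handles so the automatic cleanup can be observed.
static int g_live_handles = 0;

demo_handle* demo_handle_create(int id) {
    ++g_live_handles;
    return new demo_handle{id};
}

void demo_handle_free(demo_handle* h) {
    --g_live_handles;
    delete h;
}

// Custom deleter: unique_ptr calls this instead of `delete`.
struct demo_handle_deleter {
    void operator()(demo_handle* h) const { demo_handle_free(h); }
};

using demo_handle_ptr = std::unique_ptr<demo_handle, demo_handle_deleter>;

// Returns the handle id, or -1 on failure. No manual free is needed:
// the handle is released when `h` goes out of scope.
int use_handle(int id) {
    demo_handle_ptr h(demo_handle_create(id));
    if (!h) {
        return -1;
    }
    return h->id;
}
```

Because the deleter is part of the pointer type, early returns cannot skip the free call.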

25 Nov 21:25

github-actions

b4170

47f931c

server : enable cache_prompt by default (#10501)

ggml-ci
