Popular repositories
- rotorquant Public
KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
- dLLM-castlehill Public Forked from pengzhangzhi/Open-dLLM
dllm mashup of papers - q) can we get 400 tokens / second on a 5090?
Python 6
- prompt-packs Public
Official Scrya prompt packs for Episode Interactive-style pixel art character generation
-
Repositories
- dLLM-castlehill Public Forked from pengzhangzhi/Open-dLLM
dllm mashup of papers - q) can we get 400 tokens / second on a 5090?
- chat-ui-dart Public
- rotorquant Public
KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
- prompt-packs Public
Official Scrya prompt packs for Episode Interactive-style pixel art character generation
People
This organization has no public members. You must be a member to see who’s a part of this organization.