Popular repositories

  1. rotorquant Public

    KV cache compression via block-diagonal rotation. Beats TurboQuant: lower perplexity (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, and 44x fewer parameters. Drop-in llama.cpp integration.

    Python 994 85

  2. dLLM-castlehill Public

    Forked from pengzhangzhi/Open-dLLM

    A dLLM mashup of papers. Question: can we get 400 tokens/second on a 5090?

    Python 6

  3. prompt-packs Public

    Official Scrya prompt packs for Episode Interactive-style pixel art character generation

    5 1

  4. chat-ui-dart Public

    Dart

People

This organization has no public members. You must be a member to see who’s a part of this organization.
