ilyat ilyaters

vLLM (Public)

Python
vlllms (Public)

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python
nvidia-models-macos-app (Public)

Native macOS application for accessing all LLMs hosted by NVIDIA via API

Swift
tokenspeed (Public)

Forked from lightseekorg/tokenspeed

TokenSpeed is a speed-of-light LLM inference engine.

Python
LMCache (Public)

Forked from LMCache/LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python