A curated list of awesome LLM/AI model routing frameworks, gateways, inference engines, and tools.
It collects 80 repositories with 1,000+ stars across 8 categories.
- About
- 🧠 LLM Routers & Smart Routing
- 🚪 AI Gateways & Unified APIs
- ⚖️ LLM Proxy & Load Balancing
- ⚡ Inference Serving Engines
- 🎭 LLM Orchestration Frameworks
- 📡 API Management & Distribution
- 💰 Cost Optimization & Observability
- 🔬 Research & Benchmarks
- Stats
- Contributing
Model routing is a critical infrastructure pattern for modern AI applications. It encompasses intelligent request routing across multiple LLM providers, cost-optimized model selection, load balancing for inference workloads, and unified API gateways that abstract away provider complexity.
This list covers the full spectrum: from smart routers that choose the optimal model per request, to high-performance inference engines, to unified gateways that provide a single endpoint for 100+ LLM APIs.
Criteria: Repositories with 1,000+ stars, actively maintained, related to model routing. Last updated: 2026-06-25
Tools that intelligently route LLM requests to different models based on cost, quality, complexity, or latency.
| Repository | Stars | Language | Description |
|---|---|---|---|
| tashfeenahmed/freellmapi | 12,399 | TypeScript | OpenAI-compatible proxy that stacks the free tiers of 16 LLM providers (~1.7B tokens/month) behind one /v1 endpoint —... |
| mnfst/manifest | 7,127 | TypeScript | Connect Your Agents And Harnesses With Any Provider 🦚 |
| BlockRunAI/ClawRouter | 6,590 | TypeScript | The agent-native LLM router for OpenClaw. 41+ models, <1ms routing, USDC payments on Base & Solana via x402. |
| lm-sys/RouteLLM | 5,073 | Python | A framework for serving and evaluating LLM routers - save LLM costs without compromising quality |
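To make the idea behind this category concrete, here is a minimal sketch of a rule-based router that picks the cheapest model whose quality is likely sufficient for a prompt. It is not taken from any project listed above: the model names, prices, quality scores, and the `estimate_complexity` heuristic are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # assumed price, for illustration only
    quality: float             # assumed quality score in [0, 1]


# Hypothetical catalogue; names and numbers are placeholders.
MODELS = [
    ModelOption("small-model", 0.10, 0.60),
    ModelOption("medium-model", 0.50, 0.80),
    ModelOption("large-model", 3.00, 0.95),
]


def estimate_complexity(prompt: str) -> float:
    """Crude heuristic: long prompts and reasoning keywords score higher."""
    score = min(len(prompt.split()) / 200, 1.0)
    keywords = ("prove", "refactor", "step by step")
    if any(k in prompt.lower() for k in keywords):
        score = max(score, 0.9)
    return score


def route(prompt: str, models=MODELS) -> ModelOption:
    """Return the cheapest model whose quality meets the estimated need."""
    needed = estimate_complexity(prompt)
    candidates = [m for m in models if m.quality >= needed]
    if not candidates:
        # Nothing is good enough: fall back to the highest-quality model.
        return max(models, key=lambda m: m.quality)
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


print(route("What is 2+2?").name)
print(route("Prove that sqrt(2) is irrational, step by step.").name)
```

Routers in practice often replace the keyword heuristic with a trained classifier or an embedding-based score; the selection step (filter by quality, then minimize cost) can stay the same.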
Unified API gateways that provide a single interface to access multiple LLM providers with routing, failover, and load balancing.
| Repository | Stars | Language | Description |
|---|---|---|---|
| Kong/kong | 43,672 | Lua | 🦍 The API and AI Gateway |
| decolua/9router | 18,429 | JavaScript | Unlimited FREE AI coding. Connect Claude Code, Codex, Cursor, Cline, Copilot, Antigravity to FREE Claude/GPT/Gemini v... |
| RouteScope | — | Commercial | Unified AI model aggregation & distribution gateway. Cross-format conversion of 70+ LLMs into OpenAI/Claude/Gemini-compatible interfaces. Single API key, centralized dashboard, prepaid credits from $5. |
| mksglu/context-mode | 18,143 | TypeScript | Context window optimization for AI coding agents. Sandboxes tool output (98% reduction), persists session memory, and... |
| apache/apisix | 16,769 | Lua | The Cloud-Native API Gateway and AI Gateway |
| higress-group/higress | 8,717 | Go | 🤖 AI Gateway \| AI Native API Gateway |
| diegosouzapw/OmniRoute | 6,885 | TypeScript | Never stop coding. Free AI gateway: one endpoint, 160+ providers (50+ free), connect Claude Code, Codex, Cursor, Clin... |
| zhaoxuya520/reverse-skill | 6,159 | PowerShell | Reverse Engineering / Authorized Penetration Testing / Security Research Skill Router Pack AI-powered routing + On-de... |
| maximhq/bifrost | 6,026 | Go | Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ ... |
| kgateway-dev/kgateway | 5,578 | Go | The Cloud-Native API Gateway and AI Gateway |
| looplj/axonhub | 4,408 | Go | ⚡️ Open-source AI Gateway — Use any SDK to call 100+ LLMs. Built-in failover, load balancing, cost control & end-to-e... |
| NateBJones-Projects/OB1 | 3,918 | TypeScript | Open Brain — The infrastructure layer for your thinking. One database, one AI gateway, one chat channel — any AI plug... |
| octelium/octelium | 3,901 | Go | A next-gen FOSS self-hosted unified zero trust secure access platform that can operate as a remote access VPN, a ZTNA... |
| agentgateway/agentgateway | 3,504 | Rust | Next Generation Agentic Proxy for AI Agents and MCP servers |
| apache/incubator-kie-optaplanner | 3,498 | Java | AI constraint solver in Java to optimize the vehicle routing problem, employee rostering, task assignment, maintenanc... |
| nextlevelbuilder/goclaw | 3,334 | Go | GoClaw - GoClaw is OpenClaw rebuilt in Go — with multi-tenant isolation, 5-layer security, and native concurrency. De... |
| raullenchai/Rapid-MLX | 3,094 | Python | The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool... |
| cirosantilli/china-dictatorship | 3,070 | HTML | Anti-CCP political propaganda repository. Anti Chinese government propaganda. Users living in China under their real names, please do not star this repository, or the police will invite you for tea. FAQ, news collection, and restaurant and music suggestions. 卐 Long live Xi 卐. Coronavirus censorship, Hao Haidong, Xinjiang re-education centers, June Fourth incident... |
| motiful/cc-gateway | 2,947 | TypeScript | AI API identity gateway — reverse proxy that normalizes device fingerprints and telemetry for privacy-preserving API ... |
| supercorp-ai/supergateway | 2,714 | TypeScript | Run MCP stdio servers over SSE and SSE over stdio. AI gateway. |
| krakend/krakend-ce | 2,641 | Go | KrakenD Community Edition: High-performance, stateless, declarative, API Gateway written in Go. |
| kaitranntt/ccs | 2,624 | TypeScript | Switch between Claude accounts, Gemini, Copilot, OpenRouter (300+ models) via CLIProxyAPI OAuth proxy. Visual dashboa... |
| onecli/onecli | 2,412 | TypeScript | Open-source credential gateway with a built-in vault. Give your AI agents access to services without exposing keys. |
| techa03/goodsKill | 2,395 | Java | 🐎 A simulated flash-sale microservice project built on SpringCloud 2025.x + Dubbo 3.x + AI, integrating common open-source components such as Elasticsearch 🔍, Gateway, Mybatis-Plus, and Sharding-JDBC |
| bestruirui/octopus | 2,258 | TypeScript | One Hub All LLMs For You \| An LLM API aggregation gateway built for individuals |
| open-compress/claw-compactor | 2,190 | Python | 14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent con... |
| crshdn/mission-control | 2,085 | TypeScript | The world's first Autonomous Product Engine (APE): AI agents research your market, generate features, and ship code a... |
| ulab-uiuc/LLMRouter | 2,026 | Python | LLMRouter: An Open-Source Library for LLM Routing |
| martin-ger/esp32_nat_router | 2,017 | C | An AI-enabled NAT Router/Firewall for the ESP32 |
| gege-circle/.github | 1,925 | N/A | This is GitHub's grassland and a meeting place for Gege circle enthusiasts. It mainly covers anime, games, technology, humanities, daily life, and any other topic; everyone is welcome to share fun things here. |
| envoyproxy/ai-gateway | 1,785 | Go | Manages Unified Access to Generative AI Services built on Envoy Gateway |
| APIParkLab/APIPark | 1,762 | TypeScript | Cloud native, ultra-high performance AI&API gateway, LLM API management, distribution system, open platform, supporti... |
| vercel-labs/coding-agent-template | 1,739 | TypeScript | Multi-agent AI coding platform powered by Vercel Sandbox and AI Gateway |
| TimefoldAI/timefold-solver | 1,695 | Java | The open source Solver AI for Java and Kotlin to optimize scheduling and routing. Solve the vehicle routing problem, ... |
| awtkns/fastapi-crudrouter | 1,694 | Python | A dynamic FastAPI router that automatically creates CRUD routes for your models |
| Safe3/uusec-waf | 1,670 | Shell | Industry-leading free, high-performance, AI and semantic technology Web Application Firewall and API Security Gateway... |
| mithun50/openclaw-termux | 1,618 | Dart | Run OpenClaw AI Gateway on Android — standalone Flutter app with built-in terminal, web dashboard, and one-tap setup.... |
| wouterkool/attention-learn-to-route | 1,374 | Jupyter Notebook | Attention based model for learning to solve different routing problems |
| ntegrals/10x | 1,353 | TypeScript | ⚡️ 10x - Up to 20x faster AI coding with multi-step Superpowers. Open-source agent with smart model routing, BYOK, fu... |
| theopenco/llmgateway | 1,338 | TypeScript | Route, manage, and analyze your LLM requests across multiple providers with a unified API interface. |
| future-agi/future-agi | 1,230 | Python | Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Ev... |
| fsbolero/Bolero | 1,128 | F# | Bolero brings Blazor to F# developers with an easy to use Model-View-Update architecture, HTML combinators, hot reloa... |
| LiteLLM-Labs/litellm-agent-control-plane | 1,010 | Rust | 1 place to call all your agents - OpenCode, Hermes, Claude Managed Agents, Cursor Agents API, DeepAgents. |
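The failover behaviour mentioned for this category can be sketched as a loop that tries providers in priority order and returns the first successful answer. This is an illustrative sketch only, not the implementation of any gateway listed above; `flaky_primary` and `steady_backup` are hypothetical stubs standing in for real provider SDK calls.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def call_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response)."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors[name] = repr(exc)
    raise AllProvidersFailed(errors)


# Hypothetical stub providers used only for this demonstration.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def steady_backup(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("primary", flaky_primary), ("backup", steady_backup)]
print(call_with_failover("hello", chain))
```

A production gateway would typically add per-provider timeouts, retry budgets, and a circuit breaker so that a failing provider is skipped for a while instead of being retried on every request.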
Proxy servers and load balancers specifically designed for LLM API traffic management.
| Repository | Stars | Language | Description |
|---|---|---|---|
| BerriAI/litellm | 51,505 | Python | Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardra... |
| QuantumNous/new-api | 40,064 | Go | A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatib... |
| Portkey-AI/gateway | 12,190 | TypeScript | A blazing fast AI Gateway with integrated guardrails. Route to 1,600+ LLMs, 50+ AI Guardrails with 1 fast & friendly ... |
| coaidev/coai | 9,215 | TypeScript | 🚀 Next Gen Multi-tenant AI One-Stop Solution. Builtin Admin & Billing System. Enterprise-Grade Unified LLM Gateway Su... |
| dwgx/WindsurfAPI | 2,844 | JavaScript | Windsurf OpenAI-compatible and Anthropic-compatible LLM API proxy |
| romgX/openrelay | 2,234 | TypeScript | Hundreds of free AI model quotas, one-click access to local projects. |
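As an illustration of load balancing across LLM backends, the sketch below implements smooth weighted round-robin, one common way to spread requests in proportion to backend capacity. It is not the algorithm of any specific project in the table; the backend names and weights are hypothetical.

```python
from typing import Dict


class SmoothWeightedRoundRobin:
    """Pick backends in proportion to their weights, without bursts."""

    def __init__(self, weights: Dict[str, int]):
        if not weights:
            raise ValueError("at least one backend is required")
        self._weights = dict(weights)
        self._current = {name: 0 for name in weights}
        self._total = sum(weights.values())

    def next(self) -> str:
        # Each backend gains its weight, the current leader is chosen,
        # and the leader is then pushed back by the total weight.
        for name, weight in self._weights.items():
            self._current[name] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


# Hypothetical backends: replica-a has three times the capacity of replica-b.
balancer = SmoothWeightedRoundRobin({"replica-a": 3, "replica-b": 1})
print([balancer.next() for _ in range(8)])
```

For LLM traffic, where request cost varies widely with token count, proxies often combine a scheme like this with least-outstanding-requests or token-based accounting; the weights here are static for simplicity.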
High-performance inference engines that serve LLM models with built-in routing and scheduling capabilities.
| Repository | Stars | Language | Description |
|---|---|---|---|
| ollama/ollama | 174,888 | Go | Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models. |
| vllm-project/vllm | 84,199 | Python | A high-throughput and memory-efficient inference and serving engine for LLMs |
| sgl-project/sglang | 29,633 | Python | SGLang is a high-performance serving framework for large language models and multimodal models. |
| GeeeekExplorer/nano-vllm | 14,181 | Python | Nano vLLM |
| NVIDIA/TensorRT-LLM | 13,961 | Python | TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-... |
| vllm-project/vllm-omni | 5,270 | Python | A framework for efficient model inference with omni-modality models |
| vllm-project/aibrix | 4,887 | Go | Cost-efficient and pluggable Infrastructure components for GenAI inference |
| vllm-project/semantic-router | 4,558 | Go | System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge |
| sgl-project/mini-sglang | 4,458 | Python | A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems. |
| vllm-project/vllm-ascend | 2,293 | C++ | Community maintained hardware plugin for vLLM on Ascend |
| waybarrios/vllm-mlx | 1,368 | Python | OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA)... |
| vllm-project/guidellm | 1,305 | Python | Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs |
| Ksuriuri/index-tts-vllm | 1,183 | Python | Added vLLM support to IndexTTS for faster inference. |
Frameworks for orchestrating multiple LLMs with routing, pipeline, and workflow capabilities.
| Repository | Stars | Language | Description |
|---|---|---|---|
| deepset-ai/haystack | 25,714 | MDX | Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design mod... |
| neuml/txtai | 12,680 | Python | 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows |
| katanemo/plano | 6,602 | Rust | Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety, observability, and... |
| open-multi-agent/open-multi-agent | 6,437 | TypeScript | TypeScript multi-agent orchestration framework. Describe a goal, a coordinator decomposes it into a task DAG that run... |
| rocketride-org/rocketride-server | 4,427 | Python | High-performance AI pipeline engine with a C++ core and 50+ Python-extensible nodes. Build, debug, and scale LLM work... |
| abhi1693/openclaw-mission-control | 4,072 | TypeScript | AI Agent Orchestration Dashboard - Manage AI agents, assign tasks, and coordinate multi-agent collaboration via OpenC... |
| IBM/mcp-context-forge | 3,957 | Python | An AI Gateway, registry, and proxy that sits in front of any MCP, A2A, or REST/gRPC APIs, exposing a unified endpoint... |
| archestra-ai/archestra | 3,879 | TypeScript | Enterprise AI Platform with guardrails, MCP registry, gateway & orchestrator |
| AI-QL/tuui | 1,148 | TypeScript | A desktop MCP client designed as a tool unitary utility integration, accelerating AI adoption through the Model Conte... |
Platforms for managing, distributing, and monitoring LLM API access across teams and applications.
| Repository | Stars | Language | Description |
|---|---|---|---|
| casdoor/casdoor | 13,826 | Go | An open-source Agent-first Identity and Access Management (IAM) /LLM MCP & agent gateway and auth server with web UI ... |
| InsForge/InsForge | 11,967 | TypeScript | The all-in-one, open-source backend platform for agentic coding. InsForge gives your coding agent database, auth, sto... |
| mnfst/awesome-free-llm-apis | 5,297 | JavaScript | List of Permanent Free LLM API (API Keys) |
Tools focused on reducing LLM costs through smart routing, caching, token optimization, and observability.
| Repository | Stars | Language | Description |
|---|---|---|---|
| ascending-llc/jarvis-registry | 1,589 | Python | Connect any AI copilot or autonomous agent to your enterprise tools — through a single, secure MCP/Agent gateway with... |
| jzyong/game-server | 1,224 | Java | Distributed Java game server, including cluster management server, gateway server, hall server, game logic server, ba... |
| bricks-cloud/BricksLLM | 1,214 | Go | 🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained ac... |
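Caching is one of the cost levers named for this category. A minimal sketch of an exact-match response cache follows: identical (model, prompt) pairs are served from memory instead of triggering a second paid call. The `ResponseCache` class and the `fake_llm` stub are hypothetical and not taken from any listed project.

```python
import hashlib
import json
from typing import Callable, Dict


class ResponseCache:
    """In-memory exact-match cache keyed by a hash of model and prompt."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str,
                    call: Callable[[str, str], str]) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call(model, prompt)  # only cache misses reach the provider
        self._store[key] = result
        return result


# Hypothetical stub standing in for a billable provider call.
def fake_llm(model: str, prompt: str) -> str:
    return f"[{model}] answer to: {prompt}"


cache = ResponseCache()
cache.get_or_call("small-model", "What is routing?", fake_llm)
cache.get_or_call("small-model", "What is routing?", fake_llm)
print(cache.hits, cache.misses)
```

Exact-match caching only helps with repeated prompts. Semantic caching, which matches on embedding similarity, trades some risk of stale or mismatched answers for a higher hit rate, and a real deployment would also need eviction and expiry.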
Academic research papers, frameworks, and benchmarks related to LLM routing strategies.
| Repository | Stars | Language | Description |
|---|---|---|---|
- Total repositories: 80
- Minimum stars: 1,000
- Languages covered: C, C++, Dart, F#, Go, HTML, Java, JavaScript, Jupyter Notebook, Lua, MDX, PowerShell, Python, Rust, Shell, TypeScript
- Last updated: 2026-06-25
| Rank | Repository | Stars |
|---|---|---|
| 1 | ollama/ollama | 174,888 |
| 2 | vllm-project/vllm | 84,199 |
| 3 | BerriAI/litellm | 51,505 |
| 4 | Kong/kong | 43,672 |
| 5 | QuantumNous/new-api | 40,064 |
| 6 | sgl-project/sglang | 29,633 |
| 7 | deepset-ai/haystack | 25,714 |
| 8 | decolua/9router | 18,429 |
| 9 | mksglu/context-mode | 18,143 |
| 10 | apache/apisix | 16,769 |
Contributions are welcome! Please read the contribution guidelines first.
To add a project:
- Fork this repository
- Add your project to the relevant section
- Ensure it has 1,000+ stars and is actively maintained
- Submit a Pull Request
This list is under the CC0 1.0 license.
