GitHub - moxin-org/Moxin-Studio

A native desktop AI app built with pure Rust and Makepad.

Run local LLMs, generate images, clone voices, and transcribe speech — all on your own hardware, without a Python runtime.

Getting Started | Supported Models | Platform

Getting Started

Requirements

macOS 14.0+ (Sonoma) on Apple Silicon (M1-M5)
Rust 1.82+
Xcode Command Line Tools (xcode-select --install)

1. Install OminiX-API

Moxin Studio uses OminiX-API as the local inference server. Install it first:

curl -fsSL https://raw.githubusercontent.com/OminiX-ai/OminiX-API/main/install.sh | sh

This installs ominix-api to /usr/local/bin and creates ~/.OminiX/ for config and models.

2. Build and run Moxin Studio

git clone https://github.com/moxin-org/Moxin-Studio.git
cd Moxin-Studio
cargo run -p moly-shell --bin moxin-studio

The first build takes a few minutes to compile all dependencies. Subsequent runs are fast.

3. Download a model and start chatting

Open the Model Hub from the sidebar, click Download on any model, then click Load. Moxin Studio will auto-start OminiX-API and route your chat through it.

Features

Local AI inference — Run LLMs, vision models, image generation, speech recognition, and TTS directly on your Mac via OminiX-API
Model Hub — Discover, download, and run models directly from the app
Voice I/O — Speech-to-text and text-to-speech with voice cloning
MCP support — Model Context Protocol for tool use
Chat history — Persistent, searchable conversation history

Supported Local Models

Every model below has a dedicated, optimized implementation — not a generic wrapper. The pure Rust models run directly via OminiX-MLX with Metal GPU acceleration.

LLM — Large Language Models

Model	Implementation	Notes
Qwen3	Pure Rust	0.6B, 4B, 8B variants
Qwen3.5-27B	Pure Rust	Hybrid DeltaNet + Attention
GLM-4	Pure Rust
GLM-4.7-Flash	Pure Rust	MoE + MLA architecture
GLM-4.5 MoE	Pure Rust	Mixture of Experts
Mistral / Nemo	Pure Rust
Mixtral	Pure Rust	MoE
MiniCPM-SALA	Pure Rust	Hybrid attention

VLM — Vision Language Models

Model	Implementation
Qwen3-VL	Pure Rust
Moxin-7B	Pure Rust
DeepSeek-OCR-2	Pure Rust

ASR — Speech Recognition

Model	Implementation	Notes
Qwen3-ASR	Pure Rust	30+ languages
Paraformer (FunASR)	Pure Rust
FunASR-Nano	Pure Rust	Lightweight
SenseVoice + Qwen3-4B	Pure Rust	LLM-enhanced ASR

TTS — Text to Speech

Model	Implementation	Notes
Qwen3-TTS	Pure Rust	Preset voices + voice cloning
GPT-SoVITS	Pure Rust	Zero-shot voice cloning
Step-Audio 2	Pure Rust

Image Generation

Model	Implementation	Notes
FLUX.2-klein	Pure Rust	Also available as GGUF
Z-Image-Turbo	Pure Rust
Qwen-Image-2512	Pure Rust
Qwen-Image-Edit-2511	Python MLX	Image editing
Cosmos Predict2 14B	Python MLX	Text-to-image

Video Generation

Model	Implementation	Notes
Wan2.2 5B	Python MLX	Text-to-video

The Moxin / OminiX Platform

Moxin Studio is the user-facing layer of a three-part pure Rust AI platform:

┌─────────────────────────────────────────────┐
│            Moxin Studio (this repo)         │  Desktop UI (Rust + Makepad)
│         Chat · Models · Voice · Settings    │
└──────────────────────┬──────────────────────┘
                       │ OpenAI-compatible REST/WS
┌──────────────────────▼──────────────────────┐
│               OminiX-API                    │  Local inference server (pure Rust)
│    LLM · ASR · TTS · Image endpoints       │
└──────────────────────┬──────────────────────┘
                       │ Rust crate interface
┌──────────────────────▼──────────────────────┐
│               OminiX-MLX                    │  On-device inference backend
│      Metal-accelerated · MLX framework      │  (Apple Silicon)
└─────────────────────────────────────────────┘

OminiX-MLX — Apple Silicon inference engine. Pure-Rust bindings to Apple's MLX framework with Metal GPU acceleration. Supports LLMs, VLMs, ASR, TTS, and image generation.
OminiX-API — Local inference server. OpenAI-compatible HTTP and WebSocket endpoints with dynamic model loading at runtime.
Moxin Studio (this repo) — Desktop application. Connects to OminiX-API for local inference and cloud providers for remote models.

Project Structure

Moxin-Studio/
├── moly-shell/          # Main application binary
├── moly-data/           # Shared state, persistence, API clients
├── moly-widgets/        # Reusable UI components and theming
└── apps/
    ├── moly-chat/       # Chat interface
    ├── moly-hub/        # Model Hub (discovery, download, load/unload)
    ├── moly-settings/   # Provider and API key configuration
    ├── moly-mcp/        # MCP server configuration
    └── moly-voice/      # Voice I/O

License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 78 Commits
.github/workflows		.github/workflows
apps		apps
moly-data		moly-data
moly-shell		moly-shell
moly-widgets		moly-widgets
vendor/moly-kit		vendor/moly-kit
.gitignore		.gitignore
ADDING_MODELS.md		ADDING_MODELS.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
MAKEPAD_WIDGET_IMPORTS.md		MAKEPAD_WIDGET_IMPORTS.md
MLX_MODEL_PLAN.md		MLX_MODEL_PLAN.md
Plan.md		Plan.md
README.md		README.md
REFACTORING_PLAN.md		REFACTORING_PLAN.md
VOICE_PLAN.md		VOICE_PLAN.md
build-and-run.sh		build-and-run.sh
local_models_config.example.json		local_models_config.example.json
moxin-studio-logo-dark.png		moxin-studio-logo-dark.png
moxin-studio-logo.png		moxin-studio-logo.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Getting Started

Requirements

1. Install OminiX-API

2. Build and run Moxin Studio

3. Download a model and start chatting

Features

Supported Local Models

LLM — Large Language Models

VLM — Vision Language Models

ASR — Speech Recognition

TTS — Text to Speech

Image Generation

Video Generation

The Moxin / OminiX Platform

Project Structure

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Getting Started

Requirements

1. Install OminiX-API

2. Build and run Moxin Studio

3. Download a model and start chatting

Features

Supported Local Models

LLM — Large Language Models

VLM — Vision Language Models

ASR — Speech Recognition

TTS — Text to Speech

Image Generation

Video Generation

The Moxin / OminiX Platform

Project Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages