Synaptic Tuner

An agentic-first toolkit for building custom LLMs. Generate synthetic training data, fine-tune models, evaluate quality, and deploy — all driven by AI agents that know the system end-to-end.

License: MIT · Python 3.10+ · CUDA 12+ · PyTorch · W&B optional

Training is powered by Unsloth — huge thanks to their team.

The Problem

Fine-tuning a local LLM is a multi-step pipeline that most people never finish. You need to generate quality training data, format it correctly, pick the right training method, configure dozens of hyperparameters, evaluate the results, then figure out GGUF quantization and deployment. Each step has its own tools, formats, and gotchas. Most tutorials cover one piece — you're left stitching the rest together yourself.

The Solution

Synaptic Tuner handles the full pipeline in one repo, and it's designed to be operated by AI coding agents. Instead of memorizing CLI flags and YAML schemas, you describe what you want in plain English. Built-in skills give your agent deep knowledge of every component — it generates data, trains models, runs evaluations, and deploys, enforcing best practices at each step.

Agentic-first means:

  • The repo ships with project skills covering the entire workflow
  • Skills use progressive disclosure — your agent loads only what it needs, when it needs it
  • Best practices are encoded as protocols (dry-run before generation, interleave KTO datasets, etc.)
  • You describe intent, your agent handles execution — "train a 7B model on my dataset" just works

Skills are written as Markdown — they work with any AI coding tool that supports project-level instructions. Claude Code is the reference integration, but the knowledge transfers to Cursor, Windsurf, Cline, Roo Code, and others. There's also a full interactive CLI and Colab notebooks if you prefer working without an agent.

Quick Start

| Path | How |
| --- | --- |
| Claude Code (recommended) | Open the repo in Claude Code and tell it what you want |
| Interactive CLI | `./run.sh` (Linux/WSL) or `.\run.ps1` (PowerShell) |
| Beginner (no GPU) | `Trainers/notebooks/sft_colab_beginner.ipynb` in Google Colab |

Using with Claude Code

This repo is built to be operated by Claude Code. It has skills covering the entire pipeline — just describe what you want.

Setup: "Set this repo up for me" — Claude checks your platform, runs environment setup, helps create .env with credentials, and verifies with a dry run.

What you can ask:

| Task | Example | Skill |
| --- | --- | --- |
| Generate training data | "Generate 50 examples of the search scenario" | synthetic-data-generation |
| Write scenarios/rubrics | "Write a new scenario for content operations" | synthetic-data-generation |
| Train a model | "Train a 7B model on my dataset" | fine-tuning |
| Evaluate a model | "Compare my fine-tuned model against the base" | evaluation |
| Upload to HuggingFace | "Upload with GGUF quantizations" | upload-deployment |
| Full pipeline | "Train, evaluate, and upload if it looks good" | All skills |

Skills use progressive disclosure — lean SKILL.md files auto-load, detailed reference docs in reference/ load on demand. Best practices are enforced automatically (dry-run before generation, interleave KTO datasets, use merged_16bit for GGUF, etc.).
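The progressive-disclosure pattern described above can be sketched in a few lines. This is an illustrative sketch only, not the actual loader Claude Code uses; the `SKILL.md` and `reference/` layout comes from the paragraph above, while the function names are hypothetical:

```python
from pathlib import Path

def load_skill(skill_dir):
    """Eagerly load the lean SKILL.md; return a lazy loader for reference docs.

    The lean summary is always read, but the detailed docs under
    reference/ are only read when the agent actually requests one.
    """
    skill_dir = Path(skill_dir)
    summary = (skill_dir / "SKILL.md").read_text()

    def load_reference(name):
        # Detailed reference docs load on demand, keeping context small.
        return (skill_dir / "reference" / name).read_text()

    return summary, load_reference
```

The point is that only the small summary occupies the agent's context by default; the heavyweight references cost nothing until they are needed.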

Using with Other AI Coding Tools

The skills in .agents/skills/ are plain Markdown — they work with any AI coding tool. Most platforms use AGENTS.md as their entrypoint (Claude Code uses CLAUDE.md). Copy the skill files to your platform's rules directory, or use the universal .skills/ folder at your project root:

| Platform | Where to put skills |
| --- | --- |
| Cursor | `.cursor/rules/` (rename `.md` to `.mdc`) |
| Windsurf | `.windsurf/rules/` |
| Cline | `.clinerules/` |
| Roo Code | `.roo/rules/` |
| Amazon Q | `.amazonq/rules/` |
| JetBrains AI | `.aiassistant/rules/` |
| Augment | `.augment/rules/` |
| Kilo Code | `.kilocode/rules/` |
| Tabnine | `.tabnine/guidelines/` |
| Zed, Aider, GitHub Copilot, others | `.skills/` at project root |

Most platforms auto-discover Markdown in their rules directory. For tools that use AGENTS.md, point it at the skills folder or reference the skill files directly.

The Pipeline

SynthChat (env-backed data)  →  SFT  →  merge/publish Nexus model  →  KTO  →  env-GRPO  →  Evaluate  →  Upload/Deploy
| Stage | Tool | Key Config |
| --- | --- | --- |
| Generate env-backed data | `python3 -m SynthChat.run generate` | `SynthChat/scenarios/`, `SynthChat/config/settings.yaml`, `SynthChat/config/targets_*.json` |
| Project datasets | `python3 SynthChat/scripts/project_rollout_datasets.py` | `Datasets/environment_rollouts/`, `Datasets/kto/`, `Datasets/grpo/` |
| Train SFT | `python tuner.py trainsft` | `Trainers/sft/configs/config.yaml` |
| Train KTO | `python tuner.py trainkto` | `Trainers/kto/configs/config.yaml` |
| Train env-GRPO | `python tuner.py traingrpo` | `Trainers/grpo/configs/env_config.yaml` |
| Cloud env-GRPO | `python tuner.py cloud-run --job-config Trainers/cloud/jobs/nexus_quark_l25_28_env_grpo.yaml` | HF Jobs config + `Trainers/grpo/configs/env_config.yaml` |
| Evaluate | `python -m Evaluator.cli --backend lmstudio --model MODEL` | `Evaluator/config/scenarios/` |
| Upload / merge | `python tuner.py modelops` or upload scripts | Hugging Face / Nexus model repos |
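The training stages above can be strung together with a thin driver. This is a minimal sketch, not part of the repo: the commands are copied from the table, but the stage selection, ordering, and fail-fast behavior are assumptions (evaluation and upload are intentionally left out since they usually warrant a human look first):

```python
import subprocess
import sys

# Training stages in pipeline order, as listed in the table above.
PIPELINE = [
    ["python3", "-m", "SynthChat.run", "generate"],
    ["python3", "SynthChat/scripts/project_rollout_datasets.py"],
    ["python", "tuner.py", "trainsft"],
    ["python", "tuner.py", "trainkto"],
    ["python", "tuner.py", "traingrpo"],
]

def run_pipeline(stages):
    """Run each stage in order, stopping at the first failure."""
    for cmd in stages:
        print(f"--> {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            sys.exit(f"Stage failed: {' '.join(cmd)}")
```

In practice the agentic path replaces this script: you describe the intent and the agent decides which stages to run and inspects the output of each.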

Current GRPO split

  • Trainers/grpo/configs/config.yaml is the older static projected-dataset GRPO path.
  • Trainers/grpo/configs/env_config.yaml is the current environment-backed multi-step GRPO path.
  • Local NVIDIA train -> grpo now routes to the env-backed trainer and will bootstrap the isolated GRPO runtime if it is missing.
  • The canonical alignment flow is documented in .agents/skills/fine-tuning/protocols/environment-backed-alignment-pipeline.md.

Dataset Format

JSONL with a `conversations` array per record. Tool call structure is fully configurable.

```json
{
  "conversations": [
    {"role": "user", "content": "Create a new folder called Projects"},
    {"role": "assistant", "content": null, "tool_calls": [{"type": "function", "function": {"name": "createFolder", "arguments": "{\"path\": \"/Projects\"}"}}]}
  ],
  "label": true
}
```
  • SFT: Positive examples only, label ignored
  • KTO: Interleaved true/false labels required
  • GRPO (static): Prompts + ground truth for reward scoring
  • env-GRPO: Canonical SynthChat rollout records with environment config, stage reviews, and replayable prompts
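Because KTO requires interleaved `true`/`false` labels, it is worth sanity-checking a dataset before training. This is a minimal illustrative sketch, not a repo tool: the field names follow the JSONL example above, and treating a run of three or more identical labels as "poorly interleaved" is an assumption about what interleaving means here:

```python
import json

def check_kto_interleaving(jsonl_text):
    """Return a list of problems found in a KTO JSONL dataset:
    records missing 'label', a single label value, or long same-label runs."""
    problems = []
    labels = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        if "label" not in record:
            problems.append(f"line {lineno}: missing 'label'")
            continue
        labels.append(record["label"])
    if labels and len(set(labels)) < 2:
        problems.append("dataset contains only one label value")
    # Flag runs of 3+ identical labels as poorly interleaved (assumed threshold).
    run = 1
    for prev, cur in zip(labels, labels[1:]):
        run = run + 1 if prev == cur else 1
        if run == 3:
            problems.append("run of 3+ identical labels detected")
            break
    return problems
```

An empty result means the dataset at least passes these structural checks; the project skills enforce the actual interleaving protocol during generation.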

Repository Map

```
Synaptic-Tuner/
├── SynthChat/              # Synthetic data generation (scenarios, rubrics, config)
├── Trainers/
│   ├── sft/                # SFT training
│   ├── kto/                # KTO training
│   ├── grpo/               # Static GRPO + env-backed GRPO
│   └── notebooks/          # Colab notebooks (beginner + advanced)
├── Evaluator/              # Model evaluation (scenarios, backends, results)
├── Datasets/               # Training data + canonical environment rollouts
├── shared/                 # Shared infra (LLM client, upload, validation, UI)
├── tuner/                  # Unified CLI (used by run.sh)
├── .agents/skills/         # Project skills and protocols
└── CLAUDE.md               # Project-wide dev guide
```

Environment

```bash
# .env in repo root (auto-loaded by CLI)
HF_TOKEN=hf_...                       # HuggingFace (required for uploads)
OPENROUTER_API_KEY=sk-or-...          # OpenRouter (for generation/improvement)
WANDB_API_KEY=...                     # Weights & Biases (optional)
LMSTUDIO_HOST=localhost               # LM Studio host
OLLAMA_HOST=http://localhost:11434    # Ollama endpoint
```
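A quick preflight can confirm which of these credentials are actually set before kicking off a run. This is an illustrative sketch, not the loader the CLI uses: the variable names come from the listing above, while the parsing and the required/optional split are assumptions:

```python
# Variables from the .env listing above. True = needed for the core
# generate/upload flows (an assumption based on the listing's comments).
EXPECTED = {
    "HF_TOKEN": True,            # required for uploads
    "OPENROUTER_API_KEY": True,  # required for generation/improvement
    "WANDB_API_KEY": False,
    "LMSTUDIO_HOST": False,
    "OLLAMA_HOST": False,
}

def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.split("#")[0].strip()
    return env

def missing_required(env):
    """List expected variables that are required but absent or empty."""
    return [k for k, required in EXPECTED.items() if required and not env.get(k)]
```

Real `.env` files have more syntax (quoting, `export`, multi-line values), so a production loader such as python-dotenv is the better choice; this only shows the shape of the check.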

License

MIT.
