ComfyUI + FLUX.1 Schnell

Fast image generation with FLUX.1 Schnell, Black Forest Labs' distilled model that generates high-quality images in just 4 steps, roughly 6x faster than SDXL on an A10G (~5s vs ~30s per image). Deployed via ComfyUI's workflow-graph runtime.

GPU: 1x A10G or L4 (24GB VRAM) · Cold start: ~120s · API: Native ComfyUI (not OpenAI-compatible)

Prerequisites

  • Convox rack v3.24.6+ with GPU-capable nodes (g5.xlarge recommended)
  • A HuggingFace access token (optional — avoids download rate limits; FLUX.1 Schnell is Apache 2.0, not gated)

Five-Minute Quickstart

git clone https://github.com/convox-examples/inference-examples.git
cd inference-examples/flux-schnell

convox apps create flux
convox env set HUGGING_FACE_HUB_TOKEN=hf_your_token_here -a flux
convox deploy -a flux

The first deploy builds the Docker image and downloads FLUX.1 Schnell weights (~12GB). Subsequent deploys use cached layers.

Get the Endpoint

convox services -a flux
SERVICE  DOMAIN                                  PORTS
api      api.flux.org-abc123.convox.cloud         443:8188

Test It

Queue a generation (API format):

ENDPOINT=$(convox services -a flux | awk '$1 == "api" {print $2}')

# Queue a FLUX generation (4 steps — fast!)
PROMPT_ID=$(jq '{prompt: .}' workflow-api.json | \
  curl -s "https://$ENDPOINT/prompt" \
  -H "Content-Type: application/json" \
  -d @- | jq -r '.prompt_id')

echo "Queued: $PROMPT_ID"

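# Poll until the prompt shows up in history (~5-10s for FLUX Schnell at 1024x1024; longer on the first run while the model loads)
until curl -s "https://$ENDPOINT/history/$PROMPT_ID" | jq -e --arg id "$PROMPT_ID" 'has($id)' > /dev/null; do
  sleep 2
done
curl -s "https://$ENDPOINT/history/$PROMPT_ID" | jq --arg id "$PROMPT_ID" '.[$id].status'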
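
The same queue-then-poll flow can be sketched in Python using only the standard library. This is a minimal sketch, not code from this repository: `fetch_json`, `queue_prompt`, and `wait_for_result` are hypothetical helper names, and it assumes `/history/{prompt_id}` returns an empty object until the prompt has finished.

```python
import json
import time
import urllib.request


def fetch_json(url, payload=None):
    """GET url, or POST payload as JSON when one is given; return the decoded JSON body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def queue_prompt(base_url, workflow, fetch=fetch_json):
    """Wrap an API-format workflow as {"prompt": ...} and POST it to /prompt."""
    reply = fetch(f"{base_url}/prompt", {"prompt": workflow})
    return reply["prompt_id"]


def wait_for_result(base_url, prompt_id, fetch=fetch_json,
                    interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll /history/{prompt_id} until an entry for the prompt appears."""
    deadline = time.monotonic() + timeout
    while True:
        history = fetch(f"{base_url}/history/{prompt_id}")
        if prompt_id in history:  # assumed: empty object until the prompt is done
            return history[prompt_id]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prompt {prompt_id} did not finish in {timeout}s")
        sleep(interval)
```

With `workflow` loaded from workflow-api.json and `base_url` set to `https://` plus the service domain, `wait_for_result(base_url, queue_prompt(base_url, workflow))` returns the history entry for the finished prompt. The `fetch` and `sleep` parameters exist only so the functions can be exercised without a live endpoint.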

Custom prompt:

# Modify the workflow text and submit
jq '.["6"].inputs.text = "a cyberpunk cityscape at night, neon lights, rain" | {prompt: .}' workflow-api.json | \
  curl -s "https://$ENDPOINT/prompt" \
    -H "Content-Type: application/json" \
    -d @- | jq .
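
For callers that build requests in code rather than with jq, the same edit can be sketched in Python. The helper name `with_prompt_text` is hypothetical, and the inline `example` workflow is an illustrative stand-in for workflow-api.json (only the node id "6" and its `inputs.text` field come from the jq command above; the `class_type` shown is an assumption).

```python
import copy
import json


def with_prompt_text(workflow, text, node_id="6"):
    """Return a /prompt payload with one node's text input replaced."""
    updated = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    updated[node_id]["inputs"]["text"] = text
    return {"prompt": updated}


# Stand-in for the contents of workflow-api.json (class_type is assumed).
example = {"6": {"class_type": "CLIPTextEncode", "inputs": {"text": "old prompt"}}}
payload = with_prompt_text(example, "a cyberpunk cityscape at night, neon lights, rain")
print(json.dumps(payload, indent=2))
```

Copying the workflow before editing keeps the loaded template reusable, so several prompts can be queued from one file without the changes leaking into each other.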

Browse the UI:

Open https://<endpoint>/ in your browser to access the full ComfyUI graph editor.

API Endpoints

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /prompt | POST | Queue a workflow for execution |
| /history/{prompt_id} | GET | Check generation status and get outputs |
| /view | GET | Retrieve generated images |
| /system_stats | GET | System info and GPU memory (VRAM) usage |
| / | GET | ComfyUI web interface |
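
To go from a finished `/history/{prompt_id}` entry to downloadable images, the `/view` URLs can be assembled as in the sketch below. The helper name `image_urls` is hypothetical, and the entry layout (`outputs` keyed by node id, each with an `images` list of `filename`, `subfolder`, and `type`) is an assumption about ComfyUI's usual response shape, not something this README specifies.

```python
from urllib.parse import urlencode


def image_urls(base_url, history_entry):
    """Build a /view URL for every image listed in a finished history entry."""
    urls = []
    for node_output in history_entry.get("outputs", {}).values():
        for image in node_output.get("images", []):
            query = urlencode({
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
            urls.append(f"{base_url}/view?{query}")
    return urls
```

Each returned URL can then be fetched with a plain GET, for example `curl -s -o out.png "<url>"`.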

FLUX.1 Schnell vs SDXL

| | FLUX.1 Schnell | SDXL 1.0 |
| --- | --- | --- |
| Steps | 4 | 25-30 |
| Time per image (A10G) | ~5s | ~30s |
| Parameters | 12B | 3.5B |
| VRAM | ~20GB | ~12GB |
| Quality | High | High |
| License | Apache 2.0 | CreativeML Open RAIL++-M |

FLUX Schnell's distillation means 4-step inference produces quality comparable to SDXL at 25 steps.

Workflow Format

ComfyUI uses a node-graph JSON format. Export from the ComfyUI desktop app using "Save (API Format)" — this is different from the standard save format. An example workflow-api.json is included in this directory.

The included workflow uses:

  • UNETLoader for FLUX Schnell weights (FP8 quantized)
  • DualCLIPLoader for CLIP-L + T5-XXL text encoders
  • VAELoader for the FLUX autoencoder
  • 4-step Euler sampling with cfg=1.0 (FLUX Schnell is timestep-distilled and does not use classifier-free guidance)
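
A quick way to tell the two export formats apart before submitting is sketched below. It is a heuristic under stated assumptions, not ComfyUI's own validation: it assumes an API-format file is a JSON object mapping node ids to objects with `class_type` and `inputs`, while a UI-format save has top-level keys such as `nodes` and `links`. The function names are hypothetical.

```python
def is_api_format(workflow):
    """Heuristic: API format maps node ids to {"class_type", "inputs"} objects."""
    if not isinstance(workflow, dict) or not workflow:
        return False
    return all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )


def node_types(workflow):
    """Return the sorted class_type names used by an API-format workflow."""
    return sorted({node["class_type"] for node in workflow.values()})
```

Calling `node_types` on the included workflow-api.json should list the loaders named above (UNETLoader, DualCLIPLoader, VAELoader) alongside whatever sampler and output nodes the file contains.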

AWS Instance Sizing

| Instance | GPU | VRAM | Use Case |
| --- | --- | --- | --- |
| g5.xlarge | 1x A10G | 24 GB | Default; FLUX needs ~20GB VRAM |
| g6.xlarge | 1x L4 | 24 GB | Alternative |

FLUX.1 Schnell requires ~20GB VRAM for the 12B parameter model. A T4 (16GB) is not sufficient.
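
The weight portion of that figure follows from simple arithmetic, sketched below. It covers the 12B-parameter model's weights only; the gap up to ~20GB presumably comes from the text encoders, the VAE, and activations, which this sketch does not model. `weight_gb` is a hypothetical helper, and GB here means 10^9 bytes.

```python
def weight_gb(params_billion, bytes_per_param):
    """Approximate weight size in GB (10^9 bytes): parameter count times bytes each."""
    return params_billion * bytes_per_param


fp8 = weight_gb(12, 1)   # FP8 stores one byte per parameter
bf16 = weight_gb(12, 2)  # 16-bit formats store two bytes per parameter
print(fp8, bf16)
```

At one byte per parameter the weights come to about 12 GB, in line with the ~12GB download. At two bytes per parameter they would take about 24 GB, the entire memory of an A10G or L4, which is why the FP8 weights matter on these instances.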

Scaling

The default config keeps one replica warm and scales up to 2 replicas. FLUX generates an image in ~5s, so a single replica handles moderate throughput.
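
As a rough capacity estimate, the ~5s figure translates into a throughput ceiling as sketched below. It assumes each replica renders one image at a time and ignores queueing, network, and model-load time; `images_per_minute` is a hypothetical helper.

```python
def images_per_minute(replicas, seconds_per_image):
    """Upper bound on throughput when each replica renders one image at a time."""
    return replicas * 60 / seconds_per_image


print(images_per_minute(1, 5))  # one warm replica
print(images_per_minute(2, 5))  # scaled out to the 2-replica ceiling
```

Under these assumptions one replica tops out at about 12 images per minute and two replicas at about 24, so sustained demand above that range calls for a higher replica ceiling.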

Budget Controls

convox budget set flux --monthly-cap-usd 100 --at-cap-action alert-only
convox cost -a flux

GPU Observability

Enable GPU telemetry in Rack Settings to surface per-app GPU utilization, memory, temperature, and inference throughput in the Console GPU Dashboard.

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| 429 during build | HuggingFace download rate limit | Set HUGGING_FACE_HUB_TOKEN to avoid throttling (model is public) |
| OOM during generation | FLUX at 1024x1024 uses ~20GB | Use g5.xlarge (24GB); T4 is too small |
| Workflow JSON rejected | Exported in UI format, not API format | Re-export using Save (API Format) in ComfyUI |
| Slow first generation | Model loading into GPU on first prompt | Normal; subsequent generations are fast (~5s) |
