ComfyUI + FLUX.1 Schnell

Fast image generation with FLUX.1 Schnell, Black Forest Labs' distilled model that generates high-quality images in just 4 steps, roughly 6x faster than SDXL on an A10G (~5s versus ~30s per image). Deployed via ComfyUI's workflow-graph runtime.

GPU: 1x A10G or L4 (24GB VRAM) · Cold start: ~120s · API: Native ComfyUI (not OpenAI-compatible)

Prerequisites

Convox rack v3.24.6+ with GPU-capable nodes (g5.xlarge recommended)
A HuggingFace access token (optional — avoids download rate limits; FLUX.1 Schnell is Apache 2.0, not gated)

Five-Minute Quickstart

git clone https://github.com/convox-examples/inference-examples.git
cd inference-examples/flux-schnell

convox apps create flux
# Optional: a token avoids HuggingFace download rate limits during the build
convox env set HUGGING_FACE_HUB_TOKEN=hf_your_token_here -a flux
convox deploy -a flux

The first deploy builds the Docker image and downloads FLUX.1 Schnell weights (~12GB). Subsequent deploys use cached layers.

Get the Endpoint

convox services -a flux

SERVICE  DOMAIN                                  PORTS
api      api.flux.org-abc123.convox.cloud         443:8188
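
Because a cold start takes around 120s, the endpoint may not answer immediately after a deploy. The following is a minimal sketch of a readiness wait, assuming `/system_stats` returns HTTP 200 once ComfyUI is up. `probe_status` and `wait_until_ready` are hypothetical helper names, not part of Convox or ComfyUI.

```shell
# probe_status HOST: print the HTTP status code of GET /system_stats.
# Hypothetical helper built on curl; curl prints 000 if the connection fails.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "https://$1/system_stats"
}

# wait_until_ready HOST [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the probe reports 200, or 1 after MAX_ATTEMPTS tries.
# The defaults (36 tries, 5s apart) cover about 180s, above the ~120s cold start.
wait_until_ready() {
  local host="$1" attempts="${2:-36}" delay="${3:-5}" i=0 code
  while [ "$i" -lt "$attempts" ]; do
    code=$(probe_status "$host") || code=000
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Usage: `wait_until_ready api.flux.org-abc123.convox.cloud && echo "ComfyUI is up"`.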

Test It

Queue a generation (API format):

ENDPOINT=$(convox services -a flux | awk '$1 == "api" {print $2}')

# Queue a FLUX generation (4 steps — fast!)
PROMPT_ID=$(jq '{prompt: .}' workflow-api.json | \
  curl -s "https://$ENDPOINT/prompt" \
  -H "Content-Type: application/json" \
  -d @- | jq -r '.prompt_id')

echo "Queued: $PROMPT_ID"

# Wait for completion (~5-10s for FLUX Schnell at 1024x1024), then check the status
sleep 10
curl -s "https://$ENDPOINT/history/$PROMPT_ID" | jq --arg id "$PROMPT_ID" '.[$id].status'
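
The fixed `sleep 10` above can return too early when the model is still loading on the first prompt. Below is a sketch of a real polling loop. It assumes `/history/<prompt_id>` returns an empty object (`{}`) while the prompt is queued or running, and an object keyed by the prompt ID once it has finished. `fetch_history` and `wait_for_prompt` are hypothetical helper names.

```shell
# fetch_history PROMPT_ID: print the raw history JSON for one prompt.
# Hypothetical helper; relies on ENDPOINT being set as in the quickstart.
fetch_history() {
  curl -s "https://$ENDPOINT/history/$1"
}

# wait_for_prompt PROMPT_ID [MAX_ATTEMPTS] [DELAY_SECONDS]
# Polls until the response contains the prompt ID as a quoted key, then
# prints the history JSON and returns 0. Returns 1 if attempts run out.
wait_for_prompt() {
  local id="$1" attempts="${2:-30}" delay="${3:-2}" i=0 body
  while [ "$i" -lt "$attempts" ]; do
    body=$(fetch_history "$id") || body=""
    case $body in
      *"\"$id\""*)
        printf '%s\n' "$body"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

Usage: `wait_for_prompt "$PROMPT_ID" | jq --arg id "$PROMPT_ID" '.[$id].status'`.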

Custom prompt:

# Set the prompt text on node "6" and submit; --arg keeps quotes in the prompt from breaking the filter
jq --arg text "a cyberpunk cityscape at night, neon lights, rain" \
  '.["6"].inputs.text = $text | {prompt: .}' workflow-api.json | \
  curl -s "https://$ENDPOINT/prompt" \
    -H "Content-Type: application/json" \
    -d @- | jq .

Browse the UI:

Open https://<endpoint>/ in your browser to access the full ComfyUI graph editor.

API Endpoints

Endpoint	Method	Purpose
`/prompt`	POST	Queue a workflow for execution
`/history/{prompt_id}`	GET	Check generation status + get outputs
`/view`	GET	Retrieve generated images
`/system_stats`	GET	System info and GPU memory (VRAM total/free)
`/`	GET	ComfyUI web interface
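
As a sketch of how `/history` and `/view` fit together: the image entries in a finished prompt's history output are assumed to carry `filename`, `subfolder`, and `type` fields, which `/view` accepts as query parameters. `view_url` and `download_image` are hypothetical helper names, and the example filename in the usage line is illustrative only.

```shell
# view_url HOST FILENAME [SUBFOLDER] [TYPE]
# Build the /view URL for one generated image. Assumes the values are
# URL-safe (no spaces or special characters), so nothing is percent-encoded.
view_url() {
  printf 'https://%s/view?filename=%s&subfolder=%s&type=%s\n' \
    "$1" "$2" "${3:-}" "${4:-output}"
}

# download_image HOST FILENAME DEST [SUBFOLDER] [TYPE]
# Save the image to DEST; --fail makes curl return non-zero on HTTP errors.
download_image() {
  curl -s --fail -o "$3" "$(view_url "$1" "$2" "${4:-}" "${5:-output}")"
}
```

Usage: `download_image "$ENDPOINT" ComfyUI_00001_.png out.png`.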

FLUX.1 Schnell vs SDXL

	FLUX.1 Schnell	SDXL 1.0
Steps	4	25-30
Time per image (A10G)	~5s	~30s
Parameters	12B	3.5B
VRAM	~20GB	~12GB
Quality	High	High
License	Apache 2.0	CreativeML Open RAIL++-M

FLUX Schnell's distillation means 4-step inference produces quality comparable to SDXL at 25 steps.

Workflow Format

ComfyUI uses a node-graph JSON format. Export from the ComfyUI desktop app using "Save (API Format)" — this is different from the standard save format. An example workflow-api.json is included in this directory.
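
To catch a wrong export before submitting it, a rough heuristic can tell the two formats apart. This sketch assumes the API format is a flat object whose nodes each carry a `class_type` key, while the standard save format has top-level `nodes` and `links` arrays. `workflow_format` is a hypothetical helper, and the check is a text match rather than a full JSON parse.

```shell
# workflow_format FILE: print "ui", "api", or "unknown".
# Heuristic only: looks for characteristic keys instead of parsing the JSON.
workflow_format() {
  if grep -q '"nodes"[[:space:]]*:' "$1" && grep -q '"links"[[:space:]]*:' "$1"; then
    echo ui
  elif grep -q '"class_type"[[:space:]]*:' "$1"; then
    echo api
  else
    echo unknown
  fi
}
```

Usage: `[ "$(workflow_format workflow-api.json)" = "api" ] || echo "re-export with Save (API Format)"`.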

The included workflow uses:

UNETLoader for FLUX Schnell weights (FP8 quantized)
DualCLIPLoader for CLIP-L + T5-XXL text encoders
VAELoader for the FLUX autoencoder
4-step Euler sampling with cfg=1.0, which disables classifier-free guidance (the distilled FLUX Schnell does not need it)

AWS Instance Sizing

Instance	GPU	VRAM	Use Case
`g5.xlarge`	1x A10G	24 GB	Default — FLUX needs ~20GB VRAM
`g6.xlarge`	1x L4	24 GB	Alternative

FLUX.1 Schnell requires ~20GB VRAM for the 12B parameter model. A T4 (16GB) is not sufficient.
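
A back-of-the-envelope check on the weights alone helps explain the sizing. This sketch counts only the 12B model parameters. It ignores the text encoders, the VAE, and activations, which account for the rest of the ~20GB figure. `weights_gb` is a hypothetical helper.

```shell
# weights_gb PARAMS_BILLIONS BITS_PER_PARAM
# Approximate weight size in GB (decimal): params * bits / 8.
weights_gb() {
  echo $(( $1 * $2 / 8 ))
}

weights_gb 12 8    # FP8:  prints 12
weights_gb 12 16   # FP16: prints 24
```

At FP16 the weights alone would fill a 24GB card, which is one reason the included workflow loads FP8-quantized weights.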

Scaling

The default config keeps one replica warm and scales up to 2 replicas. FLUX generates an image in ~5s, so a single replica handles moderate throughput.
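
To put a number on "moderate throughput", here is a rough estimate. It assumes each replica works through its queue one prompt at a time and ignores model loading, queue wait, and image transfer. `images_per_hour` is a hypothetical helper.

```shell
# images_per_hour SECONDS_PER_IMAGE [REPLICAS]
# Upper bound on sustained throughput with strictly sequential generation.
images_per_hour() {
  echo $(( 3600 * ${2:-1} / $1 ))
}

images_per_hour 5      # one replica:  prints 720
images_per_hour 5 2    # two replicas: prints 1440
```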

Budget Controls

convox budget set flux --monthly-cap-usd 100 --at-cap-action alert-only
convox cost -a flux

GPU Observability

Enable GPU telemetry in Rack Settings to surface per-app GPU utilization, memory, temperature, and inference throughput in the Console GPU Dashboard.

Troubleshooting

Symptom	Cause	Fix
`429` during build	HuggingFace download rate limit	Set `HUGGING_FACE_HUB_TOKEN` to avoid throttling (model is public)
OOM during generation	FLUX at 1024x1024 uses ~20GB	Use `g5.xlarge` (24GB); T4 is too small
Workflow JSON rejected	Exported in UI format, not API format	Re-export using Save (API Format) in ComfyUI
Slow first generation	Model loading into GPU on first prompt	Normal; subsequent generations are fast (~5s)
