Add SeFi-Image pipeline

### Model/Pipeline/Scheduler description

We are proposing the integration of SeFi-Image into Diffusers.

SeFi-Image is a text-to-image model family built with Semantic-First Diffusion. It separates image generation into semantic and texture latent streams, denoising the semantic structure slightly ahead of texture details so the texture stream receives a cleaner structural anchor during generation.
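As a rough illustration of the "semantic slightly ahead of texture" idea, the toy sketch below builds a pair of noise-level schedules in which the semantic stream is always a fixed offset cleaner than the texture stream. Everything here is an assumption made for illustration: the linear noise grid, the `lead` parameter, and the function name `semantic_first_schedule` are hypothetical and do not come from the SeFi-Image code or the Diffusers PR.

```python
from typing import List, Tuple


def semantic_first_schedule(num_steps: int, lead: float = 0.1) -> List[Tuple[float, float]]:
    """Return (semantic_noise, texture_noise) pairs for each step boundary.

    Noise levels run from 1.0 (pure noise) down to 0.0 (clean).
    `lead` is a hypothetical offset: how much cleaner the semantic
    stream is than the texture stream at the same step.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if not 0.0 <= lead <= 1.0:
        raise ValueError("lead must be in [0, 1]")

    pairs = []
    for i in range(num_steps + 1):
        # Texture follows a plain linear grid (an assumption, not the real schedule).
        texture = 1.0 - i / num_steps
        # Semantic is ahead by `lead`, clamped so it never goes below fully clean.
        semantic = max(0.0, texture - lead)
        pairs.append((semantic, texture))
    return pairs


if __name__ == "__main__":
    for semantic, texture in semantic_first_schedule(num_steps=4, lead=0.25):
        print(f"semantic={semantic:.2f}  texture={texture:.2f}")
```

In this toy version the semantic stream reaches the clean state one step before the texture stream, which is the sense in which texture denoising sees a cleaner structural anchor. The actual schedule used by the model may differ.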

Key characteristics:

- Semantic-first generation: separates semantic and texture latents and advances the semantic denoising stream to improve structural consistency.
- Generation-reconstruction trade-off: combines a compact semantic latent for easier generation with a high-fidelity texture latent for reconstruction detail.
- Multiple model scales and variants: 1B/2B/5B Base checkpoints, 5B RL checkpoint, and 1B/2B/5B Turbo checkpoints.
- Few-step Turbo inference: Turbo checkpoints default to 4 denoising steps with guidance scale 1.0.
- Standard text-to-image usage: prompt-to-image generation at 1024x1024 by default.
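The defaults listed above could be wired into a call roughly as follows. This is a sketch under stated assumptions, not the final API: `SeFiPipeline` only exists in the proposed PR, the call arguments (`prompt`, `height`, `width`, `num_inference_steps`, `guidance_scale`) are assumed to follow the usual Diffusers text-to-image convention, and the `bfloat16` dtype and the helper names `build_call_kwargs` and `generate` are hypothetical. Base and RL defaults are not stated in this issue, so the helper leaves them to the pipeline.

```python
from typing import Any, Dict

# Turbo defaults as stated in this issue: 4 denoising steps, guidance scale 1.0.
TURBO_DEFAULTS: Dict[str, Any] = {"num_inference_steps": 4, "guidance_scale": 1.0}


def build_call_kwargs(repo_id: str, prompt: str, height: int = 1024, width: int = 1024) -> Dict[str, Any]:
    """Build pipeline call arguments (hypothetical helper).

    Turbo checkpoints get the few-step defaults; for other checkpoints
    the step count and guidance scale are left to the pipeline defaults.
    """
    kwargs: Dict[str, Any] = {"prompt": prompt, "height": height, "width": width}
    if repo_id.lower().endswith("-turbo"):
        kwargs.update(TURBO_DEFAULTS)
    return kwargs


def generate(repo_id: str, prompt: str):
    """Run text-to-image generation; requires a Diffusers build that includes the PR."""
    import torch
    from diffusers import SeFiPipeline  # proposed class, not in a released Diffusers version

    pipe = SeFiPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)  # dtype is an assumption
    pipe.enable_model_cpu_offload()  # standard Diffusers offload helper
    return pipe(**build_call_kwargs(repo_id, prompt)).images[0]


if __name__ == "__main__":
    print(build_call_kwargs("SeFi-Image/SeFi-Image-1B-turbo", "a lighthouse at dusk"))
```

Calling `generate("SeFi-Image/SeFi-Image-1B-turbo", "a lighthouse at dusk")` would then be expected to return a PIL image, assuming the pipeline returns the usual `.images` list.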

The proposed Diffusers integration adds:

- `SeFiTransformer2DModel`
- `SeFiPipeline`
- an original-checkpoint conversion script
- API documentation
- fast model and pipeline tests

The current implementation was validated against the released `SeFi-Image/SeFi-Image-1B-turbo` checkpoint. Validation covered checkpoint conversion, 1024x1024 inference parity against a reference output, and a CPU offload smoke test.
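For reviewers who want to reproduce a parity check, one simple way to compare a generated image against a reference is to look at the maximum per-pixel difference and the PSNR. The sketch below is an illustrative assumption about how such a comparison could be done on flattened 8-bit pixel values; it is not the validation procedure used in the PR, and the function names are hypothetical.

```python
import math
from typing import Sequence


def max_abs_diff(a: Sequence[int], b: Sequence[int]) -> int:
    """Largest absolute per-value difference between two flattened images."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def psnr(a: Sequence[int], b: Sequence[int], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    reference = [0, 10, 255, 128]
    candidate = [0, 12, 250, 128]
    print("max abs diff:", max_abs_diff(reference, candidate))
    print("PSNR (dB):", round(psnr(reference, candidate), 2))
```

With PIL images, the flattened inputs could be obtained with `img.convert("RGB").tobytes()`. Any pass/fail threshold would need to be chosen by the maintainers.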

### Open source status

- [x] The model implementation is available.
- [x] The model weights are available (Only relevant if addition is not a scheduler).

### Provide useful links for the implementation

- GitHub repository: https://github.com/jmliu206/SeFi-Image
- Project page: https://jmliu206.github.io/sefi-web/
- Technical report: https://arxiv.org/pdf/2606.22568
- Hugging Face organization / model collection: https://huggingface.co/SeFi-Image
- Example checkpoints:
  - https://huggingface.co/SeFi-Image/SeFi-Image-1B-Base
  - https://huggingface.co/SeFi-Image/SeFi-Image-2B-Base
  - https://huggingface.co/SeFi-Image/SeFi-Image-5B-Base
  - https://huggingface.co/SeFi-Image/SeFi-Image-5B-RL
  - https://huggingface.co/SeFi-Image/SeFi-Image-1B-turbo
  - https://huggingface.co/SeFi-Image/SeFi-Image-2B-turbo
  - https://huggingface.co/SeFi-Image/SeFi-Image-5B-turbo
- Diffusers implementation PR: https://github.com/huggingface/diffusers/pull/14084

### Additional context

This is an official integration proposal from the SeFi-Image authors/maintainers. The draft PR is already open with the implementation and validation results, and can be adjusted based on maintainer feedback on the desired scope or API shape.

