Mixture of Style Experts for Diverse Image Stylization
We recommend using Python 3.10 and PyTorch with CUDA support. To set up the environment:
```bash
# Create a new conda environment
conda create -n styleexpert python=3.10
conda activate styleexpert

# Install requirements
pip install -r requirements.txt
```
StyleExpert uses a Mixture of Experts (MoE) architecture. For the best results on complex semantic styles (like specific brushstrokes or materials), ensure your style reference image clearly showcases those textures. The model uses a pre-trained Style Representation Encoder to guide the router.
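Conceptually, the encoder-guided routing could look like the sketch below: the router scores each LoRA expert by the similarity between the style embedding and a learned per-expert prototype, then normalizes the scores into gate weights. All names, shapes, the cosine-similarity choice, and the temperature are illustrative assumptions, not the released implementation.

```python
# Illustrative sketch of encoder-guided expert routing (NOT the released code).
import numpy as np

def route(style_emb, expert_protos, temperature=0.1):
    """style_emb: (d,) embedding from the style encoder.
    expert_protos: (num_experts, d) learned per-expert prototypes (assumed).
    Returns (num_experts,) gate weights that sum to 1."""
    s = style_emb / np.linalg.norm(style_emb)
    p = expert_protos / np.linalg.norm(expert_protos, axis=1, keepdims=True)
    logits = (p @ s) / temperature   # cosine similarity per expert
    logits -= logits.max()           # numerical stability for softmax
    w = np.exp(logits)
    return w / w.sum()
```

The stylized output would then be a gate-weighted combination of the experts' contributions.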
Run the app:

```bash
python app.py
```
You can download the base model FLUX.1-Kontext-dev and our StyleExpert adapters directly from Hugging Face:
- Base Model: FLUX.1-Kontext-dev
- StyleExpert LoRA Experts: Hugging Face Link
Alternatively, use the provided script:
```bash
bash download_models.sh --token YOUR_HF_TOKEN
```
This downloads the following repositories into the default local paths used by inference:

- `HH-LG/StyleExpert` → `./weights/`
- `black-forest-labs/FLUX.1-Kontext-dev` → `./models/FLUX.1-Kontext-dev/`
- `google/siglip-so400m-patch14-384` → `./models/siglip-so400m-patch14-384/`
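The download step is roughly equivalent to the sketch below. The use of `huggingface_hub.snapshot_download` is an assumption about how the script works; the actual script may use `huggingface-cli` or git-lfs instead.

```python
# Sketch (assumed, not the actual download_models.sh): mirror each
# Hugging Face repo to its default local path.
REPO_PATHS = {
    "HH-LG/StyleExpert": "./weights/",
    "black-forest-labs/FLUX.1-Kontext-dev": "./models/FLUX.1-Kontext-dev/",
    "google/siglip-so400m-patch14-384": "./models/siglip-so400m-patch14-384/",
}

def download_all(token: str) -> None:
    # Requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download
    for repo_id, local_dir in REPO_PATHS.items():
        snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)

if __name__ == "__main__":
    import sys
    download_all(sys.argv[1])  # pass your HF token as the first argument
```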
We provide the StyleExpert-40K dataset, containing 40,000 high-quality content-style-stylized triplets. This dataset is specifically curated to balance color-centric and semantic-centric styles.
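The triplets could be enumerated with a sketch like the following. The directory layout (`content/`, `style/`, `stylized/` subfolders with shared filenames) is an assumption for illustration, not the documented StyleExpert-40K format.

```python
# Hypothetical sketch for iterating content-style-stylized triplets.
# The assumed layout: <root>/content/x.jpg, <root>/style/x.jpg, <root>/stylized/x.jpg
from pathlib import Path

def list_triplets(root):
    """Return (content, style, stylized) path triples that exist in all three dirs."""
    root = Path(root)
    triplets = []
    for content in sorted((root / "content").glob("*.jpg")):
        style = root / "style" / content.name
        stylized = root / "stylized" / content.name
        if style.exists() and stylized.exists():
            triplets.append((content, style, stylized))
    return triplets
```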
```bash
python download_dataset.py --token YOUR_HF_TOKEN

# Only fetch the metadata first
python download_dataset.py --metadata-only --token YOUR_HF_TOKEN

# Or use the shell script
bash download_dataset.sh --token YOUR_HF_TOKEN
```

To run inference on your own content-style pair:

```bash
python infer.py --content_path ./data/content.jpg --style_path ./data/style.jpg
```
You can directly run inference with the example pairs in assets/examples/:
```bash
# Use example_00 pair
./run.sh ./assets/examples/content_00.png ./assets/examples/style_00.png ./outputs/example_00_out.png

# Use example_01 pair
./run.sh ./assets/examples/content_01.png ./assets/examples/style_01.png ./outputs/example_01_out.png
```

StyleExpert uses a two-stage training approach:
- Style Representation Encoder: Trained with InfoNCE loss to learn discriminative style features.
- MoE Fine-tuning: Uses a similarity-aware gating mechanism to route styles to specialized LoRA experts.
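As a rough sketch of the stage-one objective: InfoNCE pulls embeddings of the same style together and pushes the rest of the batch apart. The positive-pairing scheme (two views of the same style image) and the temperature are assumptions, not the paper's exact setup.

```python
# Illustrative InfoNCE over style embeddings (assumed setup, not the paper's).
# anchors[i] and positives[i] are two views of the same style; other rows
# in the batch act as negatives.
import numpy as np

def info_nce(anchors, positives, temperature=0.07):
    """anchors, positives: (N, d) L2-normalized embeddings.
    Returns the mean cross-entropy of matching anchor i to positive i."""
    logits = anchors @ positives.T / temperature     # (N, N) similarity logits
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # correct pairs on the diagonal
```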
If StyleExpert helps your research, please star the repo and cite our work:
```bibtex
@misc{zhu2026mixturestyleexpertsdiverse,
  title={Mixture of Style Experts for Diverse Image Stylization},
  author={Shihao Zhu and Ziheng Ouyang and Yijia Kang and Qilong Wang and Mi Zhou and Bo Li and Ming-Ming Cheng and Qibin Hou},
  year={2026},
  eprint={2603.16649},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2603.16649},
}
```

For questions, please open an issue or contact Shihao Zhu.
Licensed under CC BY-NC 4.0 for non-commercial use.

