Tarik Can Ozden*, Ozgur Kara*, Oguzhan Akcin, Kerem Zaman, Shashank Srivastava, Sandeep P. Chinchali, James M. Rehg
* Equal Contribution
TL;DR: DiffVax is a scalable, lightweight, and optimization-free image immunization framework designed to protect images and videos from diffusion-based editing.
Click for the full abstract
Current image immunization defense techniques against diffusion-based editing embed imperceptible noise into target images to disrupt editing models. However, these methods face scalability challenges, as they require time-consuming optimization for each image separately, taking hours for small batches. To address these challenges, we introduce DiffVax, a scalable, lightweight, and optimization-free framework for image immunization, specifically designed to prevent diffusion-based editing. Our approach enables effective generalization to unseen content, reducing computational costs and cutting immunization time from days to milliseconds, achieving a speedup of 250,000×. This is achieved through a loss term that ensures the failure of editing attempts and the imperceptibility of the perturbations. Extensive qualitative and quantitative results demonstrate that our model is scalable, optimization-free, adaptable to various diffusion-based editing tools, robust against counter-attacks, and, for the first time, effectively protects video content from editing.
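In rough terms, the objective trades off those two goals (the symbol names below are illustrative rather than the paper's notation; α corresponds to the `alpha` weight in `configs/train.yml`):

$$
\mathcal{L} \;=\; \mathcal{L}_{\text{edit-fail}} \;+\; \alpha\,\mathcal{L}_{\text{imperceptibility}}
$$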
```bash
# Clone the repository
git clone https://github.com/ozdentarikcan/DiffVax.git
cd DiffVax

# Create conda environment (recommended)
conda create -n diffvax python=3.12 -y
conda activate diffvax

# Install dependencies
pip install -r requirements.txt
```

Or install as a package:

```bash
pip install -e .
```

The DiffVax dataset is hosted on Hugging Face: ozdentarikcan/DiffVaxDataset

Download it with the provided script:

```bash
python scripts/download_dataset.py
```

This places the dataset in data/ with the following structure:
```text
data/
├── train/
│   ├── images/          # Training images (512x512 PNG)
│   ├── masks/           # Corresponding masks
│   └── metadata.jsonl   # Image-prompt pairs
└── validation/
    ├── images/
    ├── masks/
    └── metadata.jsonl
```
The training and demo scripts will also auto-download the dataset on first run if it's not present.
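For example, the image-prompt pairs can be read directly from the JSONL metadata (a minimal sketch; the field names inside each record are not documented here, so inspect the file for the actual schema):

```python
import json

# One JSON record per line, pairing an image/mask with an editing prompt.
with open("data/train/metadata.jsonl") as f:
    records = [json.loads(line) for line in f]

print(len(records), "training records")
print(records[0])  # inspect the schema of a single record
```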
```text
DiffVax/
├── app.py                            # Gradio web demo
├── src/diffvax/                      # Main package
│   ├── attack.py                     # Stable Diffusion inpainting wrapper
│   ├── model.py                      # Nested U-Net (UNet++) architecture
│   ├── utils.py                      # Image I/O, data loading, seeding
│   ├── immunization/
│   │   ├── diffvax_immunization.py         # DiffVax immunization (trained NestedUNet)
│   │   ├── photoguard_immunization.py      # PhotoGuard baseline (PGD encoder attack)
│   │   └── diffusionguard_immunization.py  # DiffusionGuard baseline (PGD noise maximization)
│   └── metrics/                      # Image quality metrics
│       ├── base.py, factory.py
│       └── psnr.py, ssim.py, fsim.py, clip_score.py
│
├── scripts/
│   ├── train.py                      # Train the immunization model
│   ├── demo.py                       # End-to-end demo with comparison output
│   ├── evaluate.py                   # Calculate image quality metrics
│   ├── compare_baselines.py          # Multi-image baseline comparison figure
│   └── download_dataset.py           # Download dataset from Hugging Face
│
├── notebooks/
│   ├── diffvax_demo.ipynb            # Interactive demo notebook
│   └── diffvax_comparison.ipynb      # Multi-method comparison notebook
│
├── configs/
│   └── train.yml                     # Training hyperparameters
│
├── checkpoints/
│   └── diffvax_trained.pth           # Pre-trained model weights
│
├── requirements.txt
├── pyproject.toml
└── README.md
```
Launch the interactive web interface:
```bash
python app.py
```

This starts a Gradio app at http://localhost:7860 where you can upload an image and mask, enter an editing prompt, and see how DiffVax protects the image.
The demo script downloads the dataset (if needed), loads the pre-trained checkpoint, immunizes a validation image, runs the same edit on both original and immunized versions, and saves a side-by-side comparison.
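The editing step itself is a standard Stable Diffusion inpainting call; roughly, it looks like this (a sketch of the diffusers API, not the exact wrapper in src/diffvax/attack.py):

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("photo.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))

# The demo runs this once on the original and once on the immunized image;
# on the immunized image the edit should come out visibly disrupted.
edited = pipe(prompt="a watercolor painting", image=image, mask_image=mask).images[0]
edited.save("edited.png")
```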
```bash
python scripts/demo.py
```

Options:

```bash
# Use a different validation image and prompt
python scripts/demo.py --image-index 2 --edit-prompt "a watercolor painting"

# Run on a specific GPU
CUDA_VISIBLE_DEVICES=4 python scripts/demo.py

# Headless server (no display)
python scripts/demo.py --no-display --save-dir outputs/my_demo
```

If the dataset is not found locally, the demo generates a synthetic sample image automatically. Output files are saved to outputs/demo/ by default:
- `<name>_original.png` — the original image
- `<name>_immunized.png` — the DiffVax-protected image
- `<name>_edited_original.png` — inpainting edit on the original
- `<name>_edited_immunized.png` — inpainting edit on the immunized image (edit should be disrupted)
- `<name>_comparison.png` — 2x2 comparison grid
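If you want to rebuild the comparison grid yourself from the four saved images, something like the following works (the `sample` prefix below is hypothetical; substitute the actual `<name>` of your run):

```python
import matplotlib.pyplot as plt
from PIL import Image

names = ["original", "immunized", "edited_original", "edited_immunized"]
titles = ["Original", "Immunized", "Edited (original)", "Edited (immunized)"]

fig, axes = plt.subplots(2, 2, figsize=(8, 8))
for ax, name, title in zip(axes.flat, names, titles):
    ax.imshow(Image.open(f"outputs/demo/sample_{name}.png"))  # 'sample' is a placeholder
    ax.set_title(title)
    ax.axis("off")
fig.savefig("comparison_grid.png", bbox_inches="tight")
```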
You can also provide your own image and mask directly:

```bash
python scripts/demo.py --image photo.png --mask mask.png --edit-prompt "a cat sitting"
```

DiffVax includes implementations of two prior-work baselines for side-by-side comparison:

- PhotoGuard (Salman et al., ICML 2023) — a PGD encoder attack that forces `VAE.encode(x+δ)` toward a target latent (1000 L∞-PGD steps, ε=0.06), plus a diffusion attack that backpropagates through the full inpainting pipeline (200 L2-PGD steps, ε=16, 10 gradient-averaging reps).
- DiffusionGuard (Li et al., ICLR 2025) — a PGD optimization that maximizes the UNet's noise-prediction norm at the highest timestep, with contour-based mask augmentation for robustness (800 L∞-PGD steps, ε=16/255).
Unlike these methods, DiffVax uses a trained NestedUNet for single-pass inference (no per-image optimization).
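For intuition, here is a minimal sketch of such a PGD encoder attack (illustrative step count and hyperparameters; the actual baseline lives in src/diffvax/immunization/photoguard_immunization.py):

```python
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL

# VAE of the inpainting model being defended against.
vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-inpainting", subfolder="vae"
).to("cuda").eval()
vae.requires_grad_(False)

def pgd_encoder_attack(x, target_latent, eps=0.06, step_size=0.01, n_steps=100):
    """L-inf PGD pushing VAE.encode(x + delta) toward a target latent.

    x: (B, 3, H, W) image tensor in [-1, 1] (the SD VAE convention).
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        latent = vae.encode(x + delta).latent_dist.mean
        loss = F.mse_loss(latent, target_latent)
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()  # descend toward the target
            delta.clamp_(-eps, eps)                 # project onto the L-inf ball
            delta.grad.zero_()
    return (x + delta).clamp(-1, 1)
```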
Generate a multi-image comparison figure:

```bash
python scripts/compare_baselines.py
python scripts/compare_baselines.py --images 1 5 9 29 33
```

Run the demo with all methods compared:

```bash
python scripts/demo.py --run-baselines --no-display
```

This produces the standard DiffVax 2x2 grid plus a 1x5 baseline comparison figure:

Original | Edited (No Defense) | Edited (PhotoGuard) | Edited (DiffusionGuard) | Edited (DiffVax)

You can tune PGD steps:

```bash
python scripts/demo.py --run-baselines --pg-steps 200 --dg-steps 100
```

Before training, download the DiffVax dataset:

```bash
python scripts/download_dataset.py
```

This places the dataset in data/ with the structure shown in the Dataset section above. Alternatively, the training script will automatically download the dataset from Hugging Face Hub on first run if it's not present.
```bash
python scripts/train.py \
    --config configs/train.yml \
    --data-dir data \
    --output-dir outputs
```

Training hyperparameters are set in configs/train.yml:
| Parameter | Default | Description |
|---|---|---|
| `iter_num` | 1000000 | Number of training iterations |
| `learning_rate` | 0.00001 | Adam optimizer learning rate |
| `batch_size` | 5 | Training batch size |
| `alpha` | 4 | Weight for the perturbation imperceptibility loss |
| `attack_model_link` | `runwayml/stable-diffusion-inpainting` | Diffusion model to defend against |
| `train_all` | `true` | Train on all images (`false` = use `image_index_list`) |
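Because the config is plain YAML, it can also be inspected or tweaked programmatically before a run (a sketch; requires PyYAML, and `train_small.yml` is a hypothetical filename):

```python
import yaml

with open("configs/train.yml") as f:
    cfg = yaml.safe_load(f)

cfg["batch_size"] = 2       # e.g., to fit a smaller GPU
cfg["iter_num"] = 100_000   # shorter training run

with open("configs/train_small.yml", "w") as f:
    yaml.safe_dump(cfg, f)
```

Then pass it with `python scripts/train.py --config configs/train_small.yml`.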
NestedUNet (UNet++) — a hierarchical encoder-decoder with dense nested skip connections:
- Input: 3-channel RGB image (512 x 512)
- Encoder: 5 levels with filter depths [32, 64, 128, 256, 512]
- Decoder: dense skip connections at every level (not just symmetric pairs)
- Output: 3-channel perturbation map, applied additively to the input image
- Parameters: ~9.2M
The perturbation is applied additively only outside the edit region (masked area), and the resulting image is clamped to the valid pixel range.
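A minimal sketch of that step (the import path assumes `pip install -e .`; the NestedUNet constructor arguments and the mask convention of 1 inside the edit region are assumptions, so see scripts/demo.py for the canonical loading code):

```python
import torch
from diffvax.model import NestedUNet  # import path assumed from src/diffvax/model.py

model = NestedUNet()  # constructor arguments assumed to default sensibly
model.load_state_dict(torch.load("checkpoints/diffvax_trained.pth", map_location="cpu"))
model.eval()

def immunize(image, mask):
    """image: (B, 3, 512, 512) in [0, 1]; mask: (B, 1, 512, 512), 1 = edit region."""
    with torch.no_grad():
        perturbation = model(image)            # 3-channel perturbation map
    # Apply additively only outside the edit region, then clamp to valid range.
    immunized = image + perturbation * (1.0 - mask)
    return immunized.clamp(0.0, 1.0)
```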
- Python 3.9+
- PyTorch 2.0+
- CUDA-capable GPU with 8 GB+ VRAM (16 GB+ recommended for training)
- ~5 GB disk space for the Stable Diffusion Inpainting model (downloaded automatically)
| Notebook | Description |
|---|---|
| `notebooks/diffvax_demo.ipynb` | Interactive demo: immunize an image and compare edits in a 1x3 grid |
| `notebooks/diffvax_comparison.ipynb` | Compare DiffVax against PhotoGuard and DiffusionGuard side-by-side |
```bibtex
@inproceedings{ozden2026diffvax,
  title={DiffVax: Optimization-Free Image Immunization Against Diffusion-Based Editing},
  author={Ozden, Tarik Can and Kara, Ozgur and Akcin, Oguzhan and Zaman, Kerem and Srivastava, Shashank and Chinchali, Sandeep P and Rehg, James M},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
}
```
MIT License. See LICENSE for details.
