A curated list of awesome remote sensing visual generative models, papers, datasets, and resources. This repository focuses exclusively on satellite images and excludes aerial images, street-view images, and their applications.
- Survey Papers
- Scene Synthesis (Unconditional/Text-to-Image Generation)
- Controllable & Structural Synthesis
- Image-to-Image Translation & Restoration
- Inpainting & Content Editing
- Multi-temporal & Sequence Generation
- Generative Modeling for Discriminative Tasks
- Datasets
- Evaluation Metrics
- Contributing
- License
Important
The following are related survey papers in remote sensing. However, visual generative models remain under-explored in all existing remote sensing surveys. This repository aims to fill this gap by providing a comprehensive collection of visual generative models specifically for satellite images. We focus exclusively on satellite imagery and exclude aerial images, street-view images, and their applications.
Warning
We do not fully track the latest code releases, datasets, and venue updates, so some entries may be missing or outdated. Please let us know if anything is incorrect.
| Title | Year | Venue | Tags | Link |
|---|---|---|---|---|
| Foundation Models in Remote Sensing: Evolving from Unimodality to Multimodality | 2026 | IEEE MGRS | Foundation Models, Multimodal | DOI |
| From Orbit to Ground: A Comprehensive Review of Multimodal Self-Supervised Learning for Remote Sensing | 2025 | IEEE MGRS | Multimodal, SSL | DOI |
| Remote Sensing Spatiotemporal Vision-Language Models: A Comprehensive Survey | 2025 | IEEE MGRS | VLMs, (Spatio-)Temporal-VLMs | DOI |
| Vision Foundation Models in Remote Sensing: A Survey | 2025 | IEEE MGRS | Vision Foundation Models, SSL | DOI |
| Artificial Intelligence to Advance Earth Observation: A Review of Models, Recent Trends, and Pathways Forward | 2025 | IEEE MGRS | Representation Learning | DOI |
| Vision-Language Modeling Meets Remote Sensing: Models, Datasets, and Perspectives | 2025 | IEEE MGRS | VLMs | DOI |
| Foundation Models for Remote Sensing and Earth Observation: A Survey | 2025 | IEEE MGRS | Vision Foundation Models, VLMs | DOI |
| Regression in Earth Observation: Are Vision-Language Models up to the Challenge? | 2025 | IEEE MGRS | VLMs | DOI |
| Deep Learning Based Domain Adaptation Methods in Remote Sensing: A Comprehensive Survey | 2025 | arXiv | Domain Adaptation | DOI |
| Vision-Language Models in Remote Sensing: Current Progress and Future Trends | 2024 | IEEE MGRS | VLMs | DOI |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| HuiYanEarth-SAR: A Foundation Model for High-Fidelity and Low-Cost Global Remote Sensing Imagery Generation | 2026 | arXiv | arXiv | GitHub |
| Uni-RS: A Spatially Faithful Unified Understanding and Generation Model for Remote Sensing | 2026 | arXiv | arXiv | N/A |
| HSIGene: A Foundation Model for Hyperspectral Image Generation | 2026 | IEEE TPAMI | DOI | GitHub |
| Extrapolate Azimuth Angles: Text and Edge Guided ISAR Image Generation Based on Foundation Model | 2026 | ISPRS JPRS | DOI | N/A |
| Text2Earth: Unlocking Text-Driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model | 2025 | IEEE MGRS | DOI | GitHub |
| ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation | 2025 | CVPR | CVPR | GitHub |
| MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation | 2025 | IEEE TPAMI | DOI | GitHub |
| RSVQ-Diffusion Model for Text-to-Remote-Sensing Image Generation | 2025 | Applied Sciences | DOI | N/A |
| DiffusionSat: A Generative Foundation Model for Satellite Imagery | 2024 | ICLR | ICLR | GitHub |
| RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model | 2024 | Neural Computing and Applications | DOI | N/A |
| Txt2Img-MHN: Remote Sensing Image Generation From Text Using Modern Hopfield Networks | 2023 | IEEE TIP | DOI | GitHub |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| Object Fidelity Diffusion for Remote Sensing Image Generation | 2026 | ICLR | ICLR | GitHub |
| Transferable Image Synthesis for Remote Sensing Semantic Segmentation via Joint Reference-Semantic Fusion | 2026 | Information Fusion | DOI | N/A |
| VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics | 2026 | WACV | arXiv | GitHub |
| From Geometric Mimicry to Comprehensive Generation: A Context-Informed Multimodal Diffusion Model for Urban Morphology Synthesis | 2026 | Int. J. Geogr. Inf. Sci. | DOI | N/A |
| Any2RSI: Controllable Remote Sensing Text-to-Image Generation via Any Control and Enriched Description | 2026 | AAAI | DOI | GitHub |
| EarthSynth: Generating Informative Earth Observation with Diffusion Models | 2025 | arXiv | arXiv | GitHub |
| Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation | 2026 | CVPR | arXiv | GitHub |
| TerraGen: A Unified Multi-Task Layout Generation Framework for Remote Sensing Data Augmentation | 2025 | arXiv | arXiv | N/A |
| Cascaded Autoregressive Diffusion Models for Remote Sensing Scene Generation | 2025 | IEEE TGRS | DOI | N/A |
| AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation | 2025 | CVPR | CVPR | GitHub |
| Multi-Grained Guided Diffusion for Quantity-Controlled Remote Sensing Object Generation | 2025 | IEEE GRSL | DOI | GitHub |
| UP-Diff: Latent Diffusion Model for Remote Sensing Urban Prediction | 2025 | IEEE GRSL | DOI | GitHub |
| Spatial-Aware Remote Sensing Image Generation From Spatial Relationship Descriptions | 2025 | IEEE GRSL | DOI | N/A |
| GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis | 2024 | CVPR | CVPR | GitHub |
| SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | 2024 | CVPR | CVPR | N/A |
| CRS-Diff: Controllable Remote Sensing Image Generation With Diffusion Model | 2024 | IEEE TGRS | DOI | GitHub |
| Diffusion-Geo: A Two-Stage Controllable Text-To-Image Generative Model for Remote Sensing Scenarios | 2024 | IGARSS | DOI | N/A |
| Efficient and Controllable Remote Sensing Fake Sample Generation Based on Diffusion Model | 2023 | IEEE TGRS | DOI | GitHub |
| Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps | 2023 | NeurIPS Workshop (Diffusion Models) | GitHub | |
| UnmixDiff: Unmixing-Based Diffusion Model for Hyperspectral Image Synthesis | 2024 | IEEE TGRS | DOI | GitHub |
| Unmixing Before Fusion: A Generalized Paradigm for Multi-Source-based Hyperspectral Image Synthesis | 2024 | CVPR | CVPR | GitHub |
| Multi-Stage Convolutional Autoencoder Network for Hyperspectral Unmixing | 2022 | Int. J. Appl. Earth Obs. Geoinf. | DOI | GitHub |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| Any2Any: Unified Arbitrary Modality Translation for Remote Sensing | 2026 | arXiv | arXiv | GitHub |
| FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation | 2026 | WACV | arXiv | N/A |
| Adaptive Domain Shift in Diffusion Models for Cross-Modality Image Translation | 2026 | ICLR | OpenReview | GitHub |
| C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation | 2025 | arXiv | arXiv | GitHub |
| Efficient End-to-End Diffusion Model for One-Step SAR-to-Optical Translation | 2025 | IEEE GRSL | DOI | GitHub |
| 3rd Multi-modal Aerial View Image Challenge: Sensor Domain Translation - PBVS 2025 | 2025 | CVPRW (PBVS) | CVPRW | N/A |
| S3OIL: Semi-Supervised SAR-to-Optical Image Translation via Multi-Scale and Cross-Set Matching | 2025 | IEEE TIP | DOI | N/A |
| RLI-DM: Robust Layout-Based Iterative Diffusion Model for SAR-to-RGB Image Translation | 2025 | IEEE TGRS | DOI | N/A |
| DOGAN: DINO-Based Optical-Prior-Driven GAN for SAR-to-Optical Image Translation | 2025 | IEEE TGRS | DOI | N/A |
| CSHNet: A Novel Information Asymmetric Image Translation Method | 2026 | IEEE TCSVT | DOI | GitHub |
| Conditional Diffusion Model With Spatial-Frequency Refinement for SAR-to-Optical Image Translation | 2024 | IEEE TGRS | DOI | N/A |
| HVT-cGAN: Hybrid Vision Transformer cGAN for SAR-to-Optical Image Translation | 2025 | IEEE TGRS | DOI | N/A |
| Conditional Diffusion for SAR to Optical Image Translation | 2024 | IEEE GRSL | DOI | GitHub |
| SAR-to-Optical Image Translation With Hierarchical Latent Features | 2022 | IEEE TGRS | DOI | N/A |
| Gan-based SAR to Optical Image Translation in Fire-Disturbed Regions | 2022 | IGARSS | DOI | N/A |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| HarmoniDiff-RS: Training-Free Diffusion Harmonization for Satellite Image Composition | 2026 | CVPR | arXiv | GitHub |
| DreamCD: A Change-Label-Free Framework for Change Detection via a Weakly Conditional Semantic Diffusion Model in Optical VHR Imagery | 2026 | Int. J. Appl. Earth Obs. Geoinf. | DOI | GitHub |
| RSEdit: Text-Guided Image Editing for Remote Sensing | 2026 | arXiv | arXiv | GitHub |
| Remote Sensing-Oriented World Model | 2025 | arXiv | arXiv | N/A |
| ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing | 2025 | arXiv | arXiv | N/A |
| Exploring Text-Guided Single Image Editing for Remote Sensing Images | 2025 | JSTAR | arXiv, DOI | GitHub |
| Text2Earth: Unlocking Text-Driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model | 2025 | IEEE MGRS | DOI | GitHub |
| DiffusionSat: A Generative Foundation Model for Satellite Imagery | 2024 | ICLR | ICLR | GitHub |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| Generating Any Changes in the Noise Domain | 2025 | IEEE TPAMI | DOI | GitHub |
| ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model | 2025 | AAAI | AAAI | GitHub |
| UniTS: Unified Time Series Generative Model for Remote Sensing | 2025 | arXiv | arXiv | GitHub |
| Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model | 2024 | IEEE TPAMI | DOI | GitHub |
| Scalable Multi-Temporal Remote Sensing Change Data Generation via Simulating Stochastic Change Process | 2023 | ICCV | ICCV | GitHub |
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| DiffuSAM: Diffusion Guided Zero-Shot Object Grounding for Remote Sensing Imagery | 2026 | ICLR ML4RS Workshop | OpenReview | N/A |
| DreamCD: A Change-Label-Free Framework for Change Detection via a Weakly Conditional Semantic Diffusion Model in Optical VHR Imagery | 2026 | Int. J. Appl. Earth Obs. Geoinf. | DOI | GitHub |
| DiffRegCD: Integrated Registration and Change Detection with Diffusion Features | 2026 | WACV | arXiv | GitHub |
| RemoteVAR: Autoregressive Visual Modeling for Remote Sensing Change Detection | 2026 | arXiv | arXiv | GitHub |
| TerraMind: Large-Scale Generative Multimodality for Earth Observation | 2025 | ICCV | ICCV | GitHub |
| Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models? | 2025 | ICCV | ICCV | GitHub |
| DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Remote Sensing Change Detection | 2025 | WACV | arXiv | GitHub |
| Mask Approximation Net: A Novel Diffusion Model Approach for Remote Sensing Change Captioning | 2025 | IEEE TGRS | DOI | GitHub |
| Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images | 2025 | IEEE TGRS | DOI | GitHub |
| DiffDet4SAR: Diffusion-Based Aircraft Target Detection Network for SAR Images | 2024 | IEEE GRSL | DOI | GitHub |
Important
This section mainly lists dataset papers from 2025 onward, with a few earlier references for context.
| Title | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|
| Prithvi-EO-2.0: A Versatile Multitemporal Foundation Model for Earth Observation Applications | 2026 | IEEE TGRS | DOI | GitHub |
| SARLANG-1M: A Benchmark for Vision-Language Modeling in SAR Image Understanding (SAR Vision-Language) | 2026 | IEEE TGRS | DOI | GitHub |
| CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation | 2026 | arXiv | arXiv | GitHub |
| SOMA-1M: A Large-Scale SAR-Optical Multi-resolution Alignment Dataset for Multi-Task Remote Sensing | 2026 | arXiv | arXiv | GitHub |
| BD-CC: Keyword-Guided Building Damage Captioning for Bi-Temporal Remote Sensing Images (KGBDCNet) | 2026 | ISPRS JPRS | DOI | GitHub |
| MaRS: A Multi-Modality Very-High-Resolution Remote Sensing Foundation Model with Cross-Granularity Meta-Modality Learning | 2026 | AAAI | N/A | GitHub |
| ChatEarthBench: Benchmarking Multimodal Large Language Models for Earth Observation | 2026 | IEEE MGRS | DOI | GitHub |
| EarthVL: A Progressive Earth Vision-Language Understanding and Generation Framework | 2026 | arXiv | arXiv | GitHub |
| GAIA: A Global, Multimodal, Multiscale Vision-Language Dataset for Remote Sensing Image Analysis | 2026 | IEEE MGRS | DOI | GitHub |
| A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning | 2026 | Pattern Recognition | DOI | GitHub |
| Remote Sensing Meta Modal Representation for Missing Modality Land Cover Mapping: From EarthMiss Dataset to MetaRS Method | 2025 | Remote Sensing of Environment | DOI | GitHub |
| TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data (SSL, Temporal, Multi-Sensor) | 2025 | CVPR Workshop | CVPRW | HF |
| Sky-SA (SkySense-O): Fine-Grained Open-World Remote Sensing Interpretation Dataset (Vision-Language) | 2025 | CVPR | CVPR | GitHub |
| EarthInstruct (InstructSAM): Instruction-Oriented Object Counting, Detection and Segmentation Benchmark | 2025 | NeurIPS | OpenReview | GitHub |
| SAR-TEXT: A Large-Scale SAR Image-Text Dataset Built with SAR-Narrator and Progressive Transfer Learning (SAR Vision-Language) | 2025 | arXiv | arXiv | GitHub |
| Galileo: Learning Global & Local Features of Many Remote Sensing Modalities | 2025 | ICML | ICML | GitHub |
| EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision (SSL, Temporal, Multi-Sensor) | 2025 | WACV Workshop | WACVW | GitHub |
| Towards a Unified Copernicus Foundation Model for Earth Vision | 2025 | ICCV | ICCV | GitHub |
| OpenEarthMap-SAR: A Benchmark Synthetic Aperture Radar Dataset for Global High-Resolution Land Cover Mapping (SAR Segmentation Maps) | 2025 | IEEE MGRS | DOI | GitHub |
| BRIGHT: A Globally Distributed Multimodal Building Damage Assessment Dataset with Very-High-Resolution for All-Weather Disaster Response (Optical-SAR Pairs) | 2025 | Earth System Science Data | DOI | GitHub |
| A Large-Scale Image–Text Dataset Benchmark for Farmland Segmentation | 2025 | Earth System Science Data | DOI | GitHub |
| CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | 2025 | NeurIPS D&B | OpenReview | GitHub |
| Constructing an Extensible Building Damage (EBD) Dataset via Semi-Supervised Fine-Tuning Across 12 Natural Disasters | 2025 | Journal of Remote Sensing | DOI | N/A |
| DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response | 2025 | NeurIPS D&B | OpenReview | GitHub |
| JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework | 2025 | arXiv | arXiv | GitHub |
| RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark | 2025 | CVPR | DOI | GitHub |
| RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events | 2025 | NeurIPS D&B | OpenReview | GitHub |
| RSGPT: A Remote Sensing Vision Language Model and Benchmark | 2025 | ISPRS JPRS | DOI | GitHub |
| RSVLM-QA: A Benchmark Dataset for Remote Sensing Vision Language Model-Based Question Answering | 2025 | ACM MM | DOI | GitHub |
| S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications | 2025 | CVPR Workshop | CVPRW | N/A |
| Git-10M (Text2Earth): Text-Driven Remote Sensing Image Generation Dataset | 2025 | IEEE MGRS | DOI | GitHub |
| reBen: Refined BigEarthNet Dataset for Remote Sensing Image Analysis | 2025 | IGARSS | DOI | GitHub |
| MMEarth: Exploring Multi-modal Pretext Tasks for Geospatial Representation Learning | 2024 | ECCV | ECCV | GitHub |
| OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping | 2023 | WACV | WACV | GitHub |
Metrics are classified by whether they require paired reference images: paired metrics compare generated output to a reference (e.g., image translation, restoration), while unpaired metrics compare distributions of real vs. generated samples (e.g., unconditional generation).
| Title | Abbr | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|---|
| SAMScore: A Content Structural Similarity Metric for Image Translation Evaluation | SAMScore | 2025 | IEEE TAI | DOI | GitHub |
| Image Quality Assessment: Unifying Structure and Texture Similarity | DISTS | 2022 | IEEE TPAMI | DOI | GitHub |
| The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | LPIPS | 2018 | CVPR | CVPR | GitHub |
| Image Quality Assessment: From Error Visibility to Structural Similarity | PSNR, SSIM | 2004 | IEEE TIP | DOI | GitHub |
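
As a rough illustration of what a paired metric computes, the sketch below implements PSNR in plain Python. It assumes both images have already been flattened into equal-length sequences of pixel values in the range 0 to `max_value`; the function name `psnr` and the toy pixel values are hypothetical and not taken from any of the listed repositories.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two flattened images."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over corresponding pixels (this is what makes it "paired").
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: four pixels of a reference image and a generated image.
reference = [0, 64, 128, 255]
generated = [0, 60, 130, 250]
score = psnr(reference, generated)
print(f"PSNR: {score:.2f} dB")
```

Higher values mean the generated image is closer to its reference pixel by pixel. The perceptual metrics in the table above (LPIPS, DISTS, SAMScore) compare deep features instead of raw pixels.
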
| Title | Abbr | Year | Venue | Paper Link | Code Link |
|---|---|---|---|---|---|
| Making Reconstruction FID Predictive of Diffusion Generation FID | iFID | 2026 | arXiv | arXiv | GitHub |
| Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation | FWD | 2025 | ICLR | arXiv | GitHub |
| Exploring the Design Space of Diffusion Bridge Models | AFD | 2025 | NeurIPS | OpenReview | GitHub |
| Rethinking FID: Towards a Better Evaluation Metric for Image Generation | CMMD | 2024 | CVPR | CVPR | GitHub |
| The Vendi Score: A Diversity Evaluation Metric for Machine Learning | VS | 2023 | TMLR | arXiv | GitHub |
| Generating Images with Sparse Representations | sFID | 2021 | ICML | PMLR | N/A |
| Simplified Fréchet Distance for Generative Adversarial Nets | SFD | 2020 | Sensors | DOI | N/A |
| Improved Precision and Recall Metric for Assessing Generative Models | P&R | 2019 | NeurIPS | NeurIPS | GitHub |
| Demystifying MMD GANs | KID | 2018 | ICLR | OpenReview | GitHub |
| GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium | FID | 2017 | NeurIPS | NeurIPS | GitHub |
| Improved Techniques for Training GANs | IS | 2016 | NeurIPS | NeurIPS | GitHub |
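
To make the unpaired setting concrete, the sketch below computes a Fréchet distance between Gaussians fitted to two sets of feature vectors, under the simplifying assumption that the covariances are diagonal. It is not the FID implementation from the paper above: FID uses Inception features and full covariance matrices. The function name `diagonal_frechet_distance` and the toy feature vectors are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence


def diagonal_frechet_distance(real: Sequence[Sequence[float]],
                              generated: Sequence[Sequence[float]]) -> float:
    """Squared Fréchet distance between Gaussians with diagonal covariance.

    With diagonal covariances, the general formula
    ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2))
    reduces per dimension to (mu_r - mu_g)^2 + (sigma_r - sigma_g)^2.
    """
    dims = len(real[0])
    total = 0.0
    for d in range(dims):
        r = [x[d] for x in real]
        g = [x[d] for x in generated]
        # Population statistics are used here for simplicity (an assumption).
        total += (fmean(r) - fmean(g)) ** 2 + (pstdev(r) - pstdev(g)) ** 2
    return total


# Toy 2-D "features"; in practice these would come from a pretrained encoder.
real_feats = [[0.0, 1.0], [2.0, 3.0]]
gen_feats = [[1.0, 1.0], [3.0, 3.0]]
distance = diagonal_frechet_distance(real_feats, gen_feats)
print(distance)
```

No sample in `gen_feats` is matched to a particular sample in `real_feats`. Only the two distributions are compared, which is why such metrics suit unconditional generation.
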
Contributions are welcome! You can either open a pull request or create an issue.
To the extent possible under law, the maintainers have waived all copyright and related or neighboring rights to this work. See the LICENSE file for the full legal text.