Official Repository for "Thermal Chameleon Net: Task-Adaptive Tone-mapping for Thermal-Infrared images", Robotics and Automation Letters (RA-L).
Above is a picture of a thermal chameleon that we made using DALL·E.
TLDR: We propose a new task-adaptive learnable tone-mapping network for 14-bit (RAW) thermal infrared images.
Thermal Infrared (TIR) imaging provides robust perception for navigation in challenging outdoor environments, but suffers from poor texture and low image contrast due to its 14/16-bit format. Conventional pipelines apply various tone-mapping methods to enhance the contrast and photometric consistency of TIR images; however, choosing a tone-mapping method that works well depends heavily on knowing the task and temperature-dependent priors. In this paper, we present the Thermal Chameleon Network (TCNet), a task-adaptive tone-mapping approach for RAW 14-bit TIR images. Given the same image, TCNet tone-maps different representations of TIR images tailored to each specific task, eliminating heuristic image-rescaling preprocessing and the reliance on extensive prior knowledge of the scene temperature or task-specific characteristics. TCNet exhibits improved generalization performance across object detection and monocular depth estimation, with minimal computational overhead and modular integration into existing architectures for various tasks.
Too long to read? Here's a TL;DR
Don't spend time searching for a single tone-mapping of thermal images that works well for all tasks; let the network do it for you, optimized for each task!
Just like the name states, our work aims to create a task-adaptive network that operates directly on 14-bit thermal images.
Our method is divided into two stages:
- Multichannel thermal embedding: essentially a tool that represents each absolute temperature value (in Celsius) as a set of feature vectors.
- Adaptive channel compression network: feeding many multichannel embeddings directly does not always work well and incurs a high computational cost. More importantly, transfer learning cannot be used this way, since pretrained backbones are optimized for 3-channel inputs. This module enables both by compressing only the features valid for the downstream task into a three-channel representation.
In essence, the network assigns task-adaptive weights to each thermal embedding channel, optimized and controlled by the loss functions of the downstream task.
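The two stages above can be sketched in PyTorch. This is our own illustrative reconstruction, not the repository's code: the function names, the Gaussian temperature-bin encoding, the number of embedding channels, and the temperature range are all assumptions chosen for clarity.

```python
import torch


def thermal_embedding(raw14, num_channels=8, t_min=-20.0, t_max=120.0):
    """Map each 14-bit raw thermal pixel to a multichannel embedding.

    Hypothetical sketch: converts raw counts to Celsius with an assumed
    linear radiometric calibration (sensor dependent), then encodes each
    temperature as soft memberships to `num_channels` Gaussian bins.
    Input: (B, H, W) integer tensor; output: (B, num_channels, H, W).
    """
    temp_c = raw14.float() * (t_max - t_min) / (2**14 - 1) + t_min
    centers = torch.linspace(t_min, t_max, num_channels)  # bin centers (C,)
    sigma = (t_max - t_min) / num_channels
    diff = temp_c.unsqueeze(1) - centers.view(1, -1, 1, 1)
    return torch.exp(-0.5 * (diff / sigma) ** 2)  # soft one-hot over bins


class ChannelCompression(torch.nn.Module):
    """Compress C embedding channels into a 3-channel tone-mapped image so
    standard ImageNet backbones apply. The 1x1 conv weights are learned
    end-to-end from the downstream task loss, making the tone-mapping
    task-adaptive."""

    def __init__(self, in_ch=8):
        super().__init__()
        self.conv = torch.nn.Conv2d(in_ch, 3, kernel_size=1)

    def forward(self, x):
        return torch.sigmoid(self.conv(x))  # keep output in [0, 1]
```

The compressed 3-channel output can then be fed to any off-the-shelf detector or depth network in place of an RGB image.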
- All settings use a ResNet-50 backbone unless specified otherwise.
- All models were trained on an NVIDIA RTX 4090 (RTX A6000 for YOLOX).
- All models were trained for 500 epochs, with weights saved after each epoch; we report the best epoch on the validation set.
RetinaNet
- Warm up epoch: 10
- Batch size: 16
- Optimizer: AdamW
- Base lr: $1.5 \times 10^{-4}$
- Scheduler: Cosine annealing
- Data augmentation: Random horizontal flip
- Pretraining?: No (Trained from scratch)
YOLOX
- Warm up epoch: 5
- Batch size: 32
- Optimizer: SGD with momentum of 0.9
- Weight decay: 0.05
- Base lr: $1.5625 \times 10^{-4}$
- Scheduler: Cosine annealing
- Data augmentation: Random horizontal flip, Random mosaic, Random mixup
- Pretraining?: No (trained from scratch). All other settings are essentially identical to the original YOLOX implementation.
Sparse-RCNN
Implemented on MMDetection
- Warm up iterations: 1000 iterations
- Batch size: 16
- Optimizer: AdamW
- Weight decay: 0.0001
- Base lr: $2.5 \times 10^{-4}$
- Scheduler: Cosine annealing
- Data augmentation: Random horizontal flip, Random mosaic, Random mixup
- Pretraining?: Yes (ImageNet pretraining). For the thermal embedding, we averaged the three-channel weights of the first conv layer and copied the average to all input channels.
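The first-conv adaptation described above can be sketched as follows. This is a hedged reconstruction: the function name `inflate_first_conv` and the assumption that the backbone exposes its first conv as `conv1` (as torchvision's ResNet does) are ours, not the repository's.

```python
import torch


def inflate_first_conv(model, in_channels):
    """Adapt an ImageNet-pretrained backbone's first conv layer to accept
    N-channel thermal embeddings: average the pretrained RGB weights and
    replicate the mean across all input channels, preserving the response
    magnitude of the pretrained filters.

    Assumes the first conv is exposed as `model.conv1` (true for
    torchvision ResNets); adjust the attribute name for other backbones.
    """
    old = model.conv1  # e.g. ResNet-50: weight shape (64, 3, 7, 7)
    new = torch.nn.Conv2d(
        in_channels, old.out_channels,
        kernel_size=old.kernel_size, stride=old.stride,
        padding=old.padding, bias=old.bias is not None,
    )
    with torch.no_grad():
        mean_w = old.weight.mean(dim=1, keepdim=True)  # (out, 1, k, k)
        new.weight.copy_(mean_w.repeat(1, in_channels, 1, 1))
        if old.bias is not None:
            new.bias.copy_(old.bias)
    model.conv1 = new
    return model
```

After this swap, the rest of the pretrained backbone is reused unchanged and fine-tuned with the task loss.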
Monodepth-Thermal
- Batch size: 4
- Optimizer: Adam
- Base lr: $1.5 \times 10^{-4}$
- Scheduler: Cosine annealing
- Data augmentation: Random horizontal flip/Random crop
- Pretraining?: Yes (ImageNet Pretraining)
We followed all protocols and most settings used in this repo: https://github.com/UkcheolShin/ThermalMonoDepth
Please consider citing the paper as:
@article{dglee-2024-tcnet,
  author  = {Dong-Guw Lee and Jeongyun Kim and Younggun Cho and Ayoung Kim},
  title   = {Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images},
  journal = {IEEE Robotics and Automation Letters (RA-L)},
  year    = {2024},
}
If you have any urgent questions or issues that need to be resolved, please contact me by email.
donkeymouse@snu.ac.kr