38 commits
aa3e925
update the skeleton
lgyts Oct 29, 2025
da91f04
split the data to balance the ratio
lgyts Oct 31, 2025
bef99a9
add Triplet dataset utils
lgyts Oct 31, 2025
08473e6
update the image transform method
lgyts Nov 1, 2025
eff8fc0
update the loader for test
lgyts Nov 1, 2025
0325de8
update params.py which restore all parameters
lgyts Nov 1, 2025
66426ca
use ResNet50 as backbone for feature extraction
lgyts Nov 1, 2025
c6be584
add 4 layer MLP for binary classification, with LeakyReLU activations
lgyts Nov 1, 2025
40365b3
change the hyperparams and create the train_siamese skeleton
lgyts Nov 1, 2025
6966a33
extract features using trained encoder
lgyts Nov 1, 2025
857b570
train classifier on extracted features
lgyts Nov 1, 2025
dfa4627
create ignore file for data files, and add main function for train.py
lgyts Nov 1, 2025
5711b10
fix the path of dataset, change the display of progress check
lgyts Nov 1, 2025
a991d76
ignore the model file and change the datapath
lgyts Nov 1, 2025
d186476
finish the predict file, load models then predict the class
lgyts Nov 1, 2025
2ef33a3
increase the epoch number
lgyts Nov 2, 2025
e55e170
restructure the file add the commit and move plot function to utils
lgyts Nov 2, 2025
ae5ea4e
first darft of readme
lgyts Nov 2, 2025
4b506fb
upload the images and add the header
lgyts Nov 2, 2025
671ffb9
final update of readme
lgyts Nov 2, 2025
540ef33
Revert "increase the epoch number"
lgyts Nov 3, 2025
849f8f5
backroll to increase epoch patch
lgyts Nov 3, 2025
d52d9e6
add triplet_val in dataset for train to print after each epoch
lgyts Nov 3, 2025
de5cb7d
add the plot for siamese loss and classify loss
lgyts Nov 3, 2025
c1713cd
put plot function in utils.py
lgyts Nov 3, 2025
92b1b45
add color jitter for increased robustness
lgyts Nov 3, 2025
8824dd2
add dropout in classifier try to reduce overfitting
lgyts Nov 3, 2025
74ef342
add L2 normalization and set early stop CLS to prevent overfitting
lgyts Nov 3, 2025
3d891ec
decrease dimension from 1000 to 512
lgyts Nov 3, 2025
3cc79fd
increase max epoch and add the early stop methods for both
lgyts Nov 3, 2025
ccd02d8
update input, output images, split predict functionality from train.p…
lgyts Nov 3, 2025
239b92c
move extract_features() from predict.py and train.py into utils.py
lgyts Nov 3, 2025
f11d27f
centralize and standardize global parameters in params.py, add docstr…
lgyts Nov 3, 2025
bb3aa9d
first draft of Readme
lgyts Nov 3, 2025
342d086
add the images for readme
lgyts Nov 3, 2025
468a688
final draft
lgyts Nov 3, 2025
2998e92
combine future work and summary in readme
lgyts Nov 3, 2025
f8f9326
fix typo in readme
lgyts Nov 6, 2025
10 changes: 0 additions & 10 deletions recognition/README.md

This file was deleted.

5 changes: 5 additions & 0 deletions recognition/siamese/.gitignore
@@ -0,0 +1,5 @@
#ignore dataset files
dataset/
test.py
__pycache__/
models/
262 changes: 262 additions & 0 deletions recognition/siamese/README.md
@@ -0,0 +1,262 @@
# Siamese Network for ISIC 2020 Skin Lesion Classification
**Author:** s4778251

<p align="center">
<img src="./images/Siamese Network.webp" width="400">
</p>

## Description

This repository implements a **Siamese Network** for **binary classification** of dermoscopic images from the **ISIC 2020 Challenge** dataset (melanoma vs. benign).
The approach first trains a **Siamese encoder** using **Triplet Margin Loss** to learn a discriminative embedding space, and then trains a **binary classifier** (4-layer MLP) on top of frozen embeddings for final predictions.
The implementation follows a modular design, with configuration centralized in `params.py`, dataset management in `dataset.py`, and the main training logic in `train.py`.



## How It Works

### Siamese Encoder
- Backbone: **ResNet-50** pretrained on ImageNet.
- The final fully connected layer is replaced by a **512-dimensional projection head**.
- Embeddings are **L2-normalized** to enforce metric consistency.
- Optimized with **Triplet Margin Loss**, which pulls anchor-positive pairs together and pushes anchor-negative pairs apart until the negative is at least a margin farther from the anchor than the positive.
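
The loss on L2-normalized embeddings can be illustrated with a minimal, dependency-free sketch. It computes `max(d(a, p) - d(a, n) + margin, 0)` for a single triplet using Euclidean distance. The margin value and the toy 2-D vectors are assumptions for illustration only; the actual margin is whatever `params.py` defines, and the real embeddings are 512-dimensional.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Loss for one triplet: max(d(a, p) - d(a, n) + margin, 0)."""
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    return max(d_ap - d_an + margin, 0.0)


# Toy 2-D embeddings (hypothetical values).
anchor = l2_normalize([1.0, 0.0])
positive = l2_normalize([0.9, 0.1])
negative = l2_normalize([0.0, 1.0])

# The negative is already far away, so the triplet contributes no loss.
easy = triplet_margin_loss(anchor, positive, negative, margin=0.5)
# Swapping positive and negative gives a violated triplet with positive loss.
hard = triplet_margin_loss(anchor, negative, positive, margin=0.5)
```

Because the embeddings have unit length, every pairwise distance lies in the range 0 to 2, which keeps the margin on a fixed, interpretable scale.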

### Binary Classifier
- Takes embeddings extracted from the Siamese encoder as input.
- Composed of two hidden layers: 256 → 64 units.
- Uses **LeakyReLU activation** and **Dropout (p=0.4)** for regularization.
- Trained with **CrossEntropyLoss** to distinguish between benign and malignant samples.
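
A classifier head matching this description might look roughly like the sketch below. The class name `BinaryClassifierSketch`, the exact layer order, and the 512-dimensional input are assumptions inferred from the bullets above; the actual definition lives in `modules.py` and may differ.

```python
import torch
import torch.nn as nn


class BinaryClassifierSketch(nn.Module):
    """Hypothetical MLP head: 512 -> 256 -> 64 -> 2 logits."""

    def __init__(self, in_dim=512, hidden1=256, hidden2=64, num_classes=2, p_drop=0.4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden1),
            nn.LeakyReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden1, hidden2),
            nn.LeakyReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden2, num_classes),  # raw logits for CrossEntropyLoss
        )

    def forward(self, x):
        return self.net(x)


model = BinaryClassifierSketch().eval()  # eval() disables dropout
# Random stand-ins for L2-normalized encoder embeddings.
embeddings = nn.functional.normalize(torch.randn(4, 512), dim=1)
with torch.no_grad():
    logits = model(embeddings)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 0, 1]))
```

`CrossEntropyLoss` applies the softmax internally, so the last layer outputs raw logits rather than probabilities.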

### Evaluation
- After training, the encoder and classifier are evaluated on the test set.
- The model reports overall accuracy, confusion matrix, and per-class precision, recall, and F1-score.
- All plots (training curves, confusion matrix) are saved under `./images/`.
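
For reference, the reported metrics follow directly from the confusion matrix. The sketch below derives them by hand, assuming the scikit-learn convention that rows are true classes and columns are predicted classes. The matrix values are a toy example, not results from this project.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    report = {}
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[k] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Toy matrix: 10 benign (class 0) and 10 malignant (class 1) samples.
toy_cm = [[8, 2],
          [1, 9]]
accuracy, report = per_class_metrics(toy_cm)
```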



## Project Structure

```
siamese/
├── dataset.py              # Data loading and preprocessing pipeline
├── modules.py              # Model definitions (SiameseEncoder, BinaryClassifier)
├── train.py                # Training pipeline for Siamese and classifier networks
├── predict.py              # Evaluation and testing (confusion matrix, metrics)
├── utils.py                # Utility functions for plotting, saving samples, feature extraction, etc.
├── params.py               # Global configuration (hyperparameters, paths, augmentation, etc.)
├── models/                 # Folder for saved models (.pth)
│   ├── siamese.pth
│   └── classifier.pth
├── images/                 # Folder for saved output figures
│   ├── siamese_loss.png
│   ├── classifier_loss.png
│   ├── confusion_matrix.png
│   └── input_sample.png
└── dataset/                # Dataset
    ├── train-image/
    └── train-metadata.csv
```


## File Explanations

- **params.py** – Stores all global variables and hyperparameters, including dataset paths, image preprocessing, model dimensions, and training settings.
- **dataset.py** – Defines dataset classes, data augmentation, and loaders for both triplet and classification tasks.
- **modules.py** – Contains the model definitions: the Siamese encoder (ResNet-50) and binary classifier (4-layer MLP).
- **utils.py** – Includes helper functions for plotting, saving figures, feature extraction, and directory creation.
- **train.py** – Main training script that trains the Siamese encoder, extracts embeddings, and trains the classifier.
- **predict.py** – Evaluation script that loads trained models, computes predictions, and saves the confusion matrix.



## Dependencies
Tested on Google Colab (CUDA 12.6).

| Package      | Version      |
|--------------|--------------|
| torch        | 2.8.0+cu126  |
| torchvision  | 0.23.0+cu126 |
| numpy        | 2.0.2        |
| pandas       | 2.2.2        |
| matplotlib   | 3.10.0       |
| scikit-learn | 1.6.1        |


## Data Preprocessing

- Input: **256×256 RGB** dermoscopic images (`train-image/`)
- Metadata: `train-metadata.csv` (containing `isic_id`, `patient_id`, `target`)
- Split: **70% train / 10% validation / 20% test**, grouped by patient ID to prevent data leakage.
- Normalization: `mean = [0.5, 0.5, 0.5]`, `std = [0.5, 0.5, 0.5]`.
- Augmentation: random rotations, color jitter, horizontal/vertical flips.

All preprocessing configurations and split ratios are defined in `params.py` for reproducibility.
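
As a quick check on the normalization constants: with a mean and standard deviation of 0.5 per channel, pixel values already scaled to [0, 1] are mapped to [-1, 1]. The helper below is a hypothetical stand-in for the per-channel arithmetic, not the project's transform code.

```python
MEAN = [0.5, 0.5, 0.5]
STD = [0.5, 0.5, 0.5]


def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel; rgb values are in [0, 1]."""
    return [(c - m) / s for c, m, s in zip(rgb, MEAN, STD)]


black = normalize_pixel([0.0, 0.0, 0.0])     # lower end of the range
white = normalize_pixel([1.0, 1.0, 1.0])     # upper end of the range
mid_gray = normalize_pixel([0.5, 0.5, 0.5])  # centre of the range
```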

### Justification of Data Splits
A 70 / 10 / 20 (train / validation / test) split was selected to maintain a balance between model generalization and evaluation stability.
Group-based splitting by `patient_id` prevents data leakage between training and test sets, as multiple images can originate from the same patient.
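
One way to implement a patient-grouped split is sketched below. The function name `split_by_patient`, the seed, and the synthetic records are assumptions for illustration. The sketch applies the ratios at the patient level (so image-level ratios are only approximate) and does not attempt any class balancing.

```python
import random
from collections import defaultdict


def split_by_patient(records, ratios=(0.7, 0.1, 0.2), seed=42):
    """records: list of (isic_id, patient_id, target) tuples."""
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec[1]].append(rec)

    # Shuffle patients, not images, so each patient lands in exactly one split.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_train = round(ratios[0] * len(patients))
    n_val = round(ratios[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    return {name: [r for p in ids for r in by_patient[p]]
            for name, ids in groups.items()}


# Synthetic metadata: 100 images from 20 patients (5 images each).
records = [(f"ISIC_{i:04d}", f"IP_{i % 20:03d}", i % 2) for i in range(100)]
splits = split_by_patient(records)
patients_in = {name: {r[1] for r in rows} for name, rows in splits.items()}
```

Checking that the patient sets of the three splits are pairwise disjoint is a cheap sanity test for leakage.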



## Training and Testing

All experiments were run on a **Google Colab A100** GPU runtime.
The commands below are written for Colab notebook cells (`%cd` and `!` are notebook syntax). Before running them, make sure the working directory is set to the project folder.


### Train Both Networks
```
%cd /content/siamese
!python train.py
```

#### This command will:
- Train the Siamese encoder using **Triplet Margin Loss**
- Extract embeddings from the encoder
- Train the binary classifier using **CrossEntropyLoss**
- Save model weights and training plots under `./models/` and `./images/`


### Evaluate on Test Set
```
%cd /content/siamese
!python predict.py
```

#### This command loads the trained models and:
- Evaluates performance on the test dataset
- Computes accuracy, precision, recall, and F1-score
- Generates and saves the confusion matrix as `./images/confusion_matrix.png`



## Visual Results

**1. Siamese Network Training Loss**
<p align="center">
<img src="./images/siamese_loss.png" width="450">
</p>
The triplet loss of the Siamese encoder steadily decreases during training, showing that the network effectively learns to minimize distances between similar image pairs while separating dissimilar ones.

---

**2. Binary Classifier Loss**
<p align="center">
<img src="./images/classifier_loss.png" width="450">
</p>
The cross-entropy loss declines on both the training and validation sets, indicating stable convergence.
Validation loss flattens near the end while training loss keeps falling (0.17 vs. 0.46 at epoch 33 in the training log), so the classifier is beginning to overfit by the time early stopping ends training.

---

**3. Confusion Matrix**
<p align="center">
<img src="./images/confusion_matrix.png" width="350">
</p>
The confusion matrix shows that the classifier correctly identifies roughly four out of five lesions in each class.
Errors are split fairly evenly between false positives and false negatives, so neither class dominates the misclassifications.

---

**Sample Input Example**
<p align="center">
<img src="./images/input_sample.png" width="220">
</p>
This sample dermoscopic image was randomly **rotated** and **color-adjusted** as part of data augmentation.
Such transformations increase dataset diversity and improve model robustness to variations in image orientation and illumination.



## Training & Evaluation Logs

Below are condensed console outputs from **train.py** and **predict.py**.
They demonstrate proper training convergence, early stopping, and final evaluation results.

### Training Log (`train.py`)
The Siamese encoder stops early once its validation loss plateaus,
while the classifier converges smoothly to **82.65% validation accuracy**.

```
Device: cuda
[INFO] Loaded 33126 samples from train-metadata.csv
[Siamese] Epoch 1/100 train_loss=0.9653 val_loss=0.8922
[Siamese] Epoch 2/100 train_loss=0.8287 val_loss=0.6524
[Siamese] Epoch 3/100 train_loss=0.6778 val_loss=0.6933
[Siamese] Epoch 4/100 train_loss=0.5562 val_loss=0.6903
.
.
.
[Siamese] Early stopping at epoch 14
[INFO] Saved final Siamese encoder (stopped model).
[INFO] Extracting embeddings...
[Extract] 100.0% complete
[CLS] Epoch 1/80 train_loss=0.6952 val_loss=0.6876 val_acc=50.00%
[CLS] Epoch 5/80 train_loss=0.6495 val_loss=0.6600 val_acc=50.00%
[CLS] Epoch 10/80 train_loss=0.5977 val_loss=0.6255 val_acc=81.63%
[CLS] Epoch 20/80 train_loss=0.4247 val_loss=0.5239 val_acc=81.63%
[CLS] Epoch 28/80 train_loss=0.2580 val_loss=0.4580 val_acc=82.65%
[CLS] Epoch 33/80 train_loss=0.1685 val_loss=0.4575 val_acc=82.65%
[CLS] Early stopping at epoch 35
[INFO] Saved final classifier (stopped model).
[INFO] Training finished. All results saved to ./images
```
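
The early-stopping behaviour visible in the log can be sketched with a simple patience counter. The class name `EarlyStopper`, the patience value, and the toy loss sequence are assumptions; the actual criteria are defined in `train.py` and `params.py`.

```python
class EarlyStopper:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy validation losses (hypothetical, not taken from the log above).
val_losses = [0.90, 0.70, 0.72, 0.71, 0.73]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this toy run the best loss occurs at epoch 2 and training stops three epochs later. A common variant also checkpoints the weights at the best epoch rather than at the stopping epoch.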


### Evaluation Log (`predict.py`)
After loading the trained models, the classifier achieved **80.51%** test accuracy (0.81 when rounded in the classification report), with similar precision and recall for both classes.

```
/content/siamese
Device: cuda
[INFO] Loaded 33126 samples from train-metadata.csv
[INFO] Extracting test features...
[Extract] 100.0% complete
[TEST] Accuracy: 80.51%
[TEST] Confusion Matrix:
[[113 23]
[ 30 106]]

[TEST] Classification Report:
              precision    recall  f1-score   support

   benign(0)       0.80      0.82      0.81       136
malignant(1)       0.81      0.79      0.80       136

    accuracy                           0.81       272
   macro avg       0.81      0.81      0.81       272
weighted avg       0.81      0.81      0.81       272

[INFO] Saved confusion_matrix.png to: ./images
```



## Discussion and Future Work

The Siamese encoder successfully learned a discriminative embedding space, as reflected by the steadily decreasing triplet loss during training.
However, the validation loss showed noticeable oscillation, suggesting that the triplet sampling strategy may not consistently produce informative anchor–positive–negative pairs.
While the classifier achieved stable convergence and balanced performance (precision and recall ≈ 0.8), the overall accuracy plateaued around 81–82%, indicating that generalization to unseen samples remains limited.

Several factors may explain these observations:
- The dataset exhibits **class imbalance** and **intra-class variability**, which can make triplet formation unstable.
- The **triplet margin** and **sampling strategy** were fixed throughout training, potentially limiting the diversity of hard examples.

**Future Work**
- Implement **hard or semi-hard negative mining** to improve triplet selection and reduce validation fluctuation.
- Explore **alternative metric learning losses** (e.g., ArcFace, Contrastive Loss) to enhance inter-class margins and improve embedding quality.
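
As a rough illustration of the first item, semi-hard mining selects a negative that is farther from the anchor than the positive but still inside the margin, i.e. `d(a, p) < d(a, n) < d(a, p) + margin`. The function name, the margin, and the toy vectors below are hypothetical. A practical version would operate on batches of embeddings rather than Python lists.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pick_semi_hard_negative(anchor, positive, negatives, margin=0.5):
    """Return the index of the closest semi-hard negative, or None if none exists."""
    d_ap = euclidean(anchor, positive)
    best_idx, best_dist = None, float("inf")
    for i, neg in enumerate(negatives):
        d_an = euclidean(anchor, neg)
        # Semi-hard: beyond the positive, but still violating the margin.
        if d_ap < d_an < d_ap + margin and d_an < best_dist:
            best_idx, best_dist = i, d_an
    return best_idx


anchor = [0.0, 0.0]
positive = [0.3, 0.0]
negatives = [
    [0.1, 0.0],  # hard: closer than the positive
    [0.5, 0.0],  # semi-hard
    [0.7, 0.0],  # semi-hard, but farther away
    [2.0, 0.0],  # easy: already outside the margin
]
idx = pick_semi_hard_negative(anchor, positive, negatives, margin=0.5)
no_match = pick_semi_hard_negative(anchor, positive, [[2.0, 0.0]], margin=0.5)
```

Easy negatives contribute zero loss and therefore no gradient, which is why restricting training to margin-violating triplets tends to give a more informative signal.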


## References

1. **ISIC 2020 Challenge Dataset** – *SIIM-ISIC Melanoma Classification* (Kaggle):
https://www.kaggle.com/datasets/nischaydnk/isic-2020-jpg-256x256-resized/data

2. **Triplet Margin Loss (PyTorch Documentation)** –
https://pytorch.org/docs/stable/generated/torch.nn.TripletMarginLoss.html

3. **CrossEntropy Loss (PyTorch Documentation)** –
https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html

4. **G. Koch, R. Zemel, and R. Salakhutdinov**,
*Siamese Neural Networks for One-Shot Image Recognition*,
in *ICML Deep Learning Workshop*, 2015.