diff --git a/recognition/README.md b/recognition/README.md
deleted file mode 100644
index 32c99e899..000000000
--- a/recognition/README.md
+++ /dev/null
@@ -1,10 +0,0 @@
-# Recognition Tasks
-Various recognition tasks solved in deep learning frameworks.
-
-Tasks may include:
-* Image Segmentation
-* Object detection
-* Graph node classification
-* Image super resolution
-* Disease classification
-* Generative modelling with StyleGAN and Stable Diffusion
\ No newline at end of file
diff --git a/recognition/siamese/.gitignore b/recognition/siamese/.gitignore
new file mode 100644
index 000000000..1bc2eea3a
--- /dev/null
+++ b/recognition/siamese/.gitignore
@@ -0,0 +1,5 @@
+# Ignore local data, scratch files, caches, and saved models
+dataset/
+test.py
+__pycache__/
+models/
\ No newline at end of file
diff --git a/recognition/siamese/README.md b/recognition/siamese/README.md
new file mode 100644
index 000000000..0a5bb8261
--- /dev/null
+++ b/recognition/siamese/README.md
@@ -0,0 +1,262 @@
+# Siamese Network for ISIC 2020 Skin Lesion Classification
+**Author:** s4778251
+
+
+![Siamese network overview](images/Siamese%20Network.webp)
+
+## Description
+
+This repository implements a **Siamese Network** for **binary classification** of dermoscopic images from the **ISIC 2020 Challenge** dataset (melanoma vs. benign).
+The approach first trains a **Siamese encoder** with **Triplet Margin Loss** to learn a discriminative embedding space, and then trains a **binary classifier** (an MLP with two hidden layers) on top of the frozen embeddings for the final prediction.
+The implementation follows a modular design, with configuration centralized in `params.py`, dataset management in `dataset.py`, and the main training logic in `train.py`.
+
+
+
+## How It Works
+
+### Siamese Encoder
+- Backbone: **ResNet-50** pretrained on ImageNet.
+- The final fully connected layer is replaced by a **512-dimensional projection head**.
+- Embeddings are **L2-normalized** to enforce metric consistency.
+- Optimized with **Triplet Margin Loss**, which pulls anchor–positive pairs together while pushing anchor–negative pairs at least a margin further apart (see the sketch below).
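+
+A condensed sketch of the encoder and a single triplet-loss step, mirroring `modules.py` and `train.py` (the batch here is random dummy data, and `weights=None` stands in for the ImageNet initialization used in the real code):
+
+```python
+import torch
+import torch.nn as nn
+import torchvision.models as models
+
+# ResNet-50 backbone with the classification head replaced by a projection
+backbone = models.resnet50(weights=None)  # modules.py uses ResNet50_Weights.DEFAULT
+backbone.fc = nn.Identity()
+proj = nn.Linear(2048, 512)  # 512-d embedding (OUT_DIM in params.py)
+
+def embed(x):
+    # L2-normalize so all embeddings lie on the unit hypersphere
+    return nn.functional.normalize(proj(backbone(x)), p=2, dim=1)
+
+criterion = nn.TripletMarginLoss(margin=1.0, p=2)
+anchor, positive, negative = (torch.randn(4, 3, 256, 256) for _ in range(3))
+loss = criterion(embed(anchor), embed(positive), embed(negative))
+```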
+
+### Binary Classifier
+- Takes embeddings extracted from the Siamese encoder as input.
+- Composed of two hidden layers (256 → 64 units) plus a 2-unit output layer.
+- Uses **LeakyReLU activation** and **Dropout (p=0.4)** for regularization.
+- Trained with **CrossEntropyLoss** to distinguish between benign and malignant samples (see the sketch below).
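+
+The head itself is small; a minimal equivalent, with hidden sizes, LeakyReLU slope, and dropout taken from `params.py`:
+
+```python
+import torch.nn as nn
+
+classifier = nn.Sequential(
+    nn.Linear(512, 256), nn.LeakyReLU(0.01, inplace=True), nn.Dropout(0.4),
+    nn.Linear(256, 64),  nn.LeakyReLU(0.01, inplace=True), nn.Dropout(0.4),
+    nn.Linear(64, 2),    # two logits: benign vs. malignant
+)
+```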
+
+### Evaluation
+- After training, the encoder and classifier are evaluated on the test set.
+- The model reports overall accuracy, a confusion matrix, and per-class precision, recall, and F1-score (computed with scikit-learn; see the snippet below).
+- All plots (training curves, confusion matrix) are saved under `./images/`.
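+
+The metrics come from standard scikit-learn calls, as in `predict.py`; `y_true` / `y_pred` below are placeholder lists standing in for the test labels and argmax predictions:
+
+```python
+from sklearn.metrics import confusion_matrix, classification_report
+
+y_true, y_pred = [0, 0, 1, 1], [0, 1, 1, 1]  # placeholders
+print(confusion_matrix(y_true, y_pred))
+print(classification_report(y_true, y_pred, target_names=["benign(0)", "malignant(1)"]))
+```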
+
+
+
+## Project Structure
+
+```
+siamese/
+├── dataset.py # Data loading and preprocessing pipeline
+├── modules.py # Model definitions (SiameseEncoder, BinaryClassifier)
+├── train.py # Training pipeline for Siamese and classifier networks
+├── predict.py # Evaluation and testing (confusion matrix, metrics)
+├── utils.py # Utility functions for plotting, saving samples, feature extraction, etc.
+├── params.py # Global configuration (hyperparameters, paths, augmentation, etc.)
+├── models/ # Folder for saved models (.pth)
+│   ├── siamese.pth
+│   └── classifier.pth
+├── images/ # Folder for saved output figures
+│   ├── siamese_loss.png
+│   ├── classifier_loss.png
+│   ├── confusion_matrix.png
+│   └── input_sample.png
+└── dataset/ # Dataset (ignored by git)
+    ├── train-image/
+    └── train-metadata.csv
+```
+
+
+## File Explanations
+
+- **params.py** – Stores all global variables and hyperparameters, including dataset paths, image preprocessing, model dimensions, and training settings.
+- **dataset.py** – Defines dataset classes, data augmentation, and loaders for both triplet and classification tasks.
+- **modules.py** – Contains the model definitions: the Siamese encoder (ResNet-50 backbone) and the MLP binary classifier.
+- **utils.py** – Includes helper functions for plotting, saving figures, feature extraction, and directory creation.
+- **train.py** – Main training script that trains the Siamese encoder, extracts embeddings, and trains the classifier.
+- **predict.py** – Evaluation script that loads trained models, computes predictions, and saves the confusion matrix.
+
+
+
+## Dependencies
+Tested on Google Colab (CUDA 12.6).
+
+| Package        | Version        |
+|----------------|----------------|
+| torch          | 2.8.0+cu126    |
+| torchvision    | 0.23.0+cu126   |
+| numpy          | 2.0.2          |
+| pandas         | 2.2.2          |
+| matplotlib     | 3.10.0         |
+| scikit-learn   | 1.6.1          |
+
+
+## Data Preprocessing
+
+- Input: **256×256 RGB** dermoscopic images (`train-image/`)
+- Metadata: `train-metadata.csv` (containing `isic_id`, `patient_id`, `target`)
+- Split: **70% train / 10% validation / 20% test**, grouped by patient ID to prevent data leakage.
+- Normalization: `mean = [0.5, 0.5, 0.5]`, `std = [0.5, 0.5, 0.5]`.
+- Augmentation: random rotations (up to ±15°), color jitter, and horizontal/vertical flips.
+
+All preprocessing configurations and split ratios are defined in `params.py` for reproducibility; a condensed version of the training pipeline is sketched below.
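+
+A condensed version of the training-time pipeline, with the values from `params.py` (no `Resize` step is needed because the images are already 256×256):
+
+```python
+import torchvision.transforms as T
+
+train_tfm = T.Compose([
+    T.RandomHorizontalFlip(p=0.5),
+    T.RandomVerticalFlip(p=0.5),
+    T.RandomRotation(degrees=15),
+    T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.05, hue=0.02),
+    T.ToTensor(),
+    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
+])
+```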
+
+### Justification of Data Splits
+A 70 / 10 / 20 (train / validation / test) split was selected to maintain a balance between model generalization and evaluation stability.
+Group-based splitting by `patient_id` prevents data leakage between training and test sets, since multiple images can originate from the same patient (a sketch of the grouping follows below).
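+
+The grouping relies on scikit-learn's `GroupShuffleSplit`, as in `dataset.py`; a self-contained sketch with a toy metadata table:
+
+```python
+import pandas as pd
+from sklearn.model_selection import GroupShuffleSplit
+
+# Toy stand-in for train-metadata.csv: all images from one patient
+# must land in exactly one split.
+df = pd.DataFrame({
+    "isic_id": ["a", "b", "c", "d"],
+    "patient_id": ["p1", "p1", "p2", "p3"],
+    "target": [0, 0, 1, 0],
+})
+gss = GroupShuffleSplit(n_splits=1, train_size=0.7, random_state=42)
+train_idx, temp_idx = next(gss.split(df, df["target"], groups=df["patient_id"]))
+```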
+
+
+
+## Training and Testing
+
+All experiments were conducted on a **Google Colab A100** GPU.
+Before running, ensure that the working directory is correctly set to the project folder.
+
+
+### Train Both Networks
+```
+%cd /content/siamese
+!python train.py
+```
+
+#### This command will:
+- Train the Siamese encoder using **Triplet Margin Loss**
+- Extract embeddings from the encoder
+- Train the binary classifier using **CrossEntropyLoss**
+- Save model weights and training plots under `./models/` and `./images/`
+
+
+### Evaluate on Test Set
+```
+%cd /content/siamese
+!python predict.py
+```
+
+#### This command loads the trained models and:
+- Evaluates performance on the test dataset
+- Computes accuracy, precision, recall, and F1-score
+- Generates and saves the confusion matrix as `./images/confusion_matrix.png`
+
+
+
+## Visual Results
+
+**1. Siamese Network Training Loss**
+
+![Siamese training loss](images/siamese_loss.png)
+
+The triplet loss of the Siamese encoder steadily decreases during training, showing that the network effectively learns to minimize distances between similar image pairs while separating dissimilar ones.
+
+---
+
+**2. Binary Classifier Loss**
+
+![Binary classifier loss](images/classifier_loss.png)
+
+The CrossEntropy loss for both training and validation sets consistently declines, indicating stable convergence.
+Validation loss flattens near the end, suggesting moderate generalization with minimal overfitting.
+
+---
+
+**3. Confusion Matrix**
+
+![Confusion matrix](images/confusion_matrix.png)
+
+The confusion matrix demonstrates that the classifier correctly identifies most benign and malignant lesions.
+Diagonal dominance confirms strong predictive performance and well-learned decision boundaries.
+
+---
+
+**Sample Input Example**
+
+![Sample input](images/input_sample.png)
+
+This sample dermoscopic image was randomly **rotated** and **color-adjusted** as part of data augmentation.
+Such transformations increase dataset diversity and improve model robustness to variations in image orientation and illumination.
+
+
+
+## Training & Evaluation Logs
+
+Below are condensed console outputs from **train.py** and **predict.py**.
+They demonstrate proper training convergence, early stopping, and final evaluation results.
+
+### Training Log (`train.py`)
+The Siamese encoder stops early due to validation loss plateauing,
+while the classifier converges smoothly to around **82% validation accuracy**.
+
+```
+Device: cuda
+[INFO] Loaded 33126 samples from train-metadata.csv
+[Siamese] Epoch 1/100 train_loss=0.9653 val_loss=0.8922
+[Siamese] Epoch 2/100 train_loss=0.8287 val_loss=0.6524
+[Siamese] Epoch 3/100 train_loss=0.6778 val_loss=0.6933
+[Siamese] Epoch 4/100 train_loss=0.5562 val_loss=0.6903
+.
+.
+.
+[Siamese] Early stopping at epoch 14
+[INFO] Saved final Siamese encoder (stopped model).
+[INFO] Extracting embeddings...
+[Extract] 100.0% complete
+[CLS] Epoch 1/80 train_loss=0.6952 val_loss=0.6876 val_acc=50.00%
+[CLS] Epoch 5/80 train_loss=0.6495 val_loss=0.6600 val_acc=50.00%
+[CLS] Epoch 10/80 train_loss=0.5977 val_loss=0.6255 val_acc=81.63%
+[CLS] Epoch 20/80 train_loss=0.4247 val_loss=0.5239 val_acc=81.63%
+[CLS] Epoch 28/80 train_loss=0.2580 val_loss=0.4580 val_acc=82.65%
+[CLS] Epoch 33/80 train_loss=0.1685 val_loss=0.4575 val_acc=82.65%
+[CLS] Early stopping at epoch 35
+[INFO] Saved final classifier (stopped model).
+[INFO] Training finished. All results saved to ./images
+```
+
+
+### Evaluation Log (`predict.py`)
+After loading the trained models, the classifier achieves **80.51%** test accuracy with balanced precision and recall.
+
+```
+/content/siamese
+Device: cuda
+[INFO] Loaded 33126 samples from train-metadata.csv
+[INFO] Extracting test features...
+[Extract] 100.0% complete
+[TEST] Accuracy: 80.51%
+[TEST] Confusion Matrix:
+ [[113 23]
+ [ 30 106]]
+
+[TEST] Classification Report:
+ precision recall f1-score support
+ benign(0) 0.80 0.82 0.81 136
+malignant(1) 0.81 0.79 0.80 136
+ accuracy 0.81 272
+ macro avg 0.81 0.81 0.81 272
+weighted avg 0.81 0.81 0.81 272
+
+[INFO] Saved confusion_matrix.png to: ./images
+```
+
+
+
+## Discussion and Future Work
+
+The Siamese encoder successfully learned a discriminative embedding space, as reflected by the steadily decreasing triplet loss during training.
+However, the validation loss showed noticeable oscillation, suggesting that the triplet sampling strategy may not consistently produce informative anchor–positive–negative pairs.
+While the classifier achieved stable convergence and balanced performance (precision and recall ≈ 0.8), the overall accuracy plateaued around 81–82%, indicating that generalization to unseen samples remains limited.
+
+Several factors may explain these observations:
+- The dataset exhibits **class imbalance** and **intra-class variability**, which can make triplet formation unstable.
+- The **triplet margin** and **sampling strategy** were fixed throughout training, potentially limiting the diversity of hard examples.
+
+**Future Work**
+- Implement **hard or semi-hard negative mining** to improve triplet selection and reduce validation fluctuation (see the sketch below).
+- Explore **alternative metric learning losses** (e.g., ArcFace, Contrastive Loss) to enhance inter-class margins and improve embedding quality.
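+
+As one illustration of the mining idea (a hypothetical sketch, not part of this repository): given anchor/positive embeddings and a pool of candidate negatives, select per anchor a *semi-hard* negative, i.e. one farther than the positive but still inside the margin.
+
+```python
+import torch
+
+def semi_hard_negative(za, zp, zn_pool, margin=1.0):
+    """Per anchor, pick a negative with d(a,p) < d(a,n) < d(a,p) + margin;
+    fall back to the hardest (closest) negative if none qualifies."""
+    d_ap = (za - zp).norm(dim=1, keepdim=True)    # (B, 1)
+    d_an = torch.cdist(za, zn_pool)               # (B, N)
+    semi = (d_an > d_ap) & (d_an < d_ap + margin)
+    masked = torch.where(semi, d_an, torch.full_like(d_an, float("inf")))
+    idx = torch.where(semi.any(dim=1), masked.argmin(dim=1), d_an.argmin(dim=1))
+    return zn_pool[idx]                           # (B, D)
+```
+
+The returned negatives would replace the randomly sampled `neg` batch before the existing `nn.TripletMarginLoss` call in `train_siamese`.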
+
+
+## References
+
+1. **ISIC 2020 Challenge Dataset** – *SIIM-ISIC Melanoma Classification* (Kaggle):
+ https://www.kaggle.com/datasets/nischaydnk/isic-2020-jpg-256x256-resized/data
+
+2. **Triplet Margin Loss (PyTorch Documentation)** –
+ https://pytorch.org/docs/stable/generated/torch.nn.TripletMarginLoss.html
+
+3. **CrossEntropy Loss (PyTorch Documentation)** –
+ https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
+
+4. **G. Koch, R. Zemel, R. Salakhutdinov et al.**,
+ *Siamese Neural Networks for One-Shot Image Recognition*,
+ in *ICML Deep Learning Workshop*, 2015.
diff --git a/recognition/siamese/dataset.py b/recognition/siamese/dataset.py
new file mode 100644
index 000000000..1410122ec
--- /dev/null
+++ b/recognition/siamese/dataset.py
@@ -0,0 +1,314 @@
+# ISIC 2020 (preprocessed, 256x256) dataset utils
+# Provides:
+# - ISICTable: load and split metadata table
+# - ISICImageDataset: standard image dataset for classification
+# - ISICTripletDataset: triplet dataset for siamese training
+# - get_loaders: prepare DataLoader objects for training and evaluation
+# Author: s4778251
+
+import os
+import random
+from pathlib import Path
+import pandas as pd
+from PIL import Image
+from sklearn.model_selection import StratifiedShuffleSplit, GroupShuffleSplit
+import torch
+from torch.utils.data import Dataset, DataLoader
+import torchvision.transforms as T
+
+from params import (
+ DATAPATH, CSV_NAME, IMG_DIR, SEED,
+ TRAIN_FRAC, VAL_FRAC, TEST_FRAC, USE_GROUP_SPLIT,
+ BATCH_TRIPLET, BATCH_CLASSIF, NUM_WORKERS,
+ MEAN, STD, IMAGE_SIZE, ROT_DEG, FLIP_PROB, COLOR_JITTER
+)
+
+
+class ISICTable:
+ """Handle ISIC2020 metadata table loading, cleaning, and splitting."""
+
+ def __init__(self, root: str, csv_name: str = CSV_NAME, image_dir: str = IMG_DIR):
+ """Load and preprocess ISIC metadata table.
+
+ Args:
+ root (str): Root directory containing CSV and image folder.
+ csv_name (str): Name of the CSV file with metadata.
+ image_dir (str): Subdirectory containing images.
+ """
+ self.root = Path(root)
+ df = pd.read_csv(self.root / csv_name)
+
+ # Remove unnamed index column if present (common artifact from CSV export)
+ if df.columns[0].lower().startswith("unnamed"):
+ df = df.drop(columns=[df.columns[0]])
+
+ # Normalize column names and keep only relevant ones
+ df.columns = [c.strip().lower() for c in df.columns]
+ df = df[["isic_id", "patient_id", "target"]]
+
+ # Construct image file paths
+ img_dir_path = self.root / image_dir / "image"
+ df["filepath"] = df["isic_id"].astype(str).apply(lambda x: str(img_dir_path / f"{x}.jpg"))
+
+ # Keep only existing image files
+ df = df[df["filepath"].apply(os.path.exists)].reset_index(drop=True)
+ df["target"] = df["target"].astype(int)
+
+ if len(df) == 0:
+ raise RuntimeError(f"No .jpg images found in {img_dir_path}.")
+ self.df = df
+ print(f"[INFO] Loaded {len(df)} samples from {csv_name}")
+
+
+ def _split_no_group(self, train: float, val: float, seed: int):
+ """Perform stratified split without grouping by patient IDs.
+
+ Args:
+ train (float): Training set fraction.
+ val (float): Validation set fraction.
+ seed (int): Random seed.
+
+ Returns:
+ tuple(pd.DataFrame): (train_df, val_df, test_df)
+ """
+ y = self.df["target"].values
+
+ # First split into train and (val+test)
+ sss = StratifiedShuffleSplit(n_splits=1, train_size=train, random_state=seed)
+ train_idx, temp_idx = next(sss.split(self.df, y))
+ temp = self.df.iloc[temp_idx]
+ y_temp = temp["target"].values
+
+ # Split remaining into validation and test
+ sss2 = StratifiedShuffleSplit(n_splits=1, train_size=val / (1.0 - train), random_state=seed)
+ val_rel, test_rel = next(sss2.split(temp, y_temp))
+ val_idx = temp.index[val_rel]
+ test_idx = temp.index[test_rel]
+ return (
+ self.df.loc[train_idx].reset_index(drop=True),
+ self.df.loc[val_idx].reset_index(drop=True),
+ self.df.loc[test_idx].reset_index(drop=True),
+ )
+
+
+ def _split_with_group(self, train: float, val: float, seed: int):
+ """Perform group-aware split by patient IDs.
+
+ Args:
+ train (float): Training set fraction.
+ val (float): Validation set fraction.
+ seed (int): Random seed.
+
+ Returns:
+ tuple(pd.DataFrame): (train_df, val_df, test_df)
+ """
+ y = self.df["target"].values
+ groups = self.df["patient_id"].astype(str).values
+
+ # Train / temp split using group-level shuffle
+ gss = GroupShuffleSplit(n_splits=1, train_size=train, random_state=seed)
+ train_idx, temp_idx = next(gss.split(self.df, y, groups))
+ temp = self.df.iloc[temp_idx]
+ y_temp = temp["target"].values
+ groups_temp = temp["patient_id"].astype(str).values
+
+ # Split remaining into validation and test
+ gss2 = GroupShuffleSplit(n_splits=1, train_size=val / (1.0 - train), random_state=seed)
+ val_rel, test_rel = next(gss2.split(temp, y_temp, groups_temp))
+ val_idx = temp.index[val_rel]
+ test_idx = temp.index[test_rel]
+ return (
+ self.df.loc[train_idx].reset_index(drop=True),
+ self.df.loc[val_idx].reset_index(drop=True),
+ self.df.loc[test_idx].reset_index(drop=True),
+ )
+
+
+ def split(self, train=TRAIN_FRAC, val=VAL_FRAC, test=TEST_FRAC,
+ use_group: bool = USE_GROUP_SPLIT, seed: int = SEED):
+ """Split the dataset into train, validation, and test sets.
+
+ Args:
+ train (float): Fraction for training set.
+ val (float): Fraction for validation set.
+ test (float): Fraction for test set.
+ use_group (bool): Whether to use group-aware splitting.
+ seed (int): Random seed.
+
+ Returns:
+ tuple(pd.DataFrame): (train_df, val_df, test_df)
+ """
+ assert abs(train + val + test - 1.0) < 1e-6 # sanity check
+ if use_group and "patient_id" in self.df.columns:
+ return self._split_with_group(train, val, seed)
+ return self._split_no_group(train, val, seed)
+
+
+ @staticmethod
+ def balance_1to1(df: pd.DataFrame, seed: int = SEED) -> pd.DataFrame:
+ """Balance dataset to a 1:1 ratio between positive and negative samples.
+
+ Args:
+ df (pd.DataFrame): Input dataframe with 'target' column.
+ seed (int): Random seed.
+
+ Returns:
+ pd.DataFrame: Balanced dataframe.
+ """
+ pos = df[df["target"] == 1]
+ neg = df[df["target"] == 0]
+ if len(pos) == 0 or len(neg) == 0:
+ return df.reset_index(drop=True)
+ if len(pos) < len(neg):
+ neg = neg.sample(n=len(pos), random_state=seed)
+ else:
+ pos = pos.sample(n=len(neg), random_state=seed)
+ out = pd.concat([pos, neg]).sample(frac=1.0, random_state=seed)
+ return out.reset_index(drop=True)
+
+
+class ISICImageDataset(Dataset):
+ """Torch dataset for standard classification mode."""
+
+ def __init__(self, df: pd.DataFrame, transform=None):
+ self.df = df.reset_index(drop=True)
+ self.tfm = transform
+
+ def __len__(self) -> int:
+ """Return number of samples."""
+ return len(self.df)
+
+ def __getitem__(self, i: int):
+ """Load and transform the i-th sample.
+
+ Args:
+ i (int): Sample index.
+
+ Returns:
+ tuple(torch.Tensor, int, int): (image, label, index)
+ """
+ row = self.df.iloc[i]
+ img = Image.open(row["filepath"]).convert("RGB")
+ if self.tfm:
+ img = self.tfm(img)
+ label = int(row["target"])
+ return img, label, i
+
+
+class ISICTripletDataset(Dataset):
+ """Torch dataset for triplet generation (anchor, positive, negative)."""
+
+ def __init__(self, df: pd.DataFrame, transform=None, seed: int = SEED):
+ self.df = df.reset_index(drop=True)
+ self.tfm = transform
+
+ # Index samples by class for easy positive/negative sampling
+ self.by_cls = {
+ 0: self.df[self.df["target"] == 0].index.tolist(),
+ 1: self.df[self.df["target"] == 1].index.tolist(),
+ }
+ random.seed(seed)
+
+ def __len__(self) -> int:
+ return len(self.df)
+
+ def _load(self, idx: int):
+ """Load one image by its index and apply transforms if defined."""
+ path = self.df.iloc[idx]["filepath"]
+ img = Image.open(path).convert("RGB")
+ return self.tfm(img) if self.tfm else img
+
+ def __getitem__(self, i: int):
+ """Return a triplet (anchor, positive, negative, label)."""
+ anc_row = self.df.iloc[i]
+ y = int(anc_row["target"])
+
+ # Pick a positive sample from same class (not itself)
+ same = [j for j in self.by_cls[y] if j != i]
+ pos_idx = random.choice(same) if same else i
+
+ # Pick a negative sample from opposite class
+ neg_idx = random.choice(self.by_cls[1 - y])
+ anc = self._load(i)
+ pos = self._load(pos_idx)
+ neg = self._load(neg_idx)
+ return anc, pos, neg, y
+
+
+def build_transforms(image_size: int = IMAGE_SIZE):
+ """Create image transformations for training and evaluation."""
+ train_tfm = T.Compose([
+ T.RandomHorizontalFlip(p=FLIP_PROB),
+ T.RandomVerticalFlip(p=FLIP_PROB),
+ T.RandomRotation(degrees=ROT_DEG),
+ T.ColorJitter(**COLOR_JITTER),
+ T.ToTensor(),
+ T.Normalize(mean=MEAN, std=STD),
+ ])
+ eval_tfm = T.Compose([
+ T.ToTensor(),
+ T.Normalize(mean=MEAN, std=STD),
+ ])
+ return train_tfm, eval_tfm
+
+
+def get_loaders(
+ dataroot: str = DATAPATH,
+ balance_each_split: bool = True,
+ use_group_split: bool = USE_GROUP_SPLIT,
+ batch_triplet: int = BATCH_TRIPLET,
+ batch_classif: int = BATCH_CLASSIF,
+ num_workers: int = NUM_WORKERS,
+):
+ """Build dataloaders for Siamese and classification training.
+
+ Args:
+ dataroot (str): Root dataset directory.
+ balance_each_split (bool): Whether to balance classes in each split.
+ use_group_split (bool): Whether to use patient-based group splitting.
+ batch_triplet (int): Batch size for triplet dataloader.
+ batch_classif (int): Batch size for classification dataloader.
+ num_workers (int): Number of parallel data-loading workers.
+
+ Returns:
+ dict[str, torch.utils.data.DataLoader]: Dictionary of dataloaders.
+ """
+ table = ISICTable(dataroot, CSV_NAME, IMG_DIR)
+ tr_df, va_df, te_df = table.split(train=TRAIN_FRAC, val=VAL_FRAC, test=TEST_FRAC,
+ use_group=use_group_split, seed=SEED)
+ if balance_each_split:
+ tr_df = ISICTable.balance_1to1(tr_df, seed=SEED)
+ va_df = ISICTable.balance_1to1(va_df, seed=SEED)
+ te_df = ISICTable.balance_1to1(te_df, seed=SEED)
+
+ tfm_train, tfm_eval = build_transforms(image_size=IMAGE_SIZE)
+
+ # Datasets for Siamese training
+ ds_triplet = ISICTripletDataset(tr_df, transform=tfm_train, seed=SEED)
+ dl_triplet = DataLoader(ds_triplet, batch_size=batch_triplet, shuffle=True,
+ num_workers=num_workers, pin_memory=True, drop_last=True)
+
+ ds_val_triplet = ISICTripletDataset(va_df, transform=tfm_eval, seed=SEED)
+ dl_val_triplet = DataLoader(ds_val_triplet, batch_size=batch_triplet, shuffle=False,
+ num_workers=num_workers, pin_memory=True, drop_last=False)
+
+ # Datasets for classification
+ ds_tr_cls = ISICImageDataset(tr_df, transform=tfm_train)
+ ds_va_cls = ISICImageDataset(va_df, transform=tfm_eval)
+ ds_te_cls = ISICImageDataset(te_df, transform=tfm_eval)
+
+ dl_tr_cls = DataLoader(ds_tr_cls, batch_size=batch_classif, shuffle=True,
+ num_workers=num_workers, pin_memory=True)
+ dl_va_cls = DataLoader(ds_va_cls, batch_size=batch_classif, shuffle=False,
+ num_workers=num_workers, pin_memory=True)
+ dl_te_cls = DataLoader(ds_te_cls, batch_size=batch_classif, shuffle=False,
+ num_workers=num_workers, pin_memory=True)
+
+ return {
+ "triplet_train": dl_triplet,
+ "triplet_val": dl_val_triplet,
+ "classif_train": dl_tr_cls,
+ "classif_val": dl_va_cls,
+ "classif_test": dl_te_cls,
+ }
diff --git a/recognition/siamese/images/Siamese Network.webp b/recognition/siamese/images/Siamese Network.webp
new file mode 100644
index 000000000..59f72428e
Binary files /dev/null and b/recognition/siamese/images/Siamese Network.webp differ
diff --git a/recognition/siamese/images/classifier_loss.png b/recognition/siamese/images/classifier_loss.png
new file mode 100644
index 000000000..9fc9a946e
Binary files /dev/null and b/recognition/siamese/images/classifier_loss.png differ
diff --git a/recognition/siamese/images/confusion_matrix.png b/recognition/siamese/images/confusion_matrix.png
new file mode 100644
index 000000000..7f2eaa7dd
Binary files /dev/null and b/recognition/siamese/images/confusion_matrix.png differ
diff --git a/recognition/siamese/images/input_sample.png b/recognition/siamese/images/input_sample.png
new file mode 100644
index 000000000..97a5ee1a8
Binary files /dev/null and b/recognition/siamese/images/input_sample.png differ
diff --git a/recognition/siamese/images/siamese_loss.png b/recognition/siamese/images/siamese_loss.png
new file mode 100644
index 000000000..041cde148
Binary files /dev/null and b/recognition/siamese/images/siamese_loss.png differ
diff --git a/recognition/siamese/modules.py b/recognition/siamese/modules.py
new file mode 100644
index 000000000..eeffaaf60
--- /dev/null
+++ b/recognition/siamese/modules.py
@@ -0,0 +1,100 @@
+# modules.py
+# Siamese network modules: encoder and classifier.
+# Author: s4778251
+
+
+
+import torch
+import torch.nn as nn
+import torchvision.models as models
+from params import OUT_DIM, HIDDEN_DIMS, NEGATIVE_SLOPE, DROPOUT_P
+
+
+class SiameseEncoder(nn.Module):
+ """Feature extraction network for Siamese training.
+
+ This encoder uses a ResNet-50 backbone pretrained on ImageNet and projects the
+ resulting feature vector into a normalized embedding space. It is typically
+ trained using triplet loss to ensure semantically similar samples are close
+ together while dissimilar ones are farther apart.
+ """
+
+ def __init__(self, out_dim=OUT_DIM, pretrained=True):
+ """
+ Args:
+ out_dim (int): Dimensionality of the output embedding vector.
+ pretrained (bool): Whether to initialize the ResNet-50 backbone with
+ ImageNet-pretrained weights.
+ """
+ super().__init__()
+
+ # Load the ResNet50 backbone and remove its final classification layer
+ base = models.resnet50(weights=models.ResNet50_Weights.DEFAULT if pretrained else None)
+ feat_dim = base.fc.in_features
+ base.fc = nn.Identity()
+ self.backbone = base
+
+ # Linear projection to the target embedding dimension
+ self.proj = nn.Linear(feat_dim, out_dim)
+
+ def forward(self, x):
+ """Forward pass through the encoder.
+
+ Args:
+ x (torch.Tensor): Input batch of images with shape (B, 3, H, W).
+
+ Returns:
+ torch.Tensor: L2-normalized embeddings of shape (B, out_dim).
+ """
+ feat = self.backbone(x) # Extract features via ResNet
+ emb = self.proj(feat) # Project into embedding space
+ emb = nn.functional.normalize(emb, p=2, dim=1) # Normalize to unit length
+ return emb
+
+
+class BinaryClassifier(nn.Module):
+ """Four-layer MLP classifier for binary prediction.
+
+ This model takes precomputed embeddings (e.g., from SiameseEncoder) and maps
+ them through a sequence of fully connected layers with LeakyReLU activation
+ and dropout regularization. The output layer produces two logits for binary
+ classification.
+ """
+
+ def __init__(self, in_dim=OUT_DIM, hidden=HIDDEN_DIMS, num_classes=2,
+ negative_slope=NEGATIVE_SLOPE, p=DROPOUT_P):
+ """
+ Args:
+ in_dim (int): Input feature dimension (should match encoder output).
+ hidden (tuple[int]): Sizes of hidden layers.
+ num_classes (int): Number of output classes (2 for binary tasks).
+ negative_slope (float): Slope for LeakyReLU activation.
+ p (float): Dropout probability for regularization.
+ """
+ super().__init__()
+ layers = []
+ last = in_dim
+
+ # Build MLP layers dynamically from hidden size sequence
+ for h in hidden:
+ layers += [
+ nn.Linear(last, h),
+ nn.LeakyReLU(negative_slope=negative_slope, inplace=True),
+ nn.Dropout(p)
+ ]
+ last = h
+
+ # Final classification layer without activation
+ layers += [nn.Linear(last, num_classes)]
+ self.net = nn.Sequential(*layers)
+
+ def forward(self, x):
+ """Forward pass through the classifier.
+
+ Args:
+ x (torch.Tensor): Input feature batch (B, in_dim).
+
+ Returns:
+ torch.Tensor: Output logits (B, num_classes).
+ """
+ return self.net(x)
diff --git a/recognition/siamese/params.py b/recognition/siamese/params.py
new file mode 100644
index 000000000..3816b3646
--- /dev/null
+++ b/recognition/siamese/params.py
@@ -0,0 +1,55 @@
+# params.py
+# Configuration parameters for Siamese network training and evaluation.
+# Author: s4778251
+
+# Dataset and Path Settings
+DATAPATH = "./dataset"
+CSV_NAME = "train-metadata.csv"
+IMG_DIR = "train-image"
+
+MODELPATH = "./models"
+IMAGEPATH = "./images"
+
+# Data Split and Loader
+SEED = 42 # one true number!
+TRAIN_FRAC, VAL_FRAC, TEST_FRAC = 0.7, 0.1, 0.2 # Dataset split ratios
+USE_GROUP_SPLIT = True # Whether to use patient-based group split
+
+BATCH_TRIPLET = 64 # Batch size for Siamese training (triplet loss)
+BATCH_CLASSIF = 64 # Batch size for classifier training
+NUM_WORKERS = 4 # Number of worker threads for data loading
+
+# Image Preprocessing
+MEAN = [0.5, 0.5, 0.5]
+STD = [0.5, 0.5, 0.5]
+IMAGE_SIZE = 256 # Image resize dimension
+ROT_DEG = 15 # Max rotation degree for data augmentation
+FLIP_PROB = 0.5 # Probability of horizontal/vertical flip
+COLOR_JITTER = dict( # color jitter parameters
+ brightness=0.1, contrast=0.1, saturation=0.05, hue=0.02
+)
+
+# Model Settings
+OUT_DIM = 512 # Output embedding dimension of Siamese encoder
+HIDDEN_DIMS = (256, 64) # Hidden layer dimensions for classifier MLP
+NEGATIVE_SLOPE = 0.01 # LeakyReLU slope for classifier
+DROPOUT_P = 0.4 # Dropout probability for classifier layers
+
+# Training Hyperparameters
+TRIPLET_MARGIN = 1.0 # Margin for triplet loss
+EPOCHS_SIAMESE = 100 # Max epochs for Siamese encoder training
+EPOCHS_CLS = 80 # Max epochs for classifier training
+LR_SIAMESE = 0.0001 # Learning rate for Siamese encoder
+LR_CLS = 0.0005 # Learning rate for classifier
+
+# Early Stop / Scheduler
+PATIENCE = 5 # Early stopping patience in epochs
+MIN_DELTA = 0.001 # Minimum improvement threshold for validation loss
+SCHED_FACTOR = 0.5
+SCHED_PATIENCE = 3
+
+# Output Filenames
+SAVE_SAMPLE_NAME = "input_sample.png" # Example image filename
+SIAMESE_LOSS_NAME = "siamese_loss.png" # Siamese loss plot filename
+CLS_LOSS_NAME = "classifier_loss.png" # Classifier loss plot filename
+CM_NAME = "confusion_matrix.png" # Confusion matrix filename
diff --git a/recognition/siamese/predict.py b/recognition/siamese/predict.py
new file mode 100644
index 000000000..28b30488e
--- /dev/null
+++ b/recognition/siamese/predict.py
@@ -0,0 +1,69 @@
+# predict.py
+# Evaluate trained Siamese encoder + classifier on test set.
+# Author: s4778251
+
+
+import os
+import torch
+from sklearn.metrics import confusion_matrix, classification_report
+from dataset import get_loaders
+from modules import SiameseEncoder, BinaryClassifier
+from utils import plot_confusion_matrix, extract_features
+from params import MODELPATH, IMAGEPATH, OUT_DIM, CM_NAME
+
+
+def main():
+ """Evaluate trained Siamese encoder + classifier on the test set.
+
+ This script:
+ 1. Loads trained model weights from disk.
+ 2. Extracts test embeddings using the Siamese encoder.
+ 3. Applies the trained binary classifier to predict classes.
+ 4. Computes accuracy, confusion matrix, and classification report.
+ 5. Saves the confusion matrix as an image.
+
+ The results are printed to the console and saved in IMAGEPATH.
+ """
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ print("Device:", device)
+
+ # Load dataloaders for evaluation
+ loaders = get_loaders()
+ cls_te = loaders["classif_test"]
+
+ # Initialize models
+ encoder = SiameseEncoder(out_dim=OUT_DIM).to(device)
+ clf = BinaryClassifier(in_dim=OUT_DIM).to(device)
+
+ # Load pretrained weights
+ encoder.load_state_dict(torch.load(os.path.join(MODELPATH, "siamese.pth"), map_location=device))
+ clf.load_state_dict(torch.load(os.path.join(MODELPATH, "classifier.pth"), map_location=device))
+
+ print("[INFO] Extracting test features...")
+
+ # Extract embeddings and labels from test set
+ Xte, yte = extract_features(encoder, cls_te, device)
+
+ # Predict class logits using classifier
+ clf.eval()
+ with torch.no_grad():
+ preds = clf(Xte.to(device)).argmax(1).cpu()
+
+ # Compute evaluation metrics
+ acc = (preds == yte).float().mean().item()
+ cm = confusion_matrix(yte.numpy(), preds.numpy())
+
+ print(f"[TEST] Accuracy: {acc*100:.2f}%")
+ print("[TEST] Confusion Matrix:\n", cm)
+ print("\n[TEST] Classification Report:\n",
+ classification_report(yte.numpy(), preds.numpy(),
+ target_names=["benign(0)", "malignant(1)"]))
+
+ # Save confusion matrix plot
+ plot_confusion_matrix(cm, classes=["Benign", "Malignant"],
+ save_path=os.path.join(IMAGEPATH, CM_NAME))
+ print(f"[INFO] Saved {CM_NAME} to: {IMAGEPATH}")
+
+
+if __name__ == "__main__":
+ main()
diff --git a/recognition/siamese/train.py b/recognition/siamese/train.py
new file mode 100644
index 000000000..3b322fa72
--- /dev/null
+++ b/recognition/siamese/train.py
@@ -0,0 +1,212 @@
+# train.py
+# Train Siamese encoder + binary classifier on ISIC dataset.
+# Author: s4778251
+
+
+from params import (
+ MODELPATH, IMAGEPATH, TRIPLET_MARGIN,
+ LR_SIAMESE, LR_CLS, EPOCHS_SIAMESE, EPOCHS_CLS,
+ PATIENCE, MIN_DELTA, SCHED_FACTOR, SCHED_PATIENCE,
+ SIAMESE_LOSS_NAME, CLS_LOSS_NAME, SAVE_SAMPLE_NAME, OUT_DIM
+)
+from dataset import get_loaders
+from modules import SiameseEncoder, BinaryClassifier
+import os
+import torch
+import torch.nn as nn
+from utils import ensure_dir, plot_lines, save_sample_input, extract_features
+
+
+def train_siamese(encoder, train_loader, val_loader, device):
+ """Train the Siamese encoder using triplet loss.
+
+ Args:
+ encoder (torch.nn.Module): Siamese feature encoder model.
+ train_loader (DataLoader): Training dataloader providing triplets.
+ val_loader (DataLoader): Validation dataloader providing triplets.
+ device (str): Device to perform computation ("cuda" or "cpu").
+
+ Returns:
+ tuple[list[float], list[float]]:
+ Two lists of per-epoch training and validation losses.
+ """
+ encoder.train()
+ opt = torch.optim.Adam(encoder.parameters(), lr=LR_SIAMESE, betas=(0.9, 0.999))
+ criterion = nn.TripletMarginLoss(margin=TRIPLET_MARGIN, p=2)
+
+ tr_hist, va_hist = [], []
+ best_val = float('inf')
+ waited = 0 # patience counter for early stopping
+
+ for epoch in range(EPOCHS_SIAMESE):
+ encoder.train()
+ train_sum = 0.0
+ # Iterate through triplets (anchor, positive, negative, label)
+ for anc, pos, neg, _ in train_loader:
+ anc, pos, neg = anc.to(device), pos.to(device), neg.to(device)
+ za, zp, zn = encoder(anc), encoder(pos), encoder(neg)
+ loss = criterion(za, zp, zn)
+ opt.zero_grad()
+ loss.backward()
+ opt.step()
+ train_sum += loss.item()
+
+ avg_tr = train_sum / max(1, len(train_loader))
+
+ # Validation loop (no gradient computation)
+ encoder.eval()
+ val_sum = 0.0
+ with torch.no_grad():
+ for anc, pos, neg, _ in val_loader:
+ anc, pos, neg = anc.to(device), pos.to(device), neg.to(device)
+ za, zp, zn = encoder(anc), encoder(pos), encoder(neg)
+ vloss = criterion(za, zp, zn)
+ val_sum += vloss.item()
+
+ avg_va = val_sum / max(1, len(val_loader))
+ tr_hist.append(avg_tr)
+ va_hist.append(avg_va)
+
+ print(f"[Siamese] Epoch {epoch+1}/{EPOCHS_SIAMESE} train_loss={avg_tr:.4f} val_loss={avg_va:.4f}")
+
+ # Early stopping logic
+ if avg_va < best_val - MIN_DELTA:
+ best_val = avg_va
+ waited = 0
+ else:
+ waited += 1
+ if waited >= PATIENCE:
+ print(f"[Siamese] Early stopping at epoch {epoch+1}")
+ break
+
+ # Save trained encoder weights
+ torch.save(encoder.state_dict(), os.path.join(MODELPATH, "siamese.pth"))
+ print("[INFO] Saved final Siamese encoder (stopped model).")
+ return tr_hist, va_hist
+
+
+def train_classifier(clf, train_data, val_data, device):
+ """Train the binary classifier on precomputed embeddings.
+
+ Args:
+ clf (torch.nn.Module): Binary classification MLP model.
+ train_data (tuple[Tensor, Tensor]): Training embeddings and labels.
+ val_data (tuple[Tensor, Tensor]): Validation embeddings and labels.
+ device (str): Device ("cuda" or "cpu").
+
+ Returns:
+ tuple[list[float], list[float], list[float]]:
+ Training losses, validation losses, and validation accuracies.
+ """
+ opt = torch.optim.Adam(clf.parameters(), lr=LR_CLS, betas=(0.9, 0.999), weight_decay=5e-4)
+ scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, mode='min', factor=SCHED_FACTOR, patience=SCHED_PATIENCE)
+ criterion = nn.CrossEntropyLoss()
+
+ Xtr, ytr = train_data
+ Xva, yva = val_data
+
+ tr_hist, va_hist, va_acc_hist = [], [], []
+ best_val = float('inf')
+ waited = 0
+
+ for epoch in range(EPOCHS_CLS):
+ clf.train()
+ # Shuffle embeddings before each epoch
+ idx = torch.randperm(len(Xtr))
+ Xb, yb = Xtr[idx].to(device), ytr[idx].to(device)
+ logits = clf(Xb)
+ loss = criterion(logits, yb)
+ opt.zero_grad()
+ loss.backward()
+ opt.step()
+ train_loss = loss.item()
+
+ # Validation phase
+ clf.eval()
+ with torch.no_grad():
+ v_logits = clf(Xva.to(device))
+ val_loss = criterion(v_logits, yva.to(device)).item()
+ val_acc = (v_logits.argmax(1).cpu() == yva).float().mean().item()
+
+ scheduler.step(val_loss)
+ tr_hist.append(train_loss)
+ va_hist.append(val_loss)
+ va_acc_hist.append(val_acc)
+
+ print(f"[CLS] Epoch {epoch+1}/{EPOCHS_CLS} train_loss={train_loss:.4f} val_loss={val_loss:.4f} val_acc={val_acc*100:.2f}%")
+
+ # Early stopping
+ if val_loss < best_val - MIN_DELTA:
+ best_val = val_loss
+ waited = 0
+ else:
+ waited += 1
+ if waited >= PATIENCE:
+ print(f"[CLS] Early stopping at epoch {epoch+1}")
+ break
+
+ # Save classifier weights
+ torch.save(clf.state_dict(), os.path.join(MODELPATH, "classifier.pth"))
+ print("[INFO] Saved final classifier (stopped model).")
+ return tr_hist, va_hist, va_acc_hist
+
+
+def main():
+ """Main training routine for Siamese + classification stages.
+
+ This pipeline:
+ 1. Loads dataset loaders for triplet and classification tasks.
+ 2. Trains the Siamese encoder using triplet loss.
+ 3. Extracts embeddings from the encoder for classification training.
+ 4. Trains the binary classifier using cross-entropy loss.
+ 5. Plots and saves both loss curves and one sample input image.
+ """
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ print("Device:", device)
+
+ # Prepare data loaders
+ loaders = get_loaders()
+ tri_train = loaders["triplet_train"]
+ tri_val = loaders["triplet_val"]
+ cls_tr = loaders["classif_train"]
+ cls_va = loaders["classif_val"]
+
+ # Ensure output directories exist
+ ensure_dir(MODELPATH)
+ ensure_dir(IMAGEPATH)
+
+ # Train Siamese Encoder
+ encoder = SiameseEncoder(out_dim=OUT_DIM).to(device)
+ siam_tr_hist, siam_va_hist = train_siamese(encoder, tri_train, tri_val, device)
+
+ # Plot Siamese loss curve
+ xs = list(range(1, len(siam_tr_hist) + 1))
+ plot_lines(xs, [siam_tr_hist, siam_va_hist], ["Training", "Validation"],
+ title="Loss of the Siamese Network",
+ xlabel="Epochs", ylabel="Triplet Loss",
+ save_path=os.path.join(IMAGEPATH, SIAMESE_LOSS_NAME))
+
+ # Extract Embeddings
+ print("[INFO] Extracting embeddings...")
+ encoder.eval()
+ Xtr, ytr = extract_features(encoder, cls_tr, device)
+ Xva, yva = extract_features(encoder, cls_va, device)
+
+ # Train Classifier
+ clf = BinaryClassifier(in_dim=OUT_DIM).to(device)
+ cls_tr_hist, cls_va_hist, _ = train_classifier(clf, (Xtr, ytr), (Xva, yva), device)
+
+ # Plot classifier loss curve
+ xs = list(range(1, len(cls_tr_hist) + 1))
+ plot_lines(xs, [cls_tr_hist, cls_va_hist], ["Training", "Validation"],
+ title="Loss of the Binary Classifier",
+ xlabel="Epochs", ylabel="CrossEntropy Loss",
+ save_path=os.path.join(IMAGEPATH, CLS_LOSS_NAME))
+
+ # Save a sample input image for reference
+ save_sample_input(loaders["classif_train"], IMAGEPATH, filename=SAVE_SAMPLE_NAME)
+ print(f"[INFO] Training finished. All results saved to {IMAGEPATH}")
+
+
+if __name__ == "__main__":
+ main()
diff --git a/recognition/siamese/utils.py b/recognition/siamese/utils.py
new file mode 100644
index 000000000..ea06b987d
--- /dev/null
+++ b/recognition/siamese/utils.py
@@ -0,0 +1,151 @@
+# utils.py
+# Utility functions for Siamese network training and evaluation.
+# Includes directory management, plotting, sample saving, and feature extraction.
+# Author: s4778251
+
+import os
+import torch
+import matplotlib.pyplot as plt
+import numpy as np
+import torchvision
+from params import MEAN, STD, SAVE_SAMPLE_NAME
+
+
+def ensure_dir(path):
+ """Create a directory if it does not already exist.
+
+ Args:
+ path (str | None): Directory path to create. If None or empty, nothing is created.
+
+ Notes:
+ This is a safe helper that mirrors `mkdir -p` behavior. It never raises if the
+ directory already exists and does nothing for falsy paths.
+ """
+ if path and not os.path.exists(path):
+ os.makedirs(path, exist_ok=True)
+
+
+def plot_lines(xs, ys_list, labels, title, xlabel, ylabel, save_path):
+ """Plot one or more lines and save the figure to disk.
+
+ Args:
+ xs (Sequence[float | int]): Common x-axis values for all lines.
+ ys_list (Sequence[Sequence[float]]): A list of y-value sequences, one per line.
+ labels (Sequence[str]): Legend labels corresponding to each sequence in ys_list.
+ title (str): Figure title.
+ xlabel (str): X-axis label.
+ ylabel (str): Y-axis label.
+ save_path (str): File path where the plot image will be saved.
+
+ Behavior:
+ Creates a new figure, draws each line in order, adds a legend if labels are given,
+ tightens the layout, ensures the parent directory exists, saves the image, and
+ finally closes the figure to free memory.
+ """
+ plt.figure()
+ for ys, lb in zip(ys_list, labels):
+ plt.plot(xs, ys, label=lb) # one line per series
+ if labels:
+ plt.legend()
+ plt.title(title)
+ plt.xlabel(xlabel)
+ plt.ylabel(ylabel)
+ plt.tight_layout()
+ ensure_dir(os.path.dirname(save_path))
+ plt.savefig(save_path, dpi=200)
+ plt.close() # avoid accumulating open figures
+
+
+def plot_confusion_matrix(cm, classes, save_path):
+ """Visualize and save a confusion matrix as an image.
+
+ Args:
+ cm (np.ndarray): Square confusion matrix of integer counts with shape (C, C).
+ classes (Sequence[str]): Class names for tick labels, length must be C.
+ save_path (str): File path where the plot image will be saved.
+
+ """
+ plt.figure()
+ plt.imshow(cm, interpolation='nearest', aspect='auto')
+ plt.title('Confusion Matrix')
+ plt.colorbar()
+ tick_marks = np.arange(len(classes))
+ plt.xticks(tick_marks, classes, rotation=45)
+ plt.yticks(tick_marks, classes)
+ thresh = cm.max() / 2.0 if cm.size else 0
+ for i in range(cm.shape[0]):
+ for j in range(cm.shape[1]):
+ plt.text(
+ j,
+ i,
+ format(cm[i, j], 'd'),
+ ha="center",
+ va="center",
+ color="white" if cm[i, j] > thresh else "black",
+ )
+ plt.ylabel('True label')
+ plt.xlabel('Predicted label')
+ plt.tight_layout()
+ ensure_dir(os.path.dirname(save_path))
+ plt.savefig(save_path, dpi=200)
+ plt.close()
+
+
+def save_sample_input(dataloader, save_dir, filename=SAVE_SAMPLE_NAME):
+ """Save a single example image from a dataloader to disk for quick inspection.
+
+ Args:
+ dataloader (torch.utils.data.DataLoader): A dataloader that yields (image, label, index).
+ save_dir (str): Directory where the image should be stored.
+ filename (str): File name for the saved image.
+
+ Behavior:
+ Takes the first batch, inverts the normalization using the configured mean and std,
+ converts the first image to HWC layout, clips to the valid range, and saves it as a PNG.
+ """
+ ensure_dir(save_dir)
+ sample_img, sample_label, _ = next(iter(dataloader))
+ img = sample_img[0]
+ # Build an "inverse" normalization to undo the standardization for visualization.
+ inv_norm = torchvision.transforms.Normalize(
+ mean=[-m / s for m, s in zip(MEAN, STD)],
+ std=[1 / s for s in STD],
+ )
+ img_show = inv_norm(img).permute(1, 2, 0).clamp(0, 1)
+ plt.imshow(img_show)
+ plt.title(f"Sample Input (Label: {sample_label[0].item()})")
+ plt.axis("off")
+ path = os.path.join(save_dir, filename)
+ plt.savefig(path, bbox_inches="tight", dpi=200)
+ plt.close()
+
+
+@torch.no_grad()
+def extract_features(encoder, loader, device):
+ """Run a feature encoder over a dataset and collect embeddings and labels.
+
+ Args:
+ encoder (torch.nn.Module): Model that maps images to embedding vectors.
+ loader (torch.utils.data.DataLoader): Dataloader yielding (image, label, index).
+ device (str | torch.device): Device spec to run the encoder on, e.g. "cuda" or "cpu".
+
+ Returns:
+ Tuple[torch.Tensor, torch.Tensor]:
+ A pair (X, y) where X has shape (N, D) of embeddings and y has shape (N,) of labels.
+
+ Notes:
+ The function prints progress every 10 batches and at completion. Gradients are disabled
+ via the torch.no_grad decorator to reduce memory use and increase throughput.
+ """
+ encoder.eval() # ensure batchnorm/dropout layers are in eval mode
+ xs, ys = [], []
+ total = len(loader)
+ for i, (xb, yb, _) in enumerate(loader):
+ feats = encoder(xb.to(device)).cpu() # move inputs to device, bring features back to CPU
+ xs.append(feats)
+ ys.append(yb)
+ if (i + 1) % 10 == 0 or (i + 1) == total:
+ pct = 100.0 * (i + 1) / total
+ print(f"\r[Extract] {pct:5.1f}% complete", end="")
+ print()
+ return torch.cat(xs), torch.cat(ys)