Open
31 commits
ed7c63f
Added project folder Alzheimers_Classifier_s4693608 for Task 8 with i…
CHAR1VAR1 Oct 2, 2025
2eb58b6
Added initial Python files (modules.py, dataset.py, train.py, predict…
CHAR1VAR1 Oct 3, 2025
b4d4f98
Added dataset.py skeleton
CHAR1VAR1 Oct 3, 2025
b745cd8
Implemented image collection and DataLoader helper function
CHAR1VAR1 Oct 3, 2025
04727ad
Tested dataset loader and fixed errors
CHAR1VAR1 Oct 4, 2025
9b06cc5
Added ConvNeXt classifier skeleton
CHAR1VAR1 Oct 4, 2025
77d501a
Implemented ConvNeXt model loading with pretrained weights
CHAR1VAR1 Oct 4, 2025
c1b6d02
Tested forward pass with dummy data and fixed errors
CHAR1VAR1 Oct 4, 2025
defb860
Added train.py skeleton
CHAR1VAR1 Oct 4, 2025
fd25c4e
Implemented training loop
CHAR1VAR1 Oct 4, 2025
c9b8ffe
Added validation loop and accuracy calculation
CHAR1VAR1 Oct 6, 2025
caefe5f
Added main loop
CHAR1VAR1 Oct 6, 2025
4ac5cdd
Tested and fixed errors
CHAR1VAR1 Oct 6, 2025
ebfc444
Added predict.py skeleton
CHAR1VAR1 Oct 6, 2025
4000188
Implemented single image processing for testing
CHAR1VAR1 Oct 6, 2025
c01b060
Implemented evaluation of whole folders of AD/NC images
CHAR1VAR1 Oct 7, 2025
44feed4
Added data augmentation transforms to training set and updated datase…
CHAR1VAR1 Oct 8, 2025
5194892
Added learning rate scheduler
CHAR1VAR1 Oct 8, 2025
9f0f202
Added early stopping mechanism
CHAR1VAR1 Oct 8, 2025
3b4ab7f
Added dropout layer to ConvNeXt head and switched to base ConvNeXt model
CHAR1VAR1 Oct 8, 2025
8a3c32a
Tested and fixed errors
CHAR1VAR1 Oct 8, 2025
014bae4
Optimised training
CHAR1VAR1 Oct 10, 2025
941d09d
Improved training accuracy
CHAR1VAR1 Oct 19, 2025
6697604
Changed to predict overall patient case instead of each individual MR…
CHAR1VAR1 Oct 19, 2025
ed85eee
Fixed typo error
CHAR1VAR1 Oct 19, 2025
85ab6ce
Fixed up docstrings
CHAR1VAR1 Oct 19, 2025
ca45e4b
Removed unused import
CHAR1VAR1 Oct 19, 2025
f44519b
Changed output display formatting
CHAR1VAR1 Oct 19, 2025
a9d2bbb
Fixed up docstrings
CHAR1VAR1 Oct 19, 2025
01c8fc7
Changed output display formatting
CHAR1VAR1 Oct 19, 2025
a7e99ef
Filled out details of project in README
CHAR1VAR1 Oct 19, 2025
111 changes: 111 additions & 0 deletions recognition/Alzheimers_Classifier_s4693608/README.md
@@ -0,0 +1,111 @@
# Alzheimer’s Disease MRI Classifier — COMP3710 Task 8

This project implements a deep learning classifier to distinguish between Alzheimer’s Disease (AD) and Normal Control (NC) using 2D MRI brain slices from the ADNI dataset.
The model is based on ConvNeXt and trained using PyTorch. It makes slice-level predictions, then averages the class probabilities over each patient's slices to produce a patient-level prediction.
The model was trained and tested on UQ's rangpur, achieving a final patient prediction accuracy of 80.22%.

Developed for COMP3710 — Pattern Recognition and Analysis, University of Queensland by Christian Vever - s4693608.

## Dataset Structure

/home/groups/comp3710/ADNI/
├── AD_NC/
│ ├── train/
│ │ ├── AD/
│ │ │ ├── `<patientID>_<slice>.jpeg`
│ │ └── NC/
│ │ ├── `<patientID>_<slice>.jpeg`
│ ├── test/
│ │ ├── AD/
│ │ └── NC/
└── meta_data_with_label.json

All images are greyscale JPEGs of size (256 x 240), and are 2D slices of MRI brain scans from the ADNI dataset.
Filenames follow the `<patientID>_<slice>.jpeg` naming convention.
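
The naming convention above is what makes patient-level grouping possible. A minimal sketch of how a filename could be split and slices grouped per patient is shown below; the helper names `parse_slice_name` and `group_by_patient` and the sample filenames are hypothetical and used only for illustration.

```python
from collections import defaultdict
from pathlib import Path


def parse_slice_name(filename: str) -> tuple[str, str]:
    """Split '<patientID>_<slice>.jpeg' into (patient_id, slice_id)."""
    stem = Path(filename).stem  # drop the '.jpeg' extension
    patient_id, slice_id = stem.split("_", 1)
    return patient_id, slice_id


def group_by_patient(filenames: list[str]) -> dict[str, list[str]]:
    """Map each patient ID to the list of its slice filenames."""
    groups = defaultdict(list)
    for name in filenames:
        patient_id, _ = parse_slice_name(name)
        groups[patient_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    names = ["389298_78.jpeg", "389298_79.jpeg", "400001_12.jpeg"]
    print(group_by_patient(names))
```

Splitting on the first underscore only keeps the patient ID intact even if the slice part were ever to contain an underscore itself.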

## Requirements

### System Requirements

* Python 3.10+
* Parallel processing using GPU (optional but recommended)

### Dependency Requirements

* torch
* torchvision
* timm
* pillow
* numpy

## Training Model Parameters

* Model: ConvNeXt (pretrained on ImageNet)
* Epochs: 30
* Batch Size: 32 (the best-performing of the batch sizes tested)
* Learning Rate: 1e-4
* Scheduler: OneCycleLR (increases the learning rate over the first few epochs before decreasing it)
* Dropout: Enabled in classifier head (set to 0.5; reduces overfitting)
* Loss: Weighted CrossEntropyLoss (weight = [2.0, 1.0] to focus more on AD cases)
* Augmentation: Random horizontal flip, random rotation of +/- 10 degrees, and random resized crop (scale 0.8 to 1.0), to push the classifier towards general features
* Normalisation: mean = 0.1159, std = 0.2199 (the calculated mean and std of the dataset)
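
To illustrate what a class-weighted cross-entropy loss computes, the sketch below re-implements it in plain Python. It is not the code from `train.py` (which is not shown here); it follows the weighted-mean reduction that `torch.nn.CrossEntropyLoss` documents for its `weight` argument, and the logits and targets are made-up values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(batch_logits, targets, class_weights):
    """Weighted mean of per-sample negative log-likelihoods.

    Each sample's loss is scaled by the weight of its true class, and the
    sum is divided by the total weight of the samples in the batch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(batch_logits, targets):
        nll = -math.log(softmax(logits)[target])
        w = class_weights[target]
        weighted_sum += w * nll
        weight_total += w
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Made-up logits for two samples; targets are integer class labels.
    logits = [[2.0, 0.0], [2.0, 0.0]]
    targets = [0, 1]
    print(weighted_cross_entropy(logits, targets, [1.0, 1.0]))
    print(weighted_cross_entropy(logits, targets, [2.0, 1.0]))
```

The weights are indexed by integer class label, so which class receives the weight 2.0 depends on the label mapping. In this change, `dataset.py` maps NC to 0 and AD to 1.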

The script saves the best checkpoint to `best_model.pth`.
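
The one-cycle behaviour listed under Scheduler (warm up, then decay) can be sketched as a plain function of the step index. This is a simplified illustration, not the scheduler configuration from `train.py`: the default arguments (`pct_start`, `div_factor`, `final_div_factor`) and the `max_lr` used in the demo are assumptions, and exact step boundaries may differ from PyTorch's `OneCycleLR`.

```python
import math


def cosine_interpolate(start, end, fraction):
    """Move from start (fraction=0) to end (fraction=1) along a half cosine."""
    return end + (start - end) * (1 + math.cos(math.pi * fraction)) / 2


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at `step` (0-based) for a simplified one-cycle policy."""
    initial_lr = max_lr / div_factor          # starting learning rate
    min_lr = initial_lr / final_div_factor    # learning rate at the last step
    warmup_end = pct_start * total_steps - 1  # step at which max_lr is reached
    last_step = total_steps - 1
    if step <= warmup_end:
        # Warm-up phase: rise from initial_lr to max_lr.
        return cosine_interpolate(initial_lr, max_lr, step / warmup_end)
    # Decay phase: fall from max_lr to min_lr.
    fraction = (step - warmup_end) / (last_step - warmup_end)
    return cosine_interpolate(max_lr, min_lr, fraction)


if __name__ == "__main__":
    # Hypothetical schedule: 100 steps with an assumed peak of 1e-3.
    for step in (0, 15, 29, 60, 99):
        print(step, f"{one_cycle_lr(step, 100, 1e-3):.6f}")
```

The sketch assumes `total_steps` is large enough that the warm-up phase spans at least two steps; very short schedules would need a guard against division by zero.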

## Running Scripts and Outputs

Scripts were run on rangpur using sbatch.
sbatch runner for train.py:
```
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:1
#SBATCH --partition=a100
#SBATCH --job-name=train
#SBATCH --time=04:00:00
#SBATCH -o train.out
#SBATCH -e train.err

conda activate torch
python train.py
```

The train.py script outputs (per epoch):
* The current epoch number (out of total epochs)
* The training loss for this epoch
* The validation loss for this epoch
* The validation accuracy for this epoch
* The current learning rate

Example output:
```
Epoch 7/30 | Train Loss: 0.0914 | Val Loss: 0.6345 | Val Acc: 0.7551 | LR: 0.000447
```
Once all epochs have been run, the script outputs the best validation accuracy.

sbatch runner for predict.py:
```
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:a100
#SBATCH --job-name=predict
#SBATCH -o predict.out
#SBATCH -e predict.err

conda activate torch
python predict.py
```

The predict.py script outputs:
* The patient ID
* The predicted case (AD or NC)
* The true case (AD or NC)
* The confidence of the prediction (the mean probability of the predicted class across the patient's slices)

Example output:
```
Patient 389298: Predicted AD (99.99%) | True: AD
```
Once all patient IDs have been tested, the script outputs the overall patient prediction accuracy.
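
The patient-level aggregation described above can be illustrated without any dependencies. The sketch below mirrors the averaging approach in plain Python; the function names and the sample probabilities are hypothetical, and each slice is assumed to yield a `[NC_prob, AD_prob]` pair.

```python
def aggregate_patient(slice_probs):
    """Average [NC_prob, AD_prob] pairs and pick the more likely class."""
    n = len(slice_probs)
    mean_nc = sum(p[0] for p in slice_probs) / n
    mean_ad = sum(p[1] for p in slice_probs) / n
    label = 1 if mean_ad > mean_nc else 0  # 1 = AD, 0 = NC (ties go to NC)
    confidence = max(mean_nc, mean_ad)
    return label, confidence


def patient_accuracy(predicted, actual):
    """Percentage of patients whose predicted label matches the true label."""
    correct = sum(predicted[pid] == actual[pid] for pid in actual)
    return 100.0 * correct / len(actual)


if __name__ == "__main__":
    # Made-up slice probabilities for one patient.
    label, confidence = aggregate_patient([[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]])
    print(label, round(confidence, 2))
```

Averaging probabilities lets a few confidently classified slices outweigh several borderline ones, whereas a majority vote over slice labels would treat every slice equally.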
85 changes: 85 additions & 0 deletions recognition/Alzheimers_Classifier_s4693608/dataset.py
@@ -0,0 +1,85 @@
import os
import torch
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import torchvision.transforms as transforms

class ADNIDataset(Dataset):
    """
    A PyTorch Dataset for loading 2D MRI image slices from the ADNI dataset,
    organized into Alzheimer's disease (AD) and normal control (NC) categories.

    Args:
        root_dir (str): Path to the AD_NC directory containing "train" and "test".
        split (str): Which dataset split to use, "train" or "test".
        transform (callable, optional): Optional transform to be applied on an image.

    Attributes:
        samples (list): List of tuples (image_path, label) for all samples
            in the specified split.
    """

    def __init__(self, root_dir, split="train", transform=None):
        self.root_dir = os.path.join(root_dir, split)  # path to the requested split folder
        self.transform = transform
        self.samples = []  # (image_path, label)

        # get all (image_path, label) pairs
        for label_name, label in [("AD", 1), ("NC", 0)]:
            class_dir = os.path.join(self.root_dir, label_name)
            for fname in os.listdir(class_dir):
                self.samples.append((os.path.join(class_dir, fname), label))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, label = self.samples[idx]
        # force a single channel so the 1-channel Normalize always applies
        image = Image.open(img_path).convert("L")

        if self.transform:
            image = self.transform(image)

        return image, torch.tensor(label, dtype=torch.long)

def get_dataloaders(root_dir, batch_size=16):
    """
    Create PyTorch DataLoaders for the ADNI dataset.

    This function initializes ADNIDataset instances for the training and
    testing splits, applies preprocessing transforms (resize, tensor conversion,
    normalisation), and returns DataLoaders for batched access.

    Args:
        root_dir (str): Path to the AD_NC directory containing 'train' and 'test'.
        batch_size (int, optional): Number of samples per batch. Default is 16.

    Returns:
        tuple:
            - train_loader (DataLoader): DataLoader for the training set,
              with shuffling enabled.
            - test_loader (DataLoader): DataLoader for the test set,
              with shuffling disabled.
    """
    train_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomRotation(degrees=10),
        transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.1159], std=[0.2199])
    ])

    test_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.1159], std=[0.2199])
    ])

    train_dataset = ADNIDataset(os.path.join(root_dir, "AD_NC"), split="train", transform=train_transform)
    test_dataset = ADNIDataset(os.path.join(root_dir, "AD_NC"), split="test", transform=test_transform)

    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    return train_loader, test_loader
45 changes: 45 additions & 0 deletions recognition/Alzheimers_Classifier_s4693608/modules.py
@@ -0,0 +1,45 @@
import torch
import torch.nn as nn
import timm

class AlzheimersClassifier(nn.Module):
    """
    ConvNeXt-based classifier for Alzheimer's (AD) vs Normal Control (NC).

    This class wraps a pretrained ConvNeXt model from the `timm` library and modifies it to:
    - Accept grayscale input images (1 channel) by duplicating them into 3 channels,
      since ConvNeXt expects RGB input.
    - Replace the final classification head with a dropout layer and a linear layer
      outputting 2 classes (Alzheimer's disease = 1, Normal Control = 0).

    Attributes:
        model : timm.models.ConvNeXt
            The ConvNeXt backbone model with a modified classifier head.

    Methods:
        forward(x: torch.Tensor) -> torch.Tensor
            Performs a forward pass through the network.
            Takes grayscale input of shape [B, 1, H, W], duplicates channels,
            and outputs logits of shape [B, 2].
    """
    def __init__(self, model_name="convnext_base", num_classes=2, pretrained=True, dropout=0.5):
        super().__init__()
        # Load pretrained ConvNeXt backbone
        self.model = timm.create_model(model_name, pretrained=pretrained, num_classes=0)

        # Ensure fixed size feature vector
        self.pool = nn.AdaptiveAvgPool2d(1)

        # Replace classifier with dropout + linear
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(self.model.num_features, num_classes)

    def forward(self, x):
        # duplicate x channels ([B, 1, H, W] -> [B, 3, H, W])
        x = x.repeat(1, 3, 1, 1)
        x = self.model.forward_features(x)
        x = self.pool(x)
        x = torch.flatten(x, 1)
        x = self.dropout(x)
        x = self.fc(x)
        return x
111 changes: 111 additions & 0 deletions recognition/Alzheimers_Classifier_s4693608/predict.py
@@ -0,0 +1,111 @@
import os
import torch
from torchvision import transforms
from PIL import Image
from modules import AlzheimersClassifier
from collections import defaultdict
import numpy as np

def load_model(model_path="best_model.pth"):
    """
    This function initializes an instance of the AlzheimersClassifier model,
    loads the trained weights from the specified checkpoint file, and sets
    the model to evaluation mode on the appropriate device (CPU or GPU).

    Args:
        model_path (str, optional): Path to the saved model checkpoint (.pth file).
            Defaults to "best_model.pth".

    Returns:
        AlzheimersClassifier:
            A ConvNeXt-based PyTorch model ready for inference.
    """
    # Use the GPU when available, otherwise fall back to the CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # The checkpoint overwrites every weight, so skip the ImageNet download
    model = AlzheimersClassifier(pretrained=False)
    model.load_state_dict(torch.load(model_path, map_location=device))
    model.to(device)
    model.eval()
    return model

def predict_slice(image_path, model):
    """
    The function loads a grayscale MRI slice, applies the same preprocessing
    transformations used for the test split (resize, tensor conversion, normalization),
    and performs a forward pass through the trained model to obtain the predicted
    class probabilities.

    Args:
        image_path (str): Path to the input image (e.g., .jpeg slice).
        model (torch.nn.Module): Trained AlzheimersClassifier model instance.

    Returns:
        probs (numpy.ndarray): Array [NC_prob, AD_prob] of class probabilities.
    """
    image = Image.open(image_path)

    # Apply transform to image
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.1159], std=[0.2199])
    ])
    # Run on whichever device the model already lives on
    device = next(model.parameters()).device
    image = transform(image).unsqueeze(0).to(device)

    with torch.no_grad():
        outputs = model(image)
        probs = torch.softmax(outputs, dim=1)[0].cpu().numpy()

    return probs

def aggregate_patient_predictions(patient_probs):
    """
    Given a list of [NC_prob, AD_prob] arrays for one patient,
    average them and return predicted label + confidence.

    Args:
        patient_probs (list): [NC_prob, AD_prob] arrays for all slices of one patient.

    Returns:
        tuple:
            - label (int): Predicted class index (1 for "AD" or 0 for "NC").
            - confidence (float): Model confidence for the predicted class,
              between 0.0 and 1.0.
            - mean_probs (numpy.ndarray): Mean [NC_prob, AD_prob] across the
              aggregated slices for a patient.
    """
    mean_probs = np.mean(patient_probs, axis=0)
    label = np.argmax(mean_probs)
    confidence = mean_probs[label]

    return label, confidence, mean_probs

if __name__ == "__main__":
    model = load_model()
    root_dir = "/home/groups/comp3710/ADNI/AD_NC/test"
    class_names = ["NC", "AD"]

    # Group all slices by patient id
    patient_slices = defaultdict(list)
    for cls in ["AD", "NC"]:
        folder = os.path.join(root_dir, cls)
        for fname in os.listdir(folder):
            patient_id = fname.split('_')[0]
            patient_slices[patient_id].append(os.path.join(folder, fname))

    # Predict each slice and aggregate
    patient_results = {}
    for pid, slice_paths in patient_slices.items():
        slice_probs = [predict_slice(p, model) for p in slice_paths]
        label, conf, mean_probs = aggregate_patient_predictions(slice_probs)
        patient_results[pid] = (label, conf, mean_probs)

    # Evaluate patient accuracy
    correct, total = 0, 0
    for pid, (label, conf, probs) in patient_results.items():
        true_label = 1 if any("AD/" in p for p in patient_slices[pid]) else 0
        total += 1
        correct += int(label == true_label)

        print(f"Patient {pid}: Predicted {class_names[label]} ({conf * 100:.2f}%)"
              f" | True: {class_names[true_label]}")

    print(f"\nPatient Prediction Accuracy: {100 * correct / total:.2f}%")