shakes76 · ChiaJouLu · Oct 28, 2025 · Oct 29, 2025 · Oct 29, 2025 · Oct 29, 2025
diff --git a/recognition/UNet_Prostate_47222610/README.md b/recognition/UNet_Prostate_47222610/README.md
@@ -0,0 +1,238 @@
+# 2D Prostate Segmentation using Improved UNet on HipMRI Dataset
+
+**Project**: #3 - HipMRI 2D Segmentation with Improved UNet  
+
+This project implements an **Improved UNet** architecture for automated prostate segmentation from MRI images using the HipMRI Study dataset. The goal is to achieve a Dice similarity coefficient of ≥ 0.75 on the prostate label (Class 3) in the test set.
+
+## Problem Description
+
+Medical image segmentation is crucial for radiotherapy planning in prostate cancer. This project segments four anatomical regions from 2D magnetic resonance imaging (MRI) slices:
+- Class 0: Background
+- Class 1: Body 
+- Class 2: Bone
+- Class 3: **Prostate (Primary Target)**
+
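+The Improved UNet extends the original UNet with residual blocks, instance normalization, Leaky ReLU activations, a context aggregation module at the bottleneck, and deep supervision in the decoder.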
+
+## Model Architecture
+
+### Improved UNet vs Standard UNet
+The Improved UNet differs from the original UNet in the following ways:
+
+**Key Improvements:**
+1. **Deeper Network**: The Improved UNet has 5 encoder/decoder levels, compared with 4 in the standard UNet.
+2. **Residual Connections**: Skip connections using residual blocks for better gradient flow.
+3. **Instance Normalization**: More stable than batch normalization for small batch sizes.
+4. **Leaky ReLU**: Keeps a small non-zero gradient for negative inputs (negative slope alpha = 0.01), which avoids the "dying ReLU" problem of units that output zero and stop learning.
+5. **Deep Supervision**: Additional loss at intermediate decoder layers.
+6. **Context Module**: Additional context aggregation at bottleneck.
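To make item 4 concrete, here is a minimal NumPy sketch of Leaky ReLU with the negative slope of 0.01 quoted above. The function name `leaky_relu` is illustrative and is not taken from `modules.py`.

```python
import numpy as np

def leaky_relu(x: np.ndarray, alpha: float = 0.01) -> np.ndarray:
    """Pass positive values through unchanged; scale the rest by alpha."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
y = leaky_relu(x)  # negative inputs are scaled by 0.01 instead of being zeroed
```

Because negative inputs keep a slope of `alpha`, their gradient is `alpha` rather than 0, so the unit can still be updated during training.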
+
+### Network Architecture:
+```
+Input: (N, 1, 256, 128) - Grayscale MRI images
+
+[Encoder Path - Downsampling]
+    Level 0: ResidualDoubleConv: 1 -> 64 channels (256×128)
+        MaxPool2d(2×2)
+    Level 1: ResidualDoubleConv: 64 -> 128 channels (128×64)
+        MaxPool2d(2×2)
+    Level 2: ResidualDoubleConv: 128 -> 256 channels (64×32)
+        MaxPool2d(2×2)
+    Level 3: ResidualDoubleConv: 256 -> 512 channels (32×16)
+        MaxPool2d(2×2)
+    Level 4 (Bottleneck): ResidualDoubleConv: 512 -> 1024 channels (16×8)
+
+[Context Aggregation Module]
+    Parallel dilated convolutions with rates [1, 2, 4, 8]
+    Receptive fields: 3×3, 7×7, 15×15, 31×31
+    Aggregated multi-scale features (1024 channels)
+
+[Decoder Path with Deep Supervision]
+    Level 3: TransposeConv + Skip + ResidualDoubleConv: 1024 -> 512 (32×16)
+        ├─ Auxiliary Output: DSV4 (512 -> 4 classes)
+
+    Level 2: TransposeConv + Skip + ResidualDoubleConv: 512 -> 256 (64×32)
+        ├─ Auxiliary Output: DSV3 (256 -> 4 classes)
+
+    Level 1: TransposeConv + Skip + ResidualDoubleConv: 256 -> 128 (128×64)
+        ├─ Auxiliary Output: DSV2 (128 -> 4 classes)
+
+    Level 0: TransposeConv + Skip + ResidualDoubleConv: 128 -> 64 (256×128)
+        ├─ Auxiliary Output: DSV1 (64 -> 4 classes)
+
+[Output Layer]
+    1×1 Convolution: 64 -> 4 channels
+    Output: (N, 4, 256, 128) - Class logits
+```
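The channel counts and spatial sizes in the diagram follow from two rules: each level doubles the channel count, and each 2×2 max-pool halves both spatial dimensions. The short sketch below reproduces that arithmetic; `level_shapes` is a hypothetical helper written for illustration, not a function from this repository.

```python
def level_shapes(height=256, width=128, base_channels=64, levels=5):
    """Return (channels, height, width) for each encoder level.

    Assumes every level doubles the channels and every 2x2 max-pool
    halves both spatial dimensions, as in the diagram above.
    """
    shapes = []
    for level in range(levels):
        shapes.append((base_channels * 2 ** level, height >> level, width >> level))
    return shapes

for level, (c, h, w) in enumerate(level_shapes()):
    print(f"Level {level}: {c} channels, {h}x{w}")
```

With four pooling steps the input height and width must both be divisible by 16, which 256×128 satisfies.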
+
+### Key Architectural Components
+
+**1. Residual Blocks**
+- Two 3x3 convolutions with skip connections
+- Enables gradient flow in deep networks
+
+**2. Instance Normalization**
+- Normalizes per sample
+- Does not depend on batch statistics, which makes it more stable than Batch Normalization at small batch sizes
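As a rough illustration of "normalizes per sample", the NumPy sketch below computes instance normalization for a batch of feature maps. It omits any learnable scale and shift, and the name `instance_norm` is an assumption made for this example rather than the repository's implementation.

```python
import numpy as np

def instance_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each (sample, channel) feature map to zero mean, unit variance.

    x has shape (N, C, H, W). Statistics are taken over H and W only,
    so the result for one sample does not depend on the rest of the batch.
    """
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(2, 4, 8, 8))
out = instance_norm(batch)
```

Normalizing the first sample on its own gives the same values as normalizing it inside the batch, which is the property that makes the layer insensitive to batch size.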
+
+**3. Context Aggregation**
+- Parallel dilated convolutions at bottleneck
+- Captures features at multiple scales (3x3 to 31x31)
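The receptive-field figures can be checked with a few lines of arithmetic. A single 3×3 convolution with dilation d covers (2d+1)×(2d+1) input pixels, while the quoted sizes 3×3, 7×7, 15×15, 31×31 are what results when the dilated layers are stacked one after another, as in Yu & Koltun. Which case applies to this model depends on `modules.py`, which is not shown here; the helper names below are hypothetical.

```python
def single_branch_rf(dilation: int, kernel: int = 3) -> int:
    """Receptive field of one dilated convolution applied on its own."""
    return dilation * (kernel - 1) + 1

def stacked_rf(dilations, kernel: int = 3):
    """Cumulative receptive field after each layer of a sequential stack (stride 1)."""
    rf, sizes = 1, []
    for d in dilations:
        rf += d * (kernel - 1)
        sizes.append(rf)
    return sizes

rates = [1, 2, 4, 8]
parallel = [single_branch_rf(d) for d in rates]  # each branch sees the input directly
stacked = stacked_rf(rates)                      # each layer sees the previous layer's output
```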
+
+**4. Deep Supervision**
+- Auxiliary outputs at the 4 decoder levels (DSV1 to DSV4), supervised together with the main output (5 outputs in total)
+- Loss weights: 1.0, 0.8, 0.6, 0.4, 0.2
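One way to read the weighting is as a weighted sum of one cross-entropy term per output. The NumPy sketch below assumes the five weights apply to the main output followed by DSV1 to DSV4, and that every auxiliary output has already been upsampled to the label resolution. Both points are assumptions for illustration, since `train.py` is not shown, and the function names are hypothetical.

```python
import numpy as np

def cross_entropy(logits: np.ndarray, target: np.ndarray) -> float:
    """Mean pixel-wise cross-entropy. logits: (N, C, H, W); target: (N, H, W) int labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = np.take_along_axis(log_probs, target[:, None, :, :], axis=1)
    return float(-picked.mean())

def deep_supervision_loss(outputs, target, weights=(1.0, 0.8, 0.6, 0.4, 0.2)):
    """Weighted sum of per-output losses.

    Assumes `outputs` is ordered [main, DSV1, DSV2, DSV3, DSV4] and that every
    auxiliary output has already been upsampled to the target resolution.
    """
    if len(outputs) != len(weights):
        raise ValueError("need one weight per output")
    return sum(w * cross_entropy(o, target) for w, o in zip(weights, outputs))

target = np.zeros((1, 8, 4), dtype=np.int64)
outputs = [np.zeros((1, 4, 8, 4)) for _ in range(5)]  # uniform logits over 4 classes
loss = deep_supervision_loss(outputs, target)         # each term equals ln(4)
```

With uniform logits every term is ln(4), so the total is ln(4) times the sum of the weights (3.0), a quick sanity check for any implementation of the weighted loss.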
+
+## Dataset
+
+**Source**: HipMRI Study on Prostate Cancer 
+
+**Format**: NIfTI (.nii.gz)
+
+**Data Splits**:
+- Training: 11,460 slices
+- Validation: 660 slices
+- Testing: 540 slices
+
+**Preprocessing**:
+1. Load NIfTI files with nibabel
+2. Resize to 256x128
+3. Z-score normalization: `(img - mean) / std`
+4. Clean invalid labels (≥4 -> class 0)
+5. One-hot encode to 4 classes
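Steps 3 to 5 can be sketched in NumPy as follows. Loading and resizing are left out, and the function names are hypothetical. The one-hot step here always produces 4 channels, so every slice has the same shape even when a class is absent from it; that is a choice made for this sketch and not necessarily how the repository handles it.

```python
import numpy as np

NUM_CLASSES = 4  # background, body, bone, prostate

def preprocess_image(img: np.ndarray) -> np.ndarray:
    """Z-score normalize one slice; the epsilon guards against a constant image."""
    img = img.astype(np.float32)
    return (img - img.mean()) / (img.std() + 1e-8)

def preprocess_label(label: np.ndarray) -> np.ndarray:
    """Map invalid labels (>= NUM_CLASSES) to background, then one-hot encode."""
    label = label.astype(np.int64)
    label = np.where(label >= NUM_CLASSES, 0, label)
    return np.eye(NUM_CLASSES, dtype=np.uint8)[label]  # shape (H, W, NUM_CLASSES)

image = preprocess_image(np.arange(6).reshape(2, 3))
one_hot = preprocess_label(np.array([[0, 1], [3, 5]]))  # the 5 is treated as background
```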
+
+## Dependencies
+```text
+torch>=2.0.0
+numpy>=1.24.0
+nibabel>=5.0.0
+matplotlib>=3.7.0
+opencv-python>=4.7.0
+tqdm>=4.65.0
+```
+
+## Project Structure
+```
+UNet_Prostate_47222610/
+├── README.md                 # This file
+├── dataset.py                # Data loading and preprocessing for MRI slices
+├── modules.py                # Improved UNet architecture
+├── predict.py                # Testing and visualization
+├── train.py                  # Training with deep supervision
+└── Result_Images/            # Visualization results
+    ├── training_curves.png
+    ├── prediction_batch_0.png
+    ├── prediction_batch_1.png
+    ├── prediction_batch_2.png
+    ├── prediction_batch_3.png
+    └── prediction_batch_4.png
+```
+
+## Usage
+
+### Training
+
+```bash
+python train.py
+```
+
+Training parameters are hardcoded: 30 epochs, batch size 16, learning rate 1e-4.
+
+### Testing
+
+```bash
+python predict.py
+```
+
+## Training Environment
+
+- **Platform**: Rangpur HPC (The University of Queensland)
+- **GPU**: NVIDIA A100
+- **Training Time**: ~2 hours for 30 epochs
+
+## Reproducibility
+
+### Training Configuration
+
+- Architecture: Improved UNet (5-level encoder/decoder)
+- Epochs: 30
+- Batch size: 16
+- Learning rate: 1e-4 (Adam optimizer)
+- Weight decay: 1e-5 (L2 regularization)
+- Loss function: CrossEntropyLoss + Deep Supervision
+- Image size: 256×128
+- Number of classes: 4
+
+### File Outputs Summary
+
+After training and evaluation:
+```
+UNet_Prostate_47222610/
+├── improved_unet_best.pth           # Best model 
+├── improved_unet_final.pth          # Final model 
+├── improved_unet_epoch_*.pth        # Checkpoints 
+├── logs/
+│   └── improved_unet_*.out          # Training logs (text)
+└── Result_Images/
+    ├── training_curves.png          # Loss/Dice plots
+    └── prediction_batch_*.png       # Sample predictions
+```
+
+## Results
+
+### Test Set Performance
+| Class | Region | Dice |
+|-------|--------|------|
+| 0 | Background | 0.9881 |
+| 1 | Body | 0.9842 |
+| 2 | Bone | 0.9271 |
+| 3 | **Prostate (Target)** | **0.9552** |
+
+**Project Requirement**: Prostate Dice ≥ 0.75  
+**Achievement**: **0.9552** (0.2052 above the threshold, a 27.4% relative margin)  
+**Status**: PASSED
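For reference, the Dice coefficient for a class is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and ground-truth pixel sets. The sketch below computes it per class from integer label maps. How the reported scores were aggregated (per slice or over the whole test set) is not stated here, so treat this as an illustration of the metric rather than the evaluation code in `predict.py`.

```python
import numpy as np

def dice_per_class(pred: np.ndarray, target: np.ndarray, num_classes: int = 4, eps: float = 1e-8):
    """Dice = 2|P ∩ T| / (|P| + |T|) for each class, from integer label maps."""
    scores = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        intersection = np.logical_and(p, t).sum()
        # eps makes a class that is absent from both maps score 1.0
        scores.append((2.0 * intersection + eps) / (p.sum() + t.sum() + eps))
    return scores

target = np.array([[0, 0, 3, 3],
                   [0, 0, 3, 3]])
pred = np.array([[0, 0, 3, 0],
                 [0, 0, 3, 3]])
scores = dice_per_class(pred, target)
```

In this toy example one prostate pixel is missed, so class 3 scores 2·3 / (3 + 4) = 6/7 and class 0 scores 2·4 / (5 + 4) = 8/9.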
+
+### Visualizations
+
+![Training Curves](Result_Images/training_curves.png)
+
+*Figure 1: Training loss and prostate Dice coefficient over 30 epochs.*
+
+![Sample Predictions](Result_Images/prediction_batch_0.png)
+![Sample Predictions](Result_Images/prediction_batch_1.png)
+![Sample Predictions](Result_Images/prediction_batch_2.png)
+
+*Figure 2: Sample predictions on test set. Left: Input MRI, Center: Ground truth, Right: Model prediction.*
+
+## References
+1. **Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., & Maier-Hein, K. H. (2018)**. "Brain Tumor Segmentation and Radiomics Survival Prediction: Contribution to the BRATS 2017 Challenge." arXiv preprint arXiv:1802.10508.
+
+2. **Ronneberger, O., Fischer, P., & Brox, T. (2015)**. "U-Net: Convolutional Networks for Biomedical Image Segmentation." MICCAI 2015.
+
+3. **Yu, F., & Koltun, V. (2016)**. "Multi-Scale Context Aggregation by Dilated Convolutions." ICLR 2016.
+
+4. **COMP3710 Assignment Specification**. The University of Queensland, 2025.
+
+## Academic Integrity
+- Code written independently following course materials and cited papers
+- AI tools (ChatGPT) were used to assist in understanding and to provide reference material for writing docstrings
+
+## Author
+
+**Student Name**: Chia Jou Lu
+
+**Student ID**: 47222610
+
+**Course**: COMP3710 Pattern Recognition 
+
+**Institution**: The University of Queensland  
+
+**Date**: November 2025
+
+
+
+
+
+
+
+
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_0.png b/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_0.png
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_1.png b/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_1.png
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_2.png b/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_2.png
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_3.png b/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_3.png
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_4.png b/recognition/UNet_Prostate_47222610/Result_Images/prediction_batch_4.png
diff --git a/recognition/UNet_Prostate_47222610/Result_Images/training_curves.png b/recognition/UNet_Prostate_47222610/Result_Images/training_curves.png
diff --git a/recognition/UNet_Prostate_47222610/dataset.py b/recognition/UNet_Prostate_47222610/dataset.py
@@ -0,0 +1,79 @@
+"""
+Dataset Loader for HipMRI Prostate Segmentation
+
+This module provides data loading utilities for the HipMRI prostate MRI dataset
+using ONLY the provided utility functions from the assignment Appendix B.
+
+Author: 47222610
+Date: October 2025
+Assignment: Pattern Recognition Project - 2D Prostate Segmentation
+
+References:
+    - Assignment Appendix B: Provided utility functions
+    - NIfTI file format: https://nifti.nimh.nih.gov/
+    - Nibabel library: https://nipy.org/nibabel/
+"""
+import numpy as np
+import nibabel as nib
+from tqdm import tqdm
+
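+def to_channels(arr: np.ndarray, dtype=np.uint8) -> np.ndarray:
+    """
+    One-hot encode a label array along a new trailing axis.
+
+    The number of output channels equals the number of unique values in arr,
+    and each value is used directly as its channel index, so labels are
+    assumed to be the contiguous integers 0..K-1.
+    """
+    channels = np.unique(arr)
+    res = np.zeros(arr.shape + (len(channels),), dtype=dtype)
+    for c in channels:
+        c = int(c)
+        res[..., c:c+1][arr == c] = 1
+
+    return res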
+
+# load medical image functions
+def load_data_2D(imageNames, normImage=False, categorical=False, 
+                 dtype=np.float32, getAffines=False, early_stop=False):
+    """
+    Load 2D medical images from a list of NIfTI file paths into a single array.
+
+    This function pre-allocates the output array (4D when categorical, for
+    conv2d) to avoid excessive memory usage.
+
+    normImage: bool (z-score normalise each image: (img - mean) / (std + 1e-8))
+    categorical: bool (one-hot encode each image with to_channels)
+    getAffines: bool (also return the list of affine matrices)
+    early_stop: Stop loading prematurely, leaves arrays mostly empty, for quick
+        loading and testing scripts.
+    """
+    affines = []
+
+    # get fixed size
+    num = len(imageNames)
+    first_case = nib.load(imageNames[0]).get_fdata(caching='unchanged')
+    if len(first_case.shape) == 3:
+        first_case = first_case[:, :, 0]
+    if categorical:
+        first_case = to_channels(first_case, dtype=dtype)
+        rows, cols, channels = first_case.shape
+        images = np.zeros((num, rows, cols, channels), dtype=dtype)
+    else:
+        rows, cols = first_case.shape
+        images = np.zeros((num, rows, cols), dtype=dtype)
+
+    for i, inName in enumerate(tqdm(imageNames, desc='Loading images')):
+        niftiImage = nib.load(inName)
+        inImage = niftiImage.get_fdata(caching='unchanged')  
+        affine = niftiImage.affine
+        if len(inImage.shape) == 3:
+            inImage = inImage[:, :, 0]
+        inImage = inImage.astype(dtype)
+        if normImage:
+            inImage = (inImage - inImage.mean()) / (inImage.std() + 1e-8)
+        if categorical:
+            inImage = to_channels(inImage, dtype=dtype)
+            images[i, :, :, :] = inImage
+        else:
+            images[i, :, :] = inImage
+
+        affines.append(affine)
+        if i > 20 and early_stop:
+            break
+
+    if getAffines:
+        return images, affines
+    else:
+        return images