jac0be commented Nov 13, 2025

FLAN-T5 + LoRA Layperson Radiology Summarisation

This PR adds my implementation for Project 13: fine-tuning FLAN-T5-base with LoRA to translate expert radiology reports into layperson-friendly summaries using the BioLaySumm 2025 dataset.
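As a rough illustration of why LoRA keeps fine-tuning cheap, the sketch below compares the trainable parameters of one full weight matrix with those of its low-rank adapter. The 768×768 shape matches the hidden size of T5-base-sized models; the rank `r = 8` and the helper name `lora_param_counts` are illustrative assumptions, not values taken from this PR.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates every entry of the d_out x d_in matrix.
    LoRA freezes it and trains two small factors instead:
    A (rank x d_in) and B (d_out x rank).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Assumed example: one 768 x 768 attention projection, rank 8 (hypothetical).
full, lora = lora_param_counts(768, 768, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

The actual rank and the set of adapted layers are whatever modules.py configures; this only shows the arithmetic.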

What's included

  • modules.py — FLAN-T5 + LoRA model setup
  • dataset.py — dataset loader with optional 90:10 train/test split
  • train.py — full training loop, logging, ROUGE evaluation and model checkpointing
  • predict.py — inference script for arbitrary reports
  • eval.py — builds CSVs and plots from the training logs
  • README.md — detailed description of the approach, results, metrics, and error analysis
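For reviewers who want a picture of the 90:10 split mentioned for dataset.py, here is a minimal sketch of a seeded, reproducible split. The function name `split_90_10` and the `seed` parameter are hypothetical; the real loader may split differently (for example via the dataset library's own utilities).

```python
import random


def split_90_10(items, seed=0):
    """Shuffle indices with a fixed seed and split them 90:10 (hypothetical helper)."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # local RNG, so global state is untouched
    cut = (len(indices) * 9) // 10        # integer arithmetic avoids float rounding
    train = [items[i] for i in indices[:cut]]
    test = [items[i] for i in indices[cut:]]
    return train, test


# Usage with placeholder records standing in for (report, summary) pairs.
records = [f"report-{i}" for i in range(100)]
train, test = split_90_10(records, seed=42)
print(len(train), len(test))
```

Fixing the seed keeps the held-out examples identical across runs, so evaluation numbers stay comparable.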

Notes

  • All code runs on a single GPU (tested on an RTX 5070 Ti).
  • Final model achieves strong ROUGE-Lsum performance on the validation set.
  • README includes dependency list, usage instructions, plots, and error analysis examples.
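To make the ROUGE metric concrete, the sketch below computes a plain ROUGE-L F1 score from the longest common subsequence (LCS) of whitespace tokens. This is a simplification: ROUGE-Lsum additionally splits summaries into sentences, and a standard implementation also applies its own tokenisation, so the numbers will not match those reported by train.py. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 on lowercased, whitespace-split tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Usage with made-up sentences (not taken from the dataset).
print(rouge_l_f1("the lungs are clear", "the lungs look clear"))
```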

Please let me know if there are any changes you'd prefer. I'm more than happy to make them.

jac0be and others added 28 commits November 5, 2025 18:11
…rently no training scripts are implemented.
…as other training details to train_report.json in run dir.
Loss and rouge histories get saved to json files as training runs, in case we need to recreate the plots. Also minor comment clean ups.
…n scripts for dataset.py and predict.py which acts as a chat bot.
@jac0be jac0be marked this pull request as ready for review November 13, 2025 13:16
@jac0be jac0be changed the title from "Flan T5 Fine-Tuning on BioSummDataset" to "Flan T5 Fine-Tuning on BioSummDataset S45893623" Nov 13, 2025
@gayanku gayanku added the _LLM T5, FLAN-T5, GPT-2 label Nov 23, 2025
gayanku (Collaborator) commented Nov 24, 2025

Marking

Good/OK/Fair Practice (Design/Commenting, TF/Torch Usage)

  • Good design and implementation.
  • Spacing and comments.
  • No header blocks. -1

Recognition Problem

  • Good solution to problem.
  • Driver script present.
  • File structure present.
  • Good usage, demo, visualisation and data usage.
  • Module present.
  • Commenting present.
  • No data leakage found.
  • Difficulty: Hard. Hard difficulty: LLM

Commit Log

  • Good, meaningful commit messages.
  • Good progressive commits.

Documentation

  • Readme: Good.
  • Model/technical explanation: Good.
  • Description and comments: Good.
  • Markdown used. PDF NOT submitted. -2

Pull Request

  • Successful pull request (working algorithm delivered on time in correct branch).
  • No feedback required.
  • Request description is good.

TOTAL: -3

Marked as of the due date; changes made after it are not necessarily allowed to contribute to the grade, for fairness.
Subject to approval from Shakes.
