Flan T5 Fine-Tuning on BioSummDataset S45893623 #294

jac0be · 2025-11-13T13:12:49Z

FLAN-T5 + LoRA Layperson Radiology Summarisation

This PR adds my implementation for Project 13: fine-tuning FLAN-T5-base with LoRA to translate expert radiology reports into layperson-friendly summaries using the BioLaySumm 2025 dataset.

What's included

modules.py — FLAN-T5 + LoRA model setup
dataset.py — dataset loader with optional 90:10 train/test split
train.py — full training loop, logging, ROUGE evaluation and model checkpointing
predict.py — inference script for arbitrary reports
eval.py — builds CSVs and plots from the training logs
README.md — detailed description of the approach, results, metrics, and error analysis
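
As a rough illustration of the optional 90:10 split mentioned for dataset.py, the sketch below shuffles indices with a fixed seed and holds out 10% of them. It is not taken from the PR: the function name `split_indices`, the `test_fraction` and `seed` parameters, and the seed value are all assumptions made for this example.

```python
import random


def split_indices(n_items, test_fraction=0.1, seed=42):
    """Return (train_indices, test_indices) for a reproducible hold-out split.

    Hypothetical helper for illustration; the real dataset.py may differ.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    indices = list(range(n_items))
    # A private Random instance keeps the split reproducible without
    # touching the global random state used elsewhere in training.
    random.Random(seed).shuffle(indices)

    n_test = int(round(n_items * test_fraction))
    test_indices = sorted(indices[:n_test])
    train_indices = sorted(indices[n_test:])
    return train_indices, test_indices


train_idx, test_idx = split_indices(100)
print(len(train_idx), len(test_idx))
```

Fixing the seed means the same reports land in the held-out set on every run, so scores from different runs are computed on the same examples.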

Notes

All code runs on a single GPU (tested on an RTX 5070 Ti).
Final model achieves strong ROUGE-Lsum performance on the validation set.
README includes dependency list, usage instructions, plots, and error analysis examples.
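
For readers unfamiliar with the metric, here is a minimal sketch of how a ROUGE-L F1 score can be computed from the longest common subsequence (LCS) of tokens. It is a simplification and not the PR's evaluation code: it uses lowercase whitespace tokenisation with no stemming, and it scores the text as one sequence, whereas ROUGE-Lsum splits summaries into sentences on newlines first. The function names are assumptions for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 on lowercase whitespace tokens (no stemming)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0

    precision = lcs / len(cand_tokens)  # share of candidate tokens in the LCS
    recall = lcs / len(ref_tokens)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


print(rouge_l_f1("the lungs are clear", "the lungs are clear"))
```

Because the LCS preserves word order without requiring adjacency, a summary that keeps the reference's key words in order still scores well even when it drops words in between.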

Please let me know if there are any changes you'd prefer. I'm more than happy to make them.

gayanku · 2025-11-24T00:32:41Z

Marking

Good/OK/Fair Practice (Design/Commenting, TF/Torch Usage)
	Good design and implementation.
	Spacing and comments.
	No Header blocks.	-1
Recognition Problem
	Good solution to problem.
	Driver Script present.
	File structure present.
	Good Usage & Demo & Visualisation & Data usage.
	Module present.
	Commenting present.
	No Data leakage found.
	Difficulty: Hard. Hard Difficulty: LLM
Commit Log
	Good Meaningful commit messages.
	Good Progressive commits.
Documentation
	Readme: Good.
	Model/technical explanation: Good.
	Description and Comments: Good.
	Markdown used. PDF NOT submitted.	-2
Pull Request
	Successful Pull Request (Working Algorithm Delivered on Time in Correct Branch).
	No Feedback required.
	Request Description is good.
TOTAL		-3

Marked as of the due date; for fairness, changes made after it do not necessarily count toward the grade.
Subject to approval from Shakes.

jac0be and others added 28 commits November 5, 2025 18:11

Initial commit.

28d8a15

Retrieved dataset.

bb7d523

Implemented initial dataset handler.

3c6ddaa

Implemented flan-t5-large using a basic health check prompt.

5dc857c

Implemented layperson summary generation on 1st element in the dataset.

a542579

Added assignment structure, changed prompt and model to t5-large. Currently no training scripts are implemented.

1239674

Made a fully functioning train.py. Currently lacks complete rouge evaluation metrics and logging.

210a10b

Fixed repo structure. Also fixed issue in github where two jac0be accounts were attributed to my commits.

df166ca

Added modules.py, which currently just builds and returns the base t5 model.

0d16dec

Changed train.py to use the model generation in modules.py

d1df701

Fixed a bug where moving the model to cuda inside the helper caused device mismatch issues.

cd32402

Added held-out test split and saved ROUGE metrics to runs dir.

7a56000

Removed unused imports.

75467dd

Added logging of: parameter counts, GPU info, training time, as well as other training details to train_report.json in run dir.

06acccb

Added json logging for loss / val metrics. Also CSV/plot outputs.

b02f715

Loss and rouge histories get saved to json files as training runs, in case we need to recreate the plots. Also minor comment clean ups.

Refactored csv + plot generation into a separate helper function.

88b2524

Added a simple predict.py that uses the best saved checkpoint to summarise arbitrary reports.

fafd8c2

Added an interactive chat interface for predict.py

c3f485f

Made eval.py and moved the evaluation to post-training. Also made main scripts for dataset.py and predict.py which acts as a chat bot.

8b51fe3

Optional hold out test-split, used for hyperparameter tuning.

655029e

Added report indexing to predict.py, which allows specifying an index and generating a summary from the dataset.

1dfcc8b

Added more explanatory comments for train.py

5f7943c

Restructured repository in preparation of pull request.

6b6a992

Added a starting report / README.md

9ef1a54

Added a concise background knowledge section to README.

86014f9

Finished the README report.

bd6cdeb

Added requirements.txt. Tested and working on WSL.

0a1e5c4

Final code cleanup ahead of PR.

5eda2b4

jac0be marked this pull request as ready for review November 13, 2025 13:16

jac0be changed the title ~~Flan T5 Fine-Tuning on BioSummDataset~~ Flan T5 Fine-Tuning on BioSummDataset S45893623 Nov 13, 2025

jac0be added 3 commits November 13, 2025 23:32

Corrected image paths in README following folder renaming.

1296b1a

Updated requirements.txt to include rouge score dependencies.

f9b6c40

Minor changes to README.md to tighten reasoning.

8da83c4

gayanku added the _LLM T5, FLAN-T5, GPT-2 label Nov 23, 2025

wangzhaomxy added _Hard No_uploaded_PDF labels Nov 24, 2025

gayanku added the SECOND_MARK label Nov 24, 2025
