FLAN-T5 LoRA — Lay Radiology Summarization (BioLaySumm Subtask 2.1) - 47451773 #270

TPGCIG · 2025-11-02T16:55:13Z

Author: Tristan Green (s4745177)
Target branch: topic-recognition
Project folder: recognition/Project13-TristanGreen

Summary

This PR contributes a parameter-efficient fine-tuning of FLAN-T5-base with LoRA for layperson summarization of radiology reports. It includes:

Reproducible training (train.py), plus evaluation and generation (predict.py and chat.py).
Modular components (modules.py) and documented configuration.
Plots + metrics (loss curve, validation ROUGE trajectory, test bar chart).
A comprehensive README.md.
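To give a rough sense of what "parameter-efficient" means here, the sketch below counts the weights a LoRA adapter would add. It assumes the standard T5-base geometry (hidden size 768, 12 encoder and 12 decoder blocks), rank r = 8 as stated in the README, and adapters on the query and value projections only. The target modules are an assumption for illustration and are not confirmed by this PR.

```python
def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed T5-base geometry; the adapted projections are square (768 x 768).
D_MODEL = 768
ENCODER_ATTN = 12      # one self-attention module per encoder block
DECODER_ATTN = 12 * 2  # self-attention + cross-attention per decoder block
ADAPTED_PER_ATTN = 2   # ASSUMPTION: only the q and v projections get adapters
RANK = 8               # rank stated in the README

n_matrices = (ENCODER_ATTN + DECODER_ATTN) * ADAPTED_PER_ATTN
total = n_matrices * lora_added_params(D_MODEL, D_MODEL, RANK)

print(f"adapted matrices: {n_matrices}")   # 72
print(f"added parameters: {total}")        # 884736
print(f"fp32 size: {total * 4 / 2**20:.2f} MiB")
```

Under these assumptions the adapter adds well under one million trainable weights, which is why only a few megabytes need to be saved per checkpoint. A different choice of target modules or rank changes the count proportionally.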

The algorithm trains on the provided training set, validates on a held-out split for model selection, and reports ROUGE-1/2/L/Lsum on a held-out test set.
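As an illustration of the metric used for model selection, here is a minimal sketch of ROUGE-1 F1 using plain whitespace tokens. It is a simplification: standard ROUGE implementations also normalise and optionally stem tokens, and this PR's scripts may compute the score differently. The function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 3 shared tokens, 6 candidate tokens, 4 reference tokens.
score = rouge1_f1("the lungs are clear", "the lungs look clear and normal")
print(round(score, 3))  # precision 0.5, recall 0.75 -> F1 0.6
```

Model selection then amounts to keeping the checkpoint whose average validation score is highest, and reporting test scores only once for that checkpoint.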

Recognition problem & difficulty

Task: Expert→lay summarisation (long-form seq2seq, automatic evaluation via ROUGE)

Setup and Run Script

Setup and usage are described in detail in README.md. To run training with the default parameters:

python train.py

Thank you,
Tristan (s4745177)

Claire1217 · 2025-11-20T09:54:08Z

Recognition Problem : total : 17
Solves problem: Overall good work. Some problems are listed here: The files in 5) outputs don't align with what you actually have in the repo. Why use a chat interface? What's the difference between chat and inference? Why did you use LoRA? How much space is used? Label masking to -100 is what's usually done with the GPT-2 tokenizer, not the T5 tokenizer. "r=8 is a common sweet spot for T5‑base": how do you know? Have you tried different r? (3)
Implementation functions: The code seems to be functional (2)
Good design: good (1)
Commenting: Well commented (1)
Difficulty: Hard (10)

gayanku · 2025-11-24T11:19:43Z

Marking

Good/OK/Fair Practice (Design/Commenting, TF/Torch Usage)
	Good design and implementation.
	Spacing and comments.
	No Header blocks.	-1
Recognition Problem
	OK solution to problem.	-1
	Driver Script present.
	File structure present.
	Good Usage & Demo & Visualisation & Data usage.
	Module present.
	Commenting present.
	No Data leakage found.
	Difficulty : Hard. Hard Difficulty : LLM
Commit Log
	Good Meaningful commit messages.
	Good Progressive commits.
Documentation
	Readme :Acceptable.	-2
	Model/technical explanation :Good.
	Description and Comments :Good.
	Markdown used and PDF submitted.
Pull Request
	Successful Pull Request (Working Algorithm Delivered on Time in Correct Branch).
	No Feedback required.
	Request Description is good.
TOTAL		-4

Marked as of the due date; for fairness, changes made after it aren't necessarily allowed to contribute to the grade.
Subject to approval from Shakes

TPGCIG · 2025-11-24T11:22:57Z

I just want to add that I had an approved extension, which I emailed to Shakes. All elements of the PR were submitted within the extension period.

TPGCIG and others added 30 commits October 10, 2025 14:32

Implemented dataset with a summariser to connect expert-layman terms and finished make_datasets

66a0e8b

added training data and rouge values to README.md

8c5ba34

Train working for seq2seq on the flan-t5 model. Currently lacks scheduler and warmup but produces results.

9e42f22

Implemented tokenizer and pretrained flan-t5 from HF in load_base_model.

cf9bc45

Implemented basic predict.py with ability to use weights to parse jargon and summarise.

cd5a99c

Implemented text generation helper fully, finalised modules for now.

74ff2a3

Finished prediction to support single and multi-line json file inputs.

a8cffee

Implemented a chatbot functionality for testing the predict.py and weights

fc0117a

added gitignore

1c6f8e3

Edited README to add title

6b9515b

README: added headings for topic but yet to fill.

6c25f29

work on README. added overview, motivation, features and codebase layout.

1062c94

added requirements.txt for training model.

3b713ee

uploaded icon image for readme

b57e1f7

moved brain image

592e0e0

Remove duplicate image from Project folder

3dfc029

renamed image file braint5.png

9f22ed2

Update README with project details and logo

3792b1c

Added training usage to README

db5d0a2

fixed small typo in README

3c177a0

Added table of contents and touched up README

4e35368

added sample usage to README.md

3f03b7d

added images for T5 explanation

3e30894

added lots of detail to README

a2286c1

removed a command line argument from train.py which was unnecessary

5c106cf

Merge branch 'topic-recognition' of github.com:TPGCIG/PatternAnalysis-2025 into topic-recognition

cf72462

added command line arg option to choose which weights you want to use.

db25202

robustness changes and editing tool usage a bit as some functions have been updated.

b04e1fb

truncating a few parameter options that are unintuitive and arent critical to functionality, makes use of tool easier.

d7e9192

small changes to align functionality of database tools with new train.py parameters.

d5fbddf

TPGCIG added 12 commits November 2, 2025 19:01

added training data and graphs for README.md

3edc346

added dataset section to README

54f254d

added explanation for ROUGE scores and removed table of contents since it is unnecessary

9c218ff

fixed some unnecessary comments to improve clarity

bd9a93b

testing with another computer, fixed requirements and installation steps.

aec136f

added instructions on how to use chat.py

239dbfb

added comment headers to each file to clarify the functionality

7b3ea6b

small edits discussing data split functionalities

7f4bd7d

added information on how to use venv in installation

4363c5d

added comments around codebase to improve clarity to different functions and major loops

61bef03

slight edits to remove unnecessary command line args from README and train

032ce58

typo in train.py, added in justification for parameters in README

22171a7

hanemma7moud added _Hard _LLM T5, FLAN-T5, GPT-2 labels Nov 10, 2025

wangzhaomxy added the Uploaded_PDF label Nov 24, 2025

gayanku added the SECOND_MARK label Nov 24, 2025
