@TPGCIG TPGCIG commented Nov 2, 2025

Author: Tristan Green (s4745177)
Target branch: topic-recognition
Project folder: recognition/Project13-TristanGreen

Summary

This PR contributes a parameter-efficient fine-tuning of FLAN-T5-base with LoRA for layperson summarization of radiology reports. It includes:

  • Reproducible training (train.py), plus evaluation and generation (predict.py and chat.py).

  • Modular components (modules.py) and documented configuration.

  • Plots + metrics (loss curve, validation ROUGE trajectory, test bar chart).

  • A comprehensive README.md.
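As a rough illustration of why LoRA is parameter-efficient, the sketch below counts the extra weights the adapters add at different ranks. The shapes are assumptions, not values read from this PR: a hidden size of 768, 12 encoder and 12 decoder blocks, and adapters on the query and value projections only. The function and constant names are hypothetical.

```python
def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed T5-base-like shapes (illustrative, not taken from this PR).
D_MODEL = 768
ENCODER_BLOCKS = 12          # one self-attention module each
DECODER_BLOCKS = 12          # one self-attention and one cross-attention module each
ATTENTION_MODULES = ENCODER_BLOCKS + 2 * DECODER_BLOCKS
ADAPTED_PER_MODULE = 2       # assumption: only query and value projections get adapters


def total_lora_params(r: int) -> int:
    """Total adapter weights across all adapted projections for rank r."""
    per_matrix = lora_extra_params(D_MODEL, D_MODEL, r)
    return per_matrix * ADAPTED_PER_MODULE * ATTENTION_MODULES


if __name__ == "__main__":
    for r in (4, 8, 16):
        n = total_lora_params(r)
        mib = n * 4 / 2**20  # float32 storage
        print(f"r={r}: {n:,} adapter parameters, about {mib:.1f} MiB in float32")
```

Under these assumptions the adapter size grows linearly with r, and r=8 comes to 884,736 parameters (roughly 3.4 MiB in float32). The actual figure depends on which modules the configuration in this project targets.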

The algorithm trains on the provided training set, validates on a held-out split for model selection, and reports ROUGE-1/2/L/Lsum on a held-out test set.
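To make the reported metrics concrete, here is a simplified pure-Python sketch of ROUGE-N and ROUGE-L F1. It assumes lowercased whitespace tokenisation with no stemming, so its scores will not exactly match a standard ROUGE package, and it is not the evaluation code used in this PR. The function names are hypothetical.

```python
from collections import Counter


def _f1(overlap: int, pred_len: int, ref_len: int) -> float:
    """F1 from an overlap count and the two lengths; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / pred_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction: str, reference: str, n: int = 1) -> float:
    """ROUGE-N F1 using clipped n-gram overlap."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    overlap = sum((pred & ref).values())  # multiset intersection clips repeated n-grams
    return _f1(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l(prediction: str, reference: str) -> float:
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    a, b = prediction.lower().split(), reference.lower().split()
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return _f1(prev[-1], len(a), len(b))


if __name__ == "__main__":
    reference = "the lungs are clear with no sign of infection"
    prediction = "the lungs are clear and there is no infection"
    print("ROUGE-1:", round(rouge_n(prediction, reference, 1), 3))
    print("ROUGE-2:", round(rouge_n(prediction, reference, 2), 3))
    print("ROUGE-L:", round(rouge_l(prediction, reference), 3))
```

ROUGE-Lsum applies the same longest-common-subsequence idea sentence by sentence over a multi-sentence summary, which this sketch does not cover.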

Recognition problem & difficulty

Task: Expert→lay summarisation (long-form seq2seq, automatic evaluation via ROUGE)

Setup and Run Script

Setup and run instructions are given in full in README.md. To run training with the default parameters:

python train.py

Thank you,
Tristan (s4745177)

TPGCIG and others added 30 commits October 10, 2025 14:32
…tical to functionality, makes use of tool easier.
@hanemma7moud hanemma7moud added _Hard _LLM T5, FLAN-T5, GPT-2 labels Nov 10, 2025

Claire1217 commented Nov 20, 2025

Recognition Problem: total: 17
Solves problem: Overall good work. Some problems are listed here: The files listed in 5) outputs don't align with what you actually have in the repo. Why use a chat interface? What's the difference between chat and inference? Why did you use LoRA? How much space is used? Label masking to -100 is what's usually done with the GPT-2 tokenizer, not the T5 tokenizer. "r=8 is a common sweet spot for T5‑base": how do you know? Have you tried different values of r? (3)
Implementation functions: The code seems to be functional (2)
Good design: good (1)
Commenting: Well commented (1)
Difficulty: Hard (10)
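For context on the label-masking point: the value -100 comes from the loss function rather than from a particular tokenizer. PyTorch's cross-entropy loss skips targets equal to its ignore_index, which defaults to -100, so padded label positions are commonly set to -100 for seq2seq models too. The pure-Python sketch below mimics that behaviour without any library; the function names are hypothetical and this is not code from the PR.

```python
import math

IGNORE_INDEX = -100  # same value as the default ignore_index of PyTorch's cross-entropy loss


def mask_padding(label_ids, pad_token_id):
    """Replace padding ids in the labels so the loss skips those positions."""
    return [IGNORE_INDEX if t == pad_token_id else t for t in label_ids]


def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # padded position: contributes nothing to the loss
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))  # stable log-sum-exp
        total += log_z - row[label]
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Illustrative ids only; 0 is used here as the padding id.
    labels = mask_padding([2, 1, 0, 0], pad_token_id=0)
    logits = [[0.1, 0.2, 3.0], [0.0, 2.5, 0.3], [5.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
    print(labels)
    print(masked_cross_entropy(logits, labels))
```

Without the masking step, the two padded positions would be scored as if the model had to predict the padding token, which would distort the training loss.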

Collaborator

gayanku commented Nov 24, 2025

Marking

Good/OK/Fair Practice (Design/Commenting, TF/Torch Usage)

  • Good design and implementation.

  • Spacing and comments.

  • No Header blocks. -1

Recognition Problem

  • OK solution to problem. -1

  • Driver Script present.

  • File structure present.

  • Good Usage & Demo & Visualisation & Data usage.

  • Module present.

  • Commenting present.

  • No Data leakage found.

  • Difficulty: Hard. Hard Difficulty: LLM

Commit Log

  • Good Meaningful commit messages.

  • Good Progressive commits.

Documentation

  • Readme: Acceptable. -2

  • Model/technical explanation: Good.

  • Description and Comments: Good.

  • Markdown used and PDF submitted.

Pull Request

  • Successful Pull Request (Working Algorithm Delivered on Time in Correct Branch).

  • No Feedback required.

  • Request Description is good.

TOTAL: -4

Marked as of the due date; for fairness, changes made after it are not necessarily allowed to contribute to the grade.
Subject to approval from Shakes.

Author

TPGCIG commented Nov 24, 2025

I just want to add that I had an approved extension, which I emailed to Shakes. All elements of the PR were submitted within the extended deadline.
