30 create a yolo training script #31

MadsJJ · 2025-11-14T19:39:59Z

Issue

Closes Create a YOLO training script #30

Proposed changes

Create yolo training script
Create slurm job?

Testing

Trains model locally with the dataset in config file, but not tested slurm script

(cherry picked from commit 56a894e)

…es to requirements (cherry picked from commit 712692d)

(cherry picked from commit 0575b16)

(cherry picked from commit 8ad50bc)

(cherry picked from commit 74fb5bf)

(cherry picked from commit 3319e8c)

(cherry picked from commit 21e82d5)

(cherry picked from commit 13ac688)

…with different names and ids (cherry picked from commit 13e5bd5)

…t downloads. This allows for simple gitignore for all datasets (cherry picked from commit c49673c)

(cherry picked from commit be47cc6)

(cherry picked from commit 14a3951)

for more information, see https://pre-commit.ci

Copilot

Pull Request Overview

This PR introduces a YOLO training script with Roboflow integration to enable automated model training on custom datasets. The implementation includes configuration management, GPU support, and SLURM cluster compatibility for high-performance computing environments.

Adds a complete YOLO training pipeline with Roboflow dataset integration
Provides flexible configuration through YAML files
Includes SLURM job script for cluster-based training

Reviewed Changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
yolo_roboflow_training/train.py	Main training script implementing YOLO model training with Roboflow dataset downloads, device selection, and model export functionality
yolo_roboflow_training/requirements.txt	Python dependencies for the training environment, though contains numerous nonexistent package versions
yolo_roboflow_training/config.yaml	Training configuration file specifying model type, hyperparameters, and export formats
yolo_roboflow_training/Job.slurm	SLURM batch script for running training on GPU cluster with environment setup
requirements.txt	Root-level requirements file with critical encoding corruption making it unparsable
.gitignore	Updates to exclude training outputs, datasets, and Pipenv files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-14T19:45:21Z

yolo_roboflow_training/train.py

+RESULTS_DIR = config.get("results_dir")
+EXPORT_FORMATS = config.get("export_formats")
+DATASET_FORMAT = config.get("dataset_format")
+


Missing input validation for required configuration parameters. If model_type, epochs, patience, imgsz, batch, results_dir, export_formats, or dataset_format are not provided in the config file, they will be None, causing the training to fail with unclear errors. Add validation to ensure these required parameters are present.

Suggested change

# Validate required config parameters

required_params = {

"model_type": MODEL_TYPE,

"epochs": EPOCHS,

"patience": PATIENCE,

"imgsz": IMGSZ,

"batch": BATCH,

"results_dir": RESULTS_DIR,

"export_formats": EXPORT_FORMATS,

"dataset_format": DATASET_FORMAT,

}

missing_params = [k for k, v in required_params.items() if v is None]

if missing_params:

raise ValueError(f"Missing required config parameters: {', '.join(missing_params)}")

Copilot · 2025-11-14T19:45:21Z

yolo_roboflow_training/train.py

+if torch.cuda.is_available():
+    device = 0
+    print(f"Using CUDA GPU: {torch.cuda.get_device_name(0)}")
+elif torch.backends.mps.is_built():


The MPS device availability check is incorrect. torch.backends.mps.is_built() only checks if MPS support is compiled into PyTorch, not if it's actually available. Use torch.backends.mps.is_available() instead to properly detect Apple Silicon GPU availability.

Suggested change

elif torch.backends.mps.is_built():

elif torch.backends.mps.is_available():

Copilot · 2025-11-14T19:45:21Z

yolo_roboflow_training/train.py

+roboflow_data_dir.mkdir(exist_ok=True)
+rf = Roboflow(api_key=ROBOFLOW_API_KEY)
+project = rf.workspace().project(PROJECT_ID)
+versions = project.versions()


The variable versions is assigned but never used. It should either be removed or utilized in the code if dataset version validation is intended.

Suggested change

versions = project.versions()

Copilot · 2025-11-14T19:45:22Z

yolo_roboflow_training/Job.slurm

+#pip install pyyaml # used to read the configuration file
+#pip install blobfile # install blobfile to download the dataset
+#pip install kagglehub # install kagglehub to download the dataset
+pip install --force-reinstall torch -U


Inefficient package installation order. Installing packages from requirements.txt (line 34) followed by force-reinstalling torch (line 38) will reinstall torch and its dependencies, potentially causing version conflicts and wasting time. The requirements.txt already specifies torch versions, so the force-reinstall is likely unnecessary. If a specific torch build is needed, consider removing torch from requirements.txt or using --no-deps with the force-reinstall.

Suggested change

pip install --force-reinstall torch -U

pip install --force-reinstall torch -U --no-deps

Copilot · 2025-11-14T19:45:22Z

yolo_roboflow_training/train.py

+print(f"Validation metrics: {metrics}")
+
+
+# Export the trained model


Missing error handling for the model export step. If EXPORT_FORMATS is None or contains invalid formats, the export loop will fail. Add validation to ensure EXPORT_FORMATS is a non-empty list before attempting export.

Suggested change

# Export the trained model

# Export the trained model

if not isinstance(EXPORT_FORMATS, list) or not EXPORT_FORMATS or not all(isinstance(fmt, str) for fmt in EXPORT_FORMATS):

raise ValueError("EXPORT_FORMATS must be a non-empty list of strings in config.yaml.")

Copilot · 2025-11-14T19:45:22Z

yolo_roboflow_training/Job.slurm

+pip cache purge
+
+# makes sure that the pip is up to date
+python3 -m pip install --upgrade pip


Redundant pip upgrade. The pip upgrade is performed twice: once at line 24 and again at line 31 after activating the virtual environment. Remove the first upgrade at line 24 as the one inside the virtual environment (line 31) is sufficient.

Suggested change

python3 -m pip install --upgrade pip

MadsJJ added 12 commits November 14, 2025 20:29

remove duplicates from requirements.txt

fbbaca7

(cherry picked from commit 56a894e)

add training script for yolo model training with roboflow. add packag…

4f7044a

…es to requirements (cherry picked from commit 712692d)

make training script usable without cuda

999fae9

(cherry picked from commit 0575b16)

add separate requirements file for training script

352304e

(cherry picked from commit 8ad50bc)

upgrde packages in requirements in yolo training

4a979bc

(cherry picked from commit 74fb5bf)

looks like training on instance segmentation might be working

5449502

(cherry picked from commit 3319e8c)

change model type to segmentation model

1bd36bb

(cherry picked from commit 21e82d5)

add worker amount to be half of cpu count

9c3f9c8

(cherry picked from commit 13ac688)

change project_name to project_id to clear up confusion for projects …

48dd531

…with different names and ids (cherry picked from commit 13e5bd5)

change from os paths to Pathlib. Also create parent folder for datase…

67594f9

…t downloads. This allows for simple gitignore for all datasets (cherry picked from commit c49673c)

format

9cef861

(cherry picked from commit be47cc6)

add patience parameter to model

ad78e5a

(cherry picked from commit 14a3951)

MadsJJ requested review from Copilot and kluge7 November 14, 2025 19:39

MadsJJ self-assigned this Nov 14, 2025

MadsJJ added the enhancement New feature or request label Nov 14, 2025

MadsJJ linked an issue Nov 14, 2025 that may be closed by this pull request

Create a YOLO training script #30

Open

[pre-commit.ci] auto fixes from pre-commit.com hooks

9a0a332

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of MadsJJ November 14, 2025 19:40 View session

Copilot finished reviewing on behalf of MadsJJ November 14, 2025 19:44

Copilot AI reviewed Nov 14, 2025

View reviewed changes

MadsJJ mentioned this pull request Nov 14, 2025

24 yolo node for instance segmentation #32

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

30 create a yolo training script #31

30 create a yolo training script #31

Uh oh!

MadsJJ commented Nov 14, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

+# Validate required config parameters
+required_params = {
+    "model_type": MODEL_TYPE,
+    "epochs": EPOCHS,
+    "patience": PATIENCE,
+    "imgsz": IMGSZ,
+    "batch": BATCH,
+    "results_dir": RESULTS_DIR,
+    "export_formats": EXPORT_FORMATS,
+    "dataset_format": DATASET_FORMAT,
+}
+missing_params = [k for k, v in required_params.items() if v is None]
+if missing_params:
+    raise ValueError(f"Missing required config parameters: {', '.join(missing_params)}")

	elif torch.backends.mps.is_built():
	elif torch.backends.mps.is_available():

	pip install --force-reinstall torch -U
	pip install --force-reinstall torch -U --no-deps

		print(f"Validation metrics: {metrics}")


		# Export the trained model

30 create a yolo training script #31

Are you sure you want to change the base?

30 create a yolo training script #31

Uh oh!

Conversation

MadsJJ commented Nov 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue

Proposed changes

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MadsJJ commented Nov 14, 2025 •

edited

Loading