Skip to content

Model improvements#2

Open
essharom wants to merge 17 commits into
stracquadaniolab:masterfrom
biomedicalinformaticsgroup:model_improvements
Open

Model improvements#2
essharom wants to merge 17 commits into
stracquadaniolab:masterfrom
biomedicalinformaticsgroup:model_improvements

Conversation

@essharom
Copy link
Copy Markdown
Collaborator

No description provided.

essharom and others added 17 commits June 2, 2025 14:33
Updated task type descriptions and example gene/network file paths in README.
- GraphSAGE: remove activation+dropout after output layer (distorted logits)
- GIN: forward() now uses all conv layers dynamically instead of hardcoding 2
- GraphTransformer: fix dimension mismatch for num_layers > 2
- GCNII: fix loop variable shadowing, pass theta through instead of hardcoding None
- Evaluate test metric every epoch for accurate best-metric tracking
- Replace bare except clauses with except Exception
- hyperopt: remove broken sys.path.append, dead code, fix n_jobs default to 1
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces comprehensive "Model improvements" that expand the GNN-Suite pipeline to support multiple task types (binary, multiclass, regression), adds MLflow experiment tracking, implements hyperparameter optimization with Optuna, and improves the overall testing and documentation infrastructure.

Changes:

  • Added support for multiclass classification and regression tasks alongside binary classification
  • Integrated MLflow for experiment tracking with optional configuration
  • Enhanced hyperparameter optimization using YAML-based configuration and Optuna
  • Expanded CI/CD with comprehensive integration tests for all task types
  • Improved documentation with detailed parameter references and examples

Reviewed changes

Copilot reviewed 7 out of 17 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
testdata/sfari_multiclass_genes.csv New test dataset with 1256 genes for multiclass classification
nextflow.config Added parameters for task_type, MLflow integration, num_heads, and plotting control
mlflow_server.sh Script to start MLflow tracking server with SQLite backend
main.nf Enhanced workflow to support multiple task types and dynamic hyperparameter loading
conf/hyperparams.yaml Centralized hyperparameter search space configuration
bin/plot.py Updated to handle task-specific metrics and auto-detect available metrics
bin/models.py Fixed GIN concatenation logic and improved GCNII implementation
bin/load_hyperparams.py New script to load optimized hyperparameters or use defaults
bin/hyperopt.py Refactored with YAML-driven configuration and task type support
bin/gnn.py Major refactoring to support all three task types with appropriate losses and metrics
README_mlflow.md Quick start guide for MLflow integration
README.md Comprehensive documentation updates with parameter tables and examples
.gitignore Added MLflow-related directories
.github/workflows/ci.yml Extended with integration tests for all task types and hyperparameter optimization

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/ci.yml
- name: Integration Test - Multiclass Classification
run: |
nextflow run . -profile test,ci,docker \
--task-type multiclass \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name inconsistency: Use --task_type (with underscore) instead of --task-type (with hyphen) to match the workflow parameter definition in nextflow.config.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml

- name: Integration Test - Regression
run: |
nextflow run . -profile test,ci,docker \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name inconsistency: Use --task_type (with underscore) instead of --task-type (with hyphen) to match the workflow parameter definition in nextflow.config.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
- name: Integration Test - MLflow Enabled
run: |
nextflow run . -profile test,ci,docker \
--task-type binary \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name inconsistency: Use --task_type (with underscore) instead of --task-type (with hyphen) to match the workflow parameter definition in nextflow.config.

Copilot uses AI. Check for mistakes.
Comment thread README.md
Comment on lines +63 to +70
--geneFile testdata/multiclass.csv \
--networkFile testdata/multiclass_network.tsv \
--task_type multiclass

# Regression
nextflow run main.nf -profile docker \
--geneFile testdata/regression.csv \
--networkFile testdata/regression_network.tsv \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect file paths in documentation: The test data files referenced here don't exist. According to the testdata directory listing, the actual multiclass file is sfari_multiclass_genes.csv, not multiclass.csv. Similarly for the network file and regression files. Update these examples to use the correct file paths that exist in the repository.

Suggested change
--geneFile testdata/multiclass.csv \
--networkFile testdata/multiclass_network.tsv \
--task_type multiclass
# Regression
nextflow run main.nf -profile docker \
--geneFile testdata/regression.csv \
--networkFile testdata/regression_network.tsv \
--geneFile testdata/sfari_multiclass_genes.csv \
--networkFile testdata/sfari_multiclass_network.tsv \
--task_type multiclass
# Regression
nextflow run main.nf -profile docker \
--geneFile testdata/sfari_regression_genes.csv \
--networkFile testdata/sfari_regression_network.tsv \

Copilot uses AI. Check for mistakes.
Comment thread README.md
Comment on lines +251 to +258
--geneFile testdata/genes.csv \
--networkFile testdata/multiclass_network.tsv \
--task_type multiclass

# Regression
nextflow run main.nf -profile docker -entry hyperopt \
--geneFile testdata/genes.csv \
--networkFile testdata/regression_network.tsv \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect file paths in documentation: The multiclass and regression examples reference non-existent files. The correct file paths should be testdata/sfari_multiclass_genes.csv with testdata/sfari_multiclass_network.tsv for multiclass, and testdata/sfari_regression_genes.csv with testdata/sfari_regression_network.tsv for regression. Update these examples to match the actual test data files in the repository.

Suggested change
--geneFile testdata/genes.csv \
--networkFile testdata/multiclass_network.tsv \
--task_type multiclass
# Regression
nextflow run main.nf -profile docker -entry hyperopt \
--geneFile testdata/genes.csv \
--networkFile testdata/regression_network.tsv \
--geneFile testdata/sfari_multiclass_genes.csv \
--networkFile testdata/sfari_multiclass_network.tsv \
--task_type multiclass
# Regression
nextflow run main.nf -profile docker -entry hyperopt \
--geneFile testdata/sfari_regression_genes.csv \
--networkFile testdata/sfari_regression_network.tsv \

Copilot uses AI. Check for mistakes.
Comment thread mlflow_server.sh
@@ -0,0 +1,7 @@
# Start MLflow tracking server
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing shebang: Bash scripts should start with #!/bin/bash or #!/usr/bin/env bash to ensure proper execution. Add this as the first line of the file.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
- name: Integration Test - Binary Classification
run: |
nextflow run . -profile test,ci,docker \
--task-type binary \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter name inconsistency: This test uses --task-type (with hyphen) but the workflow definition in nextflow.config defines it as task_type (with underscore). Nextflow parameters should use underscores, not hyphens. Change to --task_type to match the configuration.

Copilot uses AI. Check for mistakes.
Comment thread main.nf
Comment on lines +52 to +75
eval \$(load_hyperparams.py ${model} ${dataSet} ${resultsDir} \
${params.learning_rate} \
${params.weight_decay} \
${params.dropout} \
${params.alpha} \
${params.theta} \
${params.num_heads})

echo "# Using hyperparameters for ${model}: LR=\${LEARNING_RATE}, WD=\${WEIGHT_DECAY}, DROPOUT=\${DROPOUT}, ALPHA=\${ALPHA}, THETA=\${THETA}, NUM_HEADS=\${NUM_HEADS}" >&2

gnn.py ${geneFile} ${networkFile} \
--train-size ${params.train_size} \
--model-name ${model} \
--learning-rate ${params.learning_rate} \
--weight-decay ${params.weight_decay} \
--learning-rate \${LEARNING_RATE} \
--weight-decay \${WEIGHT_DECAY} \
--epochs ${epoch} \
--eval-threshold ${params.eval_threshold} \
--verbose-interval ${params.verbose_interval} \
--dropout ${params.dropout} \
--alpha ${params.alpha} \
--theta ${params.theta} \
> full-${model}-${epoch}-run-${run}.txt

split_data.py full-${model}-${epoch}-run-${run}.txt
--dropout \${DROPOUT} \
--alpha \${ALPHA} \
--theta \${THETA} \
--num-heads \${NUM_HEADS} \
--task-type ${params.task_type} \
> full-${model}-${epoch}-run-${run}-${params.task_type}.txt
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TrainGNN process builds a shell command using eval $(load_hyperparams.py ${model} ${dataSet} ${resultsDir} ...) with unquoted parameters, which allows command injection if any of model, dataSet, or resultsDir (or other CLI params like learning_rate) contain shell metacharacters such as ; or $(...). An attacker who can control these workflow parameters (e.g., via --dataSet, --models, or a crafted --geneFile basename) can execute arbitrary commands in the task environment, leading to full compromise of the worker node or container. Replace eval with a safer mechanism (e.g., parsing the hyperparameter script output without shell evaluation, or sourcing a strictly validated key=value file) and ensure all interpolated parameters passed to shell commands are safely quoted or avoided by using direct process invocation.

Copilot uses AI. Check for mistakes.
Comment thread mlflow_server.sh
Comment on lines +3 to +6
mlflow server \
--backend-store-uri sqlite:///mlflow.db \
--default-artifact-root ./mlflow-artifacts \
--host 0.0.0.0 \
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MLflow tracking server is started bound to 0.0.0.0 over plain HTTP with no authentication, which exposes experiment metadata and artifacts to anyone on the reachable network. In multi-user or shared environments this can allow unauthorized users to read or tamper with experiment data and models. Restrict the bind address to 127.0.0.1 or a trusted interface, and place MLflow behind authenticated, TLS-terminating infrastructure (or enable MLflow’s security features) before using this script in non-local setups.

Suggested change
mlflow server \
--backend-store-uri sqlite:///mlflow.db \
--default-artifact-root ./mlflow-artifacts \
--host 0.0.0.0 \
MLFLOW_HOST="${MLFLOW_HOST:-127.0.0.1}"
mlflow server \
--backend-store-uri sqlite:///mlflow.db \
--default-artifact-root ./mlflow-artifacts \
--host "${MLFLOW_HOST}" \

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants