@JoshLoecker

This pull request introduces several improvements and refactors across the codebase, primarily focusing on standardizing data types, improving error handling, updating workflow configurations, and cleaning up unused code. The most significant changes are grouped below.

Core codebase improvements:

  • Added a new module main/como/data_types.py that centralizes and standardizes enums and data classes for configuration, logging, algorithms, and data sources, replacing previous scattered definitions.
  • Refactored error handling in main/como/cluster_rnaseq.py to use the new _log_and_raise_error utility and the LogLevel enum, improving consistency and logging for validation errors.
  • Updated main/como/__init__.py to import and expose new types from data_types.py, removed unused imports and the placeholder function, and improved the public API.
  • Removed the obsolete RNASeqPreparationMethod enum and its associated logic from main/como/custom_types.py, as these are now covered by the new centralized data types.
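
To make the pattern in the bullets above concrete, here is a minimal sketch of centralized types plus a log-then-raise helper. The PR text does not show the real contents of data_types.py or the signature of _log_and_raise_error, so every name below (LogLevel members, SourceType, RunConfig, log_and_raise_error, validate_cluster_count) is a hypothetical stand-in, not COMO's actual API.

```python
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger("como_sketch")  # hypothetical logger name


class LogLevel(Enum):
    """Assumed log levels; values mirror the stdlib logging constants."""

    DEBUG = logging.DEBUG
    INFO = logging.INFO
    WARNING = logging.WARNING
    ERROR = logging.ERROR


class SourceType(str, Enum):
    """Hypothetical data-source enum, shown only to illustrate centralization."""

    RNASEQ = "rnaseq"
    PROTEOMICS = "proteomics"


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical configuration record built from the shared enums."""

    context_name: str
    source: SourceType
    log_level: LogLevel = LogLevel.INFO


def log_and_raise_error(message: str, *, error: type[Exception], level: LogLevel) -> None:
    """Log `message` at `level`, then raise `error(message)`."""
    logger.log(level.value, message)
    raise error(message)


def validate_cluster_count(n_clusters: int) -> int:
    """Example validation that routes failures through the shared helper."""
    if n_clusters < 1:
        log_and_raise_error(
            f"n_clusters must be >= 1, got {n_clusters}",
            error=ValueError,
            level=LogLevel.ERROR,
        )
    return n_clusters
```

Keeping the enum and the helper in one module means every validation site logs and raises the same way, which is the consistency gain the description refers to.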

Workflow and configuration updates:

  • Updated .github/workflows/container_build.yml to only publish Docker images on version tag pushes and switched to ubuntu-latest for builds.
  • Expanded Python version matrix in .github/workflows/continuous_integration.yml to test on 3.10, 3.11, and 3.12, and improved notebook output stripping by using uv tool run.
  • Added target-branch: "hotfix" for GitHub Actions updates in .github/dependabot.yml to direct dependency PRs to the correct branch.

Pre-commit and linting:

  • Switched commit message linting in .pre-commit-config.yaml from commitlint to commitizen, aligning with best practices for conventional commits.

Minor bugfixes and cleanups:

  • Fixed a bug in main/como/knock_out_simulation.py by replacing np.nan with pd.NA for missing values in DataFrames, and made a minor formatting improvement to the CLI help text.
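
As background for the np.nan to pd.NA change, the sketch below shows one general difference between the two markers in pandas. It does not reproduce the code in knock_out_simulation.py, which is not shown in this PR text; the series are made-up illustration data.

```python
import numpy as np
import pandas as pd

# With np.nan, a column of integers is silently upcast to float64.
with_nan = pd.Series([1, 2, np.nan])
print(with_nan.dtype)  # float64

# With pd.NA and a nullable dtype, the integers stay integers.
with_na = pd.Series([1, 2, pd.NA], dtype="Int64")
print(with_na.dtype)  # Int64

# Both markers are detected by isna(), so missing-value checks keep working.
print(with_nan.isna().tolist())  # [False, False, True]
print(with_na.isna().tolist())   # [False, False, True]
```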

Note

This summary was built using GitHub Copilot

Only a single context can be processed at a time; to process more contexts, wrap the merge_xomics call in a for loop
Signed-off-by: Josh Loecker <[email protected]>
The Algorithms class fits better as a data type
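
The commit note above about processing one context at a time could be sketched as follows. The real merge_xomics signature is not shown in this PR text, so merge_one_context is a hypothetical stand-in, and the context names and output path are illustrative assumptions.

```python
from pathlib import Path


def merge_one_context(context_name: str, output_dir: Path) -> Path:
    """Hypothetical stand-in for a single-context merge call."""
    # A real call would read the context's inputs and write merged results;
    # this stub only computes where the output would go.
    return output_dir / f"{context_name}_merged.csv"


def merge_all_contexts(context_names: list[str], output_dir: Path) -> dict[str, Path]:
    """Process several contexts by calling the single-context function in a loop."""
    results: dict[str, Path] = {}
    for name in context_names:
        results[name] = merge_one_context(name, output_dir)
    return results


outputs = merge_all_contexts(["naiveB", "example_context"], Path("results"))
```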
Copilot AI review requested due to automatic review settings September 5, 2025 16:27

Copilot AI left a comment

Pull Request Overview

This pull request introduces extensive refactoring and improvements across the COMO codebase, focusing on standardizing data types, modernizing function signatures, enhancing error handling, and adding new testing infrastructure. The key changes consolidate scattered type definitions into a centralized module and update CI/CD workflows.

  • Introduces centralized data types in main/como/data_types.py with standardized enums and data classes
  • Modernizes function signatures across multiple modules with improved parameter validation and async/await patterns
  • Adds comprehensive test infrastructure with new test files and fixtures
  • Updates CI/CD workflows to support broader Python version testing and improved dependency management

Reviewed Changes

Copilot reviewed 33 out of 36 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/unit/test_rnaseq_preprocess.py | New test file for RNA-seq preprocessing functionality with async test patterns |
| tests/unit/test_data_types.py | New test file validating source type ordering |
| tests/test_rnaseq_preprocess.py | Removes deprecated argument-based tests |
| tests/test_proteomics.py | Minor code formatting cleanup removing noqa comment |
| tests/fixtures/collect_files.py | New test fixtures for file collection and organization |
| ruff.toml | Configuration updates including line length increase and new ignore rules |
| pyproject.toml | Major dependency and configuration overhaul with Python version updates |
| main/data/boundary_rxns/*.csv | Header name standardization from "Boundary" to "Reaction" |
| main/como/utils.py | Significant refactoring with new utility functions and improved error handling |
| main/como/rnaseq_preprocess.py | Complete rewrite with async patterns and modernized interfaces |
| main/como/rnaseq_gen.py | Major refactoring with new filtering algorithms and plotting capabilities |
| main/como/rnaseq.py | File removed - functionality moved to other modules |
| main/como/proteomics_preprocessing.py | Function signature modernization |
| main/como/proteomics_gen.py | Async pattern adoption and improved error handling |
| main/como/proteomics/proteomics_preprocess.py | Error handling improvements |
| main/como/proteomics/FTPManager.py | Enhanced error handling with centralized logging |
| main/como/plot/*.py | New plotting modules for z-score distributions and heatmaps |
| main/como/pipelines/build_condition_heatmaps.py | New pipeline for generating condition-pathway heatmaps |
| main/como/merge_xomics.py | Extensive refactoring with improved async patterns and data handling |
| main/como/knock_out_simulation.py | Minor bug fix replacing np.nan with pd.NA |
| main/como/data_types.py | New centralized data types module |
| main/como/custom_types.py | File removed - types moved to data_types.py |

JoshLoecker and others added 28 commits September 5, 2025 11:30
Signed-off-by: Josh Loecker <[email protected]>
Updated several functions to use clearer argument names and added detailed docstrings following the Google style guide. Removed async/await patterns where they were not needed, improving readability and maintainability.

Signed-off-by: Josh Loecker <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: mumo-dev <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Conflicts:
#	.github/workflows/continuous_integration.yml
#	main/COMO.ipynb
#	main/como/__init__.py
#	main/como/combine_distributions.py
#	main/como/create_context_specific_model.py
#	main/como/data_types.py
#	main/como/merge_xomics.py
#	main/como/proteomics_gen.py
#	main/como/rnaseq_gen.py
#	main/como/rnaseq_preprocess.py
#	main/como/utils.py
#	main/data/boundary_rxns/naiveB_boundary_rxns.csv
#	pyproject.toml
#	ruff.toml
#	tests/unit/test_data_types.py
#	tests/unit/test_rnaseq_preprocess.py
#	uv.lock
Signed-off-by: Josh Loecker <[email protected]>