Normalize version fields to strings in import scripts #39
Thanks @arash77, this is a neat contribution. It's mixed together with formatting changes, though. Reformatting is a great idea, but here it muddles what is the fix and what is purely formatting. Could you split the two? And if adopting PEP 8 is desired in this repo, how about a GitHub Action that applies it automatically?
I will exclude the formatting changes from this PR. I can open a separate PR to discuss how automated formatting could be applied.
Add normalize_version_fields function to convert version fields (which can be int, float, or str) to string type for consistency.

Integrate version normalization into all import scripts:
- bioconda: normalize package.version
- bioconductor: normalize Version
- biotools: normalize version and nested version fields
- galaxytool: normalize Suite_version, conda package version, and workflow versions
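To illustrate the scalar conversion the commit message describes, here is a minimal sketch. The name `normalize_version_to_string` appears in this PR, but its signature and exact behavior are assumptions here, not the code that was merged. In particular, excluding `bool` is a design choice of this sketch.

```python
import json


def normalize_version_to_string(value):
    """Return int/float versions as strings; leave other values untouched.

    Assumed behavior for illustration, not the PR's actual implementation.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return value


# A parser hands back a number when the source file has an unquoted version.
record = json.loads('{"name": "tool", "version": 2}')
record["version"] = normalize_version_to_string(record["version"])
print(record["version"], type(record["version"]).__name__)
```

Note that normalization only fixes the type. If a parser has already read an unquoted `1.10` as the float `1.1`, the trailing zero is gone before this function runs, so quoting versions at the source remains the safer option.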
Pull request overview
This pull request introduces a new common utility module for normalizing version fields from numeric types to strings across various metadata import scripts, addressing data integrity issues when processing tool and package metadata.
Changes:
- Added `common/metadata.py` module with `normalize_version_to_string` and `normalize_version_fields` functions
- Updated four import scripts (galaxytool-import, biotools-import, bioconductor-import, bioconda-import) to use the new normalization functions
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| common/metadata.py | New utility module providing functions to normalize version fields (integers/floats) to strings with support for nested paths and list structures |
| galaxytool-import/galaxytool-import.py | Integrated version normalization for Suite_version, Latest_suite_conda_package_version, and Related_Workflows latest_version fields |
| biotools-import/import.py | Added version field normalization for both top-level version field and nested version fields within version arrays |
| bioconductor-import/import.py | Applied normalization to the Version field in package metadata |
| bioconda-import/bioconda_importer.py | Normalized package.version field in conda package data |
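The "nested paths and list structures" support mentioned for `common/metadata.py` could look roughly like the sketch below. The dotted-path argument format, the helper names `_to_str` and `_normalize_path`, and the sample record are all hypothetical, chosen only to mirror the galaxytool fields listed in the table. The real module may use a different interface.

```python
def _to_str(value):
    # Hypothetical helper: convert numeric versions, leave the rest alone.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return value


def _normalize_path(node, keys):
    # Lists are traversed transparently: the same path applies to each item.
    if isinstance(node, list):
        for item in node:
            _normalize_path(item, keys)
        return
    if not isinstance(node, dict) or not keys or keys[0] not in node:
        return  # Missing or non-dict data is skipped silently.
    key, rest = keys[0], keys[1:]
    if rest:
        _normalize_path(node[key], rest)
    elif isinstance(node[key], list):
        node[key] = [_to_str(v) for v in node[key]]
    else:
        node[key] = _to_str(node[key])


def normalize_version_fields(data, paths):
    """Normalize each dotted path in place (assumed signature) and return data."""
    for path in paths:
        _normalize_path(data, path.split("."))
    return data


# Hypothetical record shaped like the galaxytool fields named above.
tool = {
    "Suite_version": 1.2,
    "Latest_suite_conda_package_version": "1.2.0",
    "Related_Workflows": [{"latest_version": 3}, {"latest_version": "0.1"}],
}
normalize_version_fields(
    tool,
    [
        "Suite_version",
        "Latest_suite_conda_package_version",
        "Related_Workflows.latest_version",
    ],
)
print(tool)
```

Skipping missing keys rather than raising keeps the importers tolerant of sparse metadata, at the cost of hiding misspelled paths. A stricter variant could log or raise instead.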
Introduce a normalization function to convert version fields to strings across various import scripts, ensuring consistent data formatting. This change enhances data integrity when processing tool and package metadata.
Closes research-software-ecosystem/content#1190