Conversation

@anna-parker
Contributor

@anna-parker anna-parker commented Oct 17, 2025

resolves #4708, #4734

partially resolves #5392, #5185 (comment)

Breaking Changes

When users submit to multi-segmented organisms and want to group multiple segments under one metadata entry, they must add a fastaId column containing a comma-separated list of the fastaIds (FASTA header IDs) of the respective sequences. If no fastaId column is supplied, the submissionId is used instead, and the backend assumes (as in the single-segmented case) a one-to-one mapping from metadata submissionId to fastaId.
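
To illustrate the rule above, here is a minimal Python sketch of how a metadata row could be resolved to its FASTA header IDs. It is not the backend implementation: the function names `resolve_fasta_ids` and `group_sequences`, the dict-based row representation, the whitespace trimming, and the error handling are all assumptions made for illustration. Only the `fastaId` and `submissionId` column names and the comma-separated format come from the description.

```python
def resolve_fasta_ids(row: dict[str, str], separator: str = ",") -> list[str]:
    """Return the FASTA header IDs referenced by one metadata row (hypothetical helper)."""
    if "fastaId" not in row:
        # No fastaId column supplied: fall back to a one-to-one mapping
        # of submissionId to fastaId, as in the single-segmented case.
        return [row["submissionId"]]
    # Assumption: surrounding whitespace is ignored and empty entries are dropped.
    return [part.strip() for part in row["fastaId"].split(separator) if part.strip()]


def group_sequences(
    rows: list[dict[str, str]], sequences: dict[str, str]
) -> dict[str, dict[str, str]]:
    """Group sequences (keyed by FASTA header ID) under each submissionId."""
    grouped: dict[str, dict[str, str]] = {}
    for row in rows:
        fasta_ids = resolve_fasta_ids(row)
        missing = [fasta_id for fasta_id in fasta_ids if fasta_id not in sequences]
        if missing:
            # Assumption: a referenced but absent sequence is treated as an error.
            raise ValueError(f"{row['submissionId']}: no sequence for {missing}")
        grouped[row["submissionId"]] = {fid: sequences[fid] for fid in fasta_ids}
    return grouped
```

The separator is a parameter in this sketch so that a different delimiter can be tried without changing the rest of the logic.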

This requires:

Future Steps

For simplicity, prepro in this PR does not yet use a minimizer index to assign segments; it parses the FASTA header instead (as the backend did before). In future steps we should:

  1. do a migration to change the originalData unalignedSequences keys for older processed versions
  2. update the processedData return type to include a mapping from fasta header ID to segment
  3. have prepro use a minimizer index to assign segments
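
As a rough illustration of header-based segment assignment and of the mapping from FASTA header ID to segment mentioned in step 2, here is a hedged Python sketch. The header convention used (the segment name as the suffix after the last underscore), the function name `assign_segments`, and the error handling are assumptions for illustration. The actual header format parsed by prepro is not spelled out in this description.

```python
def assign_segments(fasta_ids: list[str], segment_names: list[str]) -> dict[str, str]:
    """Map each FASTA header ID to a segment name by inspecting its suffix.

    Assumption: headers look like '<anything>_<segment>'; this is illustrative only.
    """
    mapping: dict[str, str] = {}
    for fasta_id in fasta_ids:
        # rpartition returns ('', '', fasta_id) when no underscore is present.
        _, separator, suffix = fasta_id.rpartition("_")
        if not separator or suffix not in segment_names:
            raise ValueError(f"Cannot assign a segment to FASTA header {fasta_id!r}")
        if suffix in mapping.values():
            raise ValueError(f"Segment {suffix!r} assigned more than once")
        mapping[fasta_id] = suffix
    return mapping
```

A minimizer-based approach would replace the suffix check with a lookup against an index built from reference segments, while the returned mapping could keep the same shape.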

PR Checklist

🚀 Preview: https://multi-segment-submission.loculus.org

@anna-parker anna-parker changed the title from "feat(backend): refactor mutli-segment submission" to "feat(backend): refactor multi-segment submission" Oct 17, 2025
@anna-parker anna-parker force-pushed the multi-segment-submission branch from ebfc127 to 417d711 Compare October 18, 2025 05:52
@anna-parker anna-parker added the preview Triggers a deployment to argocd label Oct 18, 2025
@anna-parker anna-parker force-pushed the multi-segment-submission branch from 7e5f673 to d4700af Compare October 18, 2025 12:04
@anna-parker anna-parker changed the base branch from main to 4999-multi-pathogen-support-edit-page October 18, 2025 12:04
@anna-parker anna-parker force-pushed the 4999-multi-pathogen-support-edit-page branch from eb49c79 to 8aaec9d Compare October 18, 2025 12:06
@anna-parker anna-parker force-pushed the multi-segment-submission branch from d4700af to 09dae7f Compare October 18, 2025 12:07
@anna-parker anna-parker force-pushed the 4999-multi-pathogen-support-edit-page branch from 8aaec9d to 0129a69 Compare October 20, 2025 06:14
@anna-parker anna-parker force-pushed the multi-segment-submission branch from 09dae7f to 7395166 Compare October 20, 2025 06:14
@anna-parker
Contributor Author

anna-parker commented Oct 20, 2025

As discussed, we should change to a space-separated internal separator.

We could also allow a separate column for each segment (and only require that the column name start with the prefix fastaId). Let's maybe ask on microbioinfo which option would be easier for people.

Update: consensus for this option reached: https://microbial-bioinfo.slack.com/archives/CB0HYT53M/p1760961465729399
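
A minimal Python sketch of the per-segment column option discussed in this comment, assuming each metadata row is a dict of column name to value. The function name `fasta_ids_from_prefixed_columns` and the example column names `fastaId_L` and `fastaId_S` are hypothetical; the only requirement taken from the comment is that the column name starts with `fastaId`.

```python
FASTA_ID_PREFIX = "fastaId"


def fasta_ids_from_prefixed_columns(row: dict[str, str]) -> dict[str, str]:
    """Collect the non-empty values of every column whose name starts with 'fastaId'."""
    return {
        column: value.strip()
        for column, value in row.items()
        # Assumption: empty cells mean the segment was not submitted for this entry.
        if column.startswith(FASTA_ID_PREFIX) and value.strip()
    }


# Hypothetical row with one column per segment.
example_row = {
    "submissionId": "sample1",
    "fastaId_L": "header_for_L",
    "fastaId_S": "",
    "country": "Switzerland",
}
print(fasta_ids_from_prefixed_columns(example_row))
```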

@fengelniederhammer fengelniederhammer force-pushed the 4999-multi-pathogen-support-edit-page branch from 45bdaa6 to 520c6cd Compare October 28, 2025 09:20
@anna-parker anna-parker force-pushed the multi-segment-submission branch 2 times, most recently from 4148024 to a0a6f64 Compare November 6, 2025 18:42
@anna-parker anna-parker changed the base branch from 4999-multi-pathogen-support-edit-page to edit-page-anya November 6, 2025 18:43
@anna-parker anna-parker force-pushed the multi-segment-submission branch from a0a6f64 to 33c5bac Compare November 6, 2025 18:44
@anna-parker anna-parker changed the title from "feat(backend): refactor multi-segment submission" to "feat!(backend): refactor multi-segment submission" Nov 7, 2025
@anna-parker anna-parker force-pushed the multi-segment-submission branch from cb15fe7 to 275eb65 Compare November 7, 2025 10:46
@anna-parker anna-parker force-pushed the multi-segment-submission branch from 28bd185 to ce63d51 Compare November 7, 2025 20:52
@anna-parker anna-parker force-pushed the multi-segment-submission branch from 53595fa to 8a305fb Compare November 7, 2025 20:57
@anna-parker anna-parker marked this pull request as ready for review November 7, 2025 21:27
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@anna-parker
Contributor Author

I will close this in favor of #5398

Labels

preview (Triggers a deployment to argocd)
update_db_schema

Development

Successfully merging this pull request may close these issues.

Alternative method for uploading segmented viruses (or anything else with multiple contigs)

3 participants