New Addition: ripples and general update to latest version of usher #7306
Open
lsterck
wants to merge
21
commits into
galaxyproject:main
from
lsterck:add_ripples
Commits
de920ae
version update and initial addition ripples
be50626
start of ripples addition
392ac5b
aded extra test output files
1927643
updated tools files to v 6.6
d26d227
updated test data files
805426c
update of test data files
d2e7faf
added test files for ripples
06836fe
test non deterministic, updated for assert content
7070fa7
test file renamed
d43bfb6
addition of ripples tool file
0d49338
missing test file for ripples test
5292ac7
typo fix
19d6d7e
typo fix and update test file, matutils
d25847c
fake edit to invoke test again
e09a68c
review amending and some layout fixing
a68cadb
improved test, now filechecks
09a175a
bug fix in sanitizing input
d80054c
review fixing and other improvements
ff0c24e
deleted too big test files
5825dbf
changed test to avoid to big files
436f0ab
minor improvements and review comments edits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| <tool id='usher_ripples' name='UShER RIPPLES' version='@TOOL_VERSION@+@GALAXY_TOOL_VERSION@' profile='@GALAXY_PROFILE@'> | ||
| <description>detect recombination events in large mutation annotated tree (MAT) files.</description> | ||
| <macros> | ||
| <import>macros.xml</import> | ||
| </macros> | ||
| <expand macro="xrefs"/> | ||
| <expand macro='requirements' /> | ||
| <expand macro="version"/> | ||
| <command detect_errors='exit_code'><![CDATA[ | ||
| ## get correct extension filenames | ||
| ln -sf '$input_mat' '$input_mat.element_identifier' && | ||
| | ||
| ripples | ||
| --input-mat '$input_mat.element_identifier' | ||
| | ||
| --branch-length $branch_length | ||
| --min-coordinate-range $min_coordinate_range | ||
| --max-coordinate-range $max_coordinate_range | ||
| --samples-filename '$samples_filename' | ||
| --parsimony-improvement $parsimony_improvement | ||
| --num-descendants $num_descendants | ||
| | ||
| --outdir ./ | ||
| --threads \${GALAXY_SLOTS:-1} > output_stdout.txt | ||
| | ||
| ]]> </command> | ||
| <inputs> | ||
| <param argument="--input-mat" type="data" format="protobuf3" label="Mutation-annotated tree object" help="Load a mutation annotated tree file, in protocol-buffers format (protobuf3)."/> | ||
| <param argument="--branch-length" type="integer" value="3" min="0" label="Minimum branch length" help="Minimum length of the branch to consider for recombination events. Default = 3." /> | ||
| <param argument="--min-coordinate-range" type="integer" value="1000" min="0" label="Minimal coordinate range" help="Minimum range of the genomic coordinates of the mutations on the recombinant branch. Default = 1,000." /> | ||
| <param argument="--max-coordinate-range" type="integer" value="10000000" min="0" label="Maximal coordinate range" help="Maximum range of the genomic coordinates of the mutations on the recombinant branch. Default = 10,000,000." /> | ||
| <param argument="--samples-filename" type="data" format="txt" label="Sample restriction file" help="Restrict the search to the ancestors of the samples specified in the input file." /> | ||
| <param argument="--parsimony-improvement" type="integer" value="3" min="0" label="Parsimony improvement" help="Minimum improvement in parsimony score of the recombinant sequences during the partial placement. Default = 3." /> | ||
| <param argument="--num-descendants" type="integer" value="10" label="Number of descendants" help="Minimum number of leaves that a node should have to be considered for recombination. Default = 10." /> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="recombination" format="tabular" from_work_dir='recombination.tsv' label="${tool.name} on ${on_string}: recombinations" > | ||
| <actions> | ||
| <action name="column_names" type="metadata" default="recomb_node_id,breakpoint-1_interval,breakpoint-2_interval,donor_node_id,donor_is_sibling,donor_parsimony,acceptor_node_id,acceptor_is_sibling,acceptor_parsimony,original_parsimony,min_starting_parsimony,recomb_parsimony" /> | ||
| </actions> | ||
| </data> | ||
| <data name="descendants" format="tabular" from_work_dir='descendants.tsv' label="${tool.name} on ${on_string}: descendants" > | ||
| <actions> | ||
| <action name="column_names" type="metadata" default="node_id,descendants" /> | ||
| </actions> | ||
| </data> | ||
| | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="2"> | ||
| <param name="input_mat" value="mutation_annotation.pb" ftype="protobuf3"/> | ||
| <param name="samples_filename" value="sample_names.txt" ftype="txt"/> | ||
| <output name="descendants" file="test_26_descendants.tabular" ftype="tabular"/> | ||
| <output name="recombination" file="test_26_recombination.tabular" ftype="tabular"/> | ||
| </test> | ||
| <test expect_num_outputs="2"> | ||
| <param name="input_mat" value="mutation_annotation.pb" ftype="protobuf3"/> | ||
| <param name="samples_filename" value="sample_names.txt" ftype="txt"/> | ||
| <param name="num_descendants" value="20" /> | ||
| <param name="parsimony_improvement" value="5" /> | ||
| <param name="branch_length" value="2" /> | ||
| <output name="descendants" file="test_27_descendants.tabular" ftype="tabular"/> | ||
| <output name="recombination" file="test_27_recombination.tabular" ftype="tabular"/> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
| | ||
| .. class:: infomark | ||
| | ||
| **Purpose** | ||
| | ||
| RIPPLES (Recombination Inference using Phylogenetic PLacEmentS) is a program used to detect recombination events in large mutation annotated tree (MAT) files. | ||
| | ||
| ---- | ||
| | ||
| RIPPLES is a program to rapidly and sensitively detect recombinant nodes and their ancestors in a mutation-annotated tree (MAT). RIPPLES exploits the fact that recombinant lineages arising from diverse genomes will often be found on “long branches” which result from accommodating the divergent evolutionary histories of the two parental haplotypes. Therefore, RIPPLES first identifies long branches in a MAT. RIPPLES then exhaustively breaks the potential recombinant sequence into distinct segments that are differentiated by mutations on the recombinant sequence and separated by up to two breakpoints. For each set of breakpoints, RIPPLES places each of its corresponding segments using maximum parsimony to find the two parental nodes – a donor and an acceptor – that result in the highest parsimony score improvement relative to the original placement on the global phylogeny. The nodes for which a set of breakpoints along with two parental nodes can be identified that provide a parsimony score improvement above a user-specified threshold are reported as recombinants. | ||
| | ||
| .. class:: infomark | ||
| | ||
| **RIPPLES Common Options** | ||
| | ||
| - input-mat: Input mutation-annotated tree file [REQUIRED]. If only this argument is set, print the count of samples and nodes in the tree. | ||
| - branch-length (-l): Minimum length of the branch to consider for recombination events. Default = 3. | ||
| - min-coordinate-range (-r): Minimum range of the genomic coordinates of the mutations on the recombinant branch. Default = 1,000. | ||
| - max-coordinate-range (-R): Maximum range of the genomic coordinates of the mutations on the recombinant branch. Default = 10,000,000. | ||
| - samples-filename (-s): Restrict the search to the ancestors of the samples specified in the input file. | ||
| - parsimony-improvement (-p): Minimum improvement in parsimony score of the recombinant sequences during the partial placement. Default = 3. | ||
| - num-descendants (-n): Minimum number of leaves that a node should have to be considered for recombination. Default = 10. | ||
| | ||
| You can find more information in the `RIPPLES official documentation page <https://usher-wiki.readthedocs.io/en/latest/ripples.html>`_. | ||
| | ||
| ]]> </help> | ||
| <expand macro="citations" /> | ||
| </tool> | ||
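One of the commits above notes that the test output was non-deterministic. If exact file comparison turns out to be fragile, a content-based test is one possible alternative. The fragment below is only a sketch under assumptions: it reuses the test inputs from this diff, and it assumes that `recombination.tsv` always has the 12 tab-separated columns listed in `column_names` and that `descendants.tsv` has 2. Neither assumption has been checked against actual RIPPLES output, and the check would likely fail on an empty result file.

```xml
<!-- Sketch only: column counts are taken from the column_names metadata above,
     not verified against real RIPPLES output. -->
<test expect_num_outputs="2">
    <param name="input_mat" value="mutation_annotation.pb" ftype="protobuf3"/>
    <param name="samples_filename" value="sample_names.txt" ftype="txt"/>
    <output name="recombination" ftype="tabular">
        <assert_contents>
            <has_n_columns n="12"/>
        </assert_contents>
    </output>
    <output name="descendants" ftype="tabular">
        <assert_contents>
            <has_n_columns n="2"/>
        </assert_contents>
    </output>
</test>
```

This trades strictness for stability: it would catch a malformed table but not a change in which nodes are reported.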
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| England/BRIS-1853249/2020|20-04-02 Spain/BRIS-1853249/2020|20-04-02 | ||
| Wales/PHWC-25B04/2020|20-03-24 Spain/BRIS-1853249/2020|20-04-02 | ||
| NPL/61-TW/2020|MT072688.1|20-01-13 Spain/BRIS-1853249/2020|20-04-02 | ||
| Wales/LIVE-A6831/2020|20-03-16 Spain/BRIS-1853249/2020|20-04-02 | ||
| England/BRIS-1853249/2020|20-04-02 Spain/BRIS-1853249/2020|20-04-02_A | ||
| Wales/PHWC-25B04/2020|20-03-24 Spain/BRIS-1853249/2020|20-04-02_B | ||
| NPL/61-TW/2020|MT072688.1|20-01-13 Spain/BRIS-1853249/2020|20-04-02_C | ||
| Wales/LIVE-A6831/2020|20-03-16 Spain/BRIS-1853249/2020|20-04-02_D |
Binary file not shown.
please consider adding min/max to all integers/float params
I would like to, but I have no idea what sensible ranges would be.
As I understand it, this depends heavily on the input/output trees used, so it does not really make sense to "fix" ranges.
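A possible middle ground, sketched here as a suggestion only: set a lower bound and leave the upper bound open. In this diff, `--num-descendants` is the only integer parameter without a `min`; the others already use `min="0"`. The value `0` below simply mirrors those parameters and is an assumption, not a bound taken from the RIPPLES documentation.

```xml
<!-- Sketch only: min="0" mirrors the other integer params in this tool;
     no max is set because a sensible upper bound depends on the input tree. -->
<param argument="--num-descendants" type="integer" value="10" min="0" label="Number of descendants" help="Minimum number of leaves that a node should have to be considered for recombination. Default = 10." />
```

This keeps negative values out of the command line without committing to a tree-dependent maximum.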