This repository contains code and analysis results used in the paper Mannen et al. (2023), "Multiple roles of a conserved glutamate residue for unique biophysical properties in a new group of microbial rhodopsins homologous to TAT rhodopsin".
The workflow is divided into two parts: most of the analyses are covered by the default snakemake file workflow/Snakefile while the beast2 analysis is in workflow/Beast2.snakefile.
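As a rough sketch of how the two workflow files might be invoked, the helper below only builds the snakemake command lines; it does not run anything. The core count, the use of --use-conda and the dry-run flag are assumptions for illustration and are not prescribed by this repository.

```python
import shlex

# Snakefile paths as named in this README.
MAIN_SNAKEFILE = "workflow/Snakefile"
BEAST2_SNAKEFILE = "workflow/Beast2.snakefile"


def snakemake_command(snakefile, cores=4, dry_run=False):
    """Build a snakemake argument list for one of the two workflow files.

    The default core count and the --use-conda flag are assumptions;
    adjust them to your own setup.
    """
    cmd = ["snakemake", "--snakefile", snakefile, "--use-conda", "--cores", str(cores)]
    if dry_run:
        # Only list the planned jobs without executing them.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print dry-run commands for both workflows so they can be copied to a shell.
    for snakefile in (MAIN_SNAKEFILE, BEAST2_SNAKEFILE):
        print(shlex.join(snakemake_command(snakefile, dry_run=True)))
```

The printed commands can be pasted into a shell, or the returned list can be passed to subprocess.run once the dry run looks reasonable.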
Most of the dependencies of the main workflow pipeline are taken care of with conda, but in addition to snakemake and conda, the following dependencies have to be installed manually:
- usearch is expected to be available from the PATH;
- RootAnnotator is expected to be in the directory named RootAnnotator in the current directory;
- mad is expected to be available from the PATH.
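The manual requirements above can be checked before starting a run. The following is a minimal sketch, assuming the executables are called exactly snakemake, conda, usearch and mad on your system; the function name and the wording of the report are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path

# Executable names are assumed to match the tool names used in this README.
REQUIRED_ON_PATH = ("snakemake", "conda", "usearch", "mad")
ROOT_ANNOTATOR_DIR = "RootAnnotator"


def missing_dependencies(workdir=".", executables=REQUIRED_ON_PATH):
    """Return a list describing every requirement that could not be found."""
    # shutil.which returns None when an executable is not on the PATH.
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # RootAnnotator is expected as a directory inside the working directory.
    if not (Path(workdir) / ROOT_ANNOTATOR_DIR).is_dir():
        missing.append(ROOT_ANNOTATOR_DIR + "/ (directory)")
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All manually installed dependencies were found.")
```

Run it from the directory in which the workflow will be started, since the RootAnnotator check is relative to the current directory.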
The protein fasta file with the expressed TwRs is provided in the file Expressed_TwRs.faa.
The workflow files are located in workflow/.
Input files to run the pipeline(s) from scratch are in input/. They include:
- input/ingroup.fna -- ORF sequences for the representative TwRs
- input/ingroup.tsv -- metadata for the representative TwRs
- input/beast2/beast_linked_models.xml -- input for beast2 including the CDS alignment
Final output files are in the folder output. Intermediate analysis results needed to produce them are included as well:
- analysis/TAT/rhodopsins.mafft -- fasta files with alignment of all of the collected and reference rhodopsins
- analysis/IIIa_phylophlan/IIIa.tre.treefile -- results of phylogenetic analysis of Pelagibacterales subclade IIIa
- analysis/diamond_collect/{gtdb,lanclos,oceandna}.tsv -- tsv files summarizing presence of rhodopsins in Pelagibacterales genomes obtained from three sources
- analysis/metadata/{gtdb_filtered,lanclos,oceandna_filtered}.tsv -- metadata for the analyzed Pelagibacterales genomes
- analysis/metadata/gtdb_clade.nwk -- tree in newick format corresponding to the o__Pelagibacterales clade in GTDB r. 214.1
- analysis/beast2/{beast_linked_models-codon12.trees,beast_linked_models.xml.state,beast_linked_models.log} -- results of the beast2 run
- analysis/beast2/beast_linked_models-rootAnnotator_annotatedMCCTree.nexus_fixed -- (fixed) output of rootAnnotator, tree in nexus format
- analysis/lazarus -- lazarus analysis
Use the Issue tracker for questions/requests.