NPHL Mpox WGS Analysis

This repository hosts bioinformatics scripts that support the National Public Health Laboratory (NPHL) of Cameroon in its Mpox genomic surveillance efforts. The codebase analyzes whole-genome sequencing (WGS) data for outbreak tracking and characterization. The work was initiated in response to the identification of 5 confirmed cases in the Littoral (4) and South-West (1) regions.

Project Structure

.
├── environment.yml
├── LICENSE
├── phylogeography => Scripts for phylogenetic analysis, molecular dating, and migration plotting.
│   ├── AncestralChanges.py
│   ├── baltic.py
│   ├── compute_float_date.sh
│   ├── final_DataViz.py
│   ├── Plot_migrations.py
│   └── run_phylogenetic_tree.sh
├── README.md
└── viralrecon_MPOX => Wrapper for the nf-core/viralrecon pipeline (assembly & consensus).
    ├── run_nextclade.sh
    └── run_nf_core_viralrecon.sh

Installation & Requirements

1. nf-core/viralrecon

Ensure you have Nextflow and Docker installed.

nextflow pull nf-core/viralrecon -r 2.6.0

2. Clade assignment, quality checks and phylogeography

Ensure you have Conda installed.

conda env create -f environment.yml

Usage

1. Launch nf-core/viralrecon

Use this step to identify circulating strains and build consensus sequences from raw reads.

#Launch the script 
bash viralrecon_MPOX/run_nf_core_viralrecon.sh <DATA_DIR> <OUT_DIR> <MODE> [REF_FASTA] [REF_GFF]

Parameters:

DATA_DIR: Path to folder containing FASTQ files (required).
OUT_DIR: Path where results will be saved (required).
MODE:
- 0 = Use built-in reference (NC_063383.1).
- 1 = Provide custom FASTA/GFF (requires 4th and 5th arguments).
- Default=0.
REF_FASTA: Path to the reference .fasta file (required when MODE=1).
REF_GFF: Path to the reference .gff file (required when MODE=1).

NOTE:
Check lines 39-40 of viralrecon_MPOX/run_nf_core_viralrecon.sh to ensure that R1_EXT and R2_EXT match the extensions of your FASTQ files. The default values are:

R1_EXT='_R1_001.fastq.gz'
R2_EXT='_R2_001.fastq.gz'
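
A quick pre-flight check can catch a suffix mismatch before the pipeline starts. The sketch below is not part of this repository: pair_fastqs and check_data_dir are hypothetical helper names, and the only facts taken from the note above are the two default suffixes. It lists which sample prefixes have both mates, which are missing one, and which files match neither suffix.

```python
from pathlib import Path

# Defaults quoted in the note above; change them to match your files.
R1_EXT = "_R1_001.fastq.gz"
R2_EXT = "_R2_001.fastq.gz"


def pair_fastqs(filenames, r1_ext=R1_EXT, r2_ext=R2_EXT):
    """Split file names into complete pairs, incomplete pairs, and others.

    Hypothetical helper: returns three sorted lists
    (sample prefixes with both mates, prefixes missing a mate,
    file names matching neither suffix).
    """
    r1 = {n[: -len(r1_ext)] for n in filenames if n.endswith(r1_ext)}
    r2 = {n[: -len(r2_ext)] for n in filenames if n.endswith(r2_ext)}
    other = sorted(n for n in filenames if not n.endswith((r1_ext, r2_ext)))
    return sorted(r1 & r2), sorted(r1 ^ r2), other


def check_data_dir(data_dir):
    """Apply pair_fastqs to the regular files directly inside data_dir."""
    names = [p.name for p in Path(data_dir).iterdir() if p.is_file()]
    return pair_fastqs(names)
```

If the second or third list is non-empty for your DATA_DIR, the suffixes in the script probably need adjusting before launch.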

2. Clade Assignment and Quality Checks

Use this step to identify clades and assess sequence quality using Nextclade.

#Ensure environment is active 
conda activate phylodynamic

#Run the analysis pipeline
bash viralrecon_MPOX/run_nextclade.sh <Sequences> <Output Directory>

Parameters:

Sequences: Path to the FASTA file containing all sample sequences.
Output Directory: Path to the output directory.
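
Because this step expects a single FASTA file, per-sample consensus files may first need to be combined. The following is a minimal sketch of one way to do that, not a script shipped with this repository: merge_fastas is a hypothetical helper, and the input paths are whatever consensus files you choose to pass in. It refuses duplicate sequence names, on the assumption that each sample should appear once.

```python
from pathlib import Path


def merge_fastas(paths, out_path):
    """Concatenate FASTA files into out_path; return the sequence count.

    Hypothetical helper. A sequence name is taken to be the first word
    after '>' on a header line. Raises ValueError on a duplicate name.
    """
    seen = set()
    lines = []
    for path in paths:
        for line in Path(path).read_text().splitlines():
            if not line.strip():
                continue  # drop blank lines
            if line.startswith(">"):
                words = line[1:].split()
                name = words[0] if words else ""
                if name in seen:
                    raise ValueError(f"duplicate sequence name: {name!r}")
                seen.add(name)
            lines.append(line)
    # Write only after every input has been read and checked.
    Path(out_path).write_text("".join(line + "\n" for line in lines))
    return len(seen)
```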

3. Phylogeography Analysis

This module performs alignment, phylogeny (IQ-TREE), and molecular dating (TreeTime).

#Ensure environment is active 
conda activate phylodynamic

#Run the analysis pipeline
bash phylogeography/run_phylogenetic_tree.sh \
      <Sequences> \
      <Reference Genome> \
      <Date File> \
      <Locations File> \
      <Last Sample Date> \
      <Output Directory>

Parameters:

Sequences: Path to the FASTA file containing all sample sequences.
Reference Genome: Path to the reference genome in FASTA format.
Date File: Path to a CSV/TSV file with columns: name, date (YYYY-MM-DD).
Locations File: Path to a CSV/TSV file with columns: name, country.
Last Sample Date: The date of the most recent sample in the tree, in YYYY-MM-DD format.
Output Directory: Path to the output directory.
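
Mismatched names between the three input files are a common source of failed runs, so a consistency check beforehand may help. The sketch below is an illustration only and is not part of this repository. It assumes that the name column must equal the first word of each FASTA header, which this README does not state, and all function names are hypothetical. It works on file contents passed as strings.

```python
import csv
import io
import re
from datetime import date


def fasta_names(fasta_text):
    """Return the first word of every '>' header line."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            names.append(words[0] if words else "")
    return names


def read_table(text, required):
    """Parse CSV or TSV text and check that the required columns exist."""
    first_line = text.splitlines()[0] if text.strip() else ""
    delimiter = "\t" if "\t" in first_line else ","
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing column(s): {missing}")
    return list(reader)


def is_iso_date(value):
    """True only for a valid calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value or ""):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_metadata(fasta_text, dates_text, locations_text):
    """Return a list of problems found (empty if everything lines up)."""
    dates = {r["name"]: r["date"] for r in read_table(dates_text, ["name", "date"])}
    countries = {
        r["name"]: r["country"]
        for r in read_table(locations_text, ["name", "country"])
    }
    problems = []
    for name in sorted(set(fasta_names(fasta_text))):
        if name not in dates:
            problems.append(f"{name}: no date")
        elif not is_iso_date(dates[name]):
            problems.append(f"{name}: date is not YYYY-MM-DD")
        if name not in countries:
            problems.append(f"{name}: no country")
    return problems
```

An empty list means every sequence has a well-formed date and a country under the stated assumption.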

4. Visualization (Migration Plots)

Visualize viral introductions and migration events based on the phylogeographic analysis.

#Ensure environment is active
conda activate phylodynamic

#Run the visualization script
python phylogeography/final_DataViz.py \
      --migration <OUTDIR>/mugration/mugration_results.csv \
      --pointsGeoloc <path/to/gps_coordinates.csv> --savepdf

#The result is a Dash web page accessible at http://<ip_address>:8050
#E.g., on a local machine: http://127.0.0.1:8050/

#Script usage
python phylogeography/final_DataViz.py -h

Arguments for final_DataViz.py:

--migration: Path to mugration_results.csv generated by the phylogeography step.
--pointsGeoloc: Path to a CSV with columns: location, long, lat.
--origins (Optional): Filter by origin location (e.g., --origins South Center).
--destinations (Optional): Filter by destination (e.g., --destinations North-West Littoral East).
--savepdf (Optional): Save the plot as a PDF file.
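
The coordinates file can be sanity-checked before launching the Dash page. This is a sketch under stated assumptions rather than code from this repository: check_geoloc is a hypothetical helper, the file is assumed to be comma-separated, and long/lat are assumed to be decimal degrees. Only the three column names come from the argument list above.

```python
import csv
import io


def check_geoloc(csv_text):
    """Return {location: (long, lat)}; raise ValueError on a bad file.

    Hypothetical helper. Assumes comma-separated text with the columns
    location, long, lat, and coordinates in decimal degrees.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {"location", "long", "lat"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing column(s): {sorted(missing)}")
    points = {}
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        try:
            lon, lat = float(row["long"]), float(row["lat"])
        except (TypeError, ValueError):
            raise ValueError(f"line {line_number}: non-numeric coordinate")
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"line {line_number}: coordinate out of range")
        points[row["location"]] = (lon, lat)
    return points
```

A swapped long/lat pair is only caught here when the value falls outside the valid range, so plotting a few points by eye is still worthwhile.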
