02.Usage

Jinyang Zhang edited this page Jul 18, 2023 · 1 revision

Step 1. Prepare the minimap2 index

wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_43/GRCh38.primary_assembly.genome.fa.gz
gunzip GRCh38.primary_assembly.genome.fa.gz
minimap2 -t 6 -x splice -d GRCh38.primary_assembly.genome.fa.mmi GRCh38.primary_assembly.genome.fa

Step 2. Connect the MinION Mk1B and insert the configuration test cell or a FLO-MIN106D flow cell correctly

Open the MinKNOW software and check that the flow cell has been recognized.

Step 3. Run PROFIT-seq

NOTE: PROFIT-seq must be run as the minknow user so that it can access the guppy and MinKNOW servers.

# Switch to minknow user
sudo su - minknow
bash && source /home/biols/.bashrc

# Activate PROFIT-seq environment
mamba activate /home/biols/envs/python3.8.10

# Run PROFIT-seq
python3 PROFIT-seq.py --mm_idx /home/biols/data/hg38/GRCh38.primary_assembly.genome.fa.mmi
Usage: PROFIT-seq.py [OPTIONS]

Options:
  --minknow_host TEXT    ip address for MinKNOW host. Defaults to 127.0.0.1.
  --minknow_port TEXT    port for MinKNOW service. Defaults to 8000.
  --guppy_address TEXT   address for guppy server. Defaults to ipc::///tmp/.guppy/5555.
  --guppy_config TEXT    guppy basecalling config. Defaults to dna_r9.4.1_450bps_fast.
  --dashboard_port TEXT  port for the dashboard. Defaults to 55280.
  --mm_idx TEXT          Minimap2 index of reference sequences  [required]
  --version              Show the version and exit.
  --help                 Show this message and exit.
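
When the defaults do not fit (for example, the dashboard port is already taken), the options above can be passed explicitly. The following is a minimal sketch of assembling such a command line in Python. The helper name `build_profit_seq_command` is hypothetical and not part of PROFIT-seq; only the flags listed in the help text are used.

```python
import shlex

# Hypothetical helper, not part of PROFIT-seq: builds the argument list
# for PROFIT-seq.py from the options shown in the help text above.
ALLOWED_OPTIONS = {
    "minknow_host", "minknow_port", "guppy_address",
    "guppy_config", "dashboard_port",
}

def build_profit_seq_command(mm_idx, **overrides):
    """Return the argv list for PROFIT-seq.py with optional overrides."""
    unknown = set(overrides) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    # --mm_idx is the only required option.
    argv = ["python3", "PROFIT-seq.py", "--mm_idx", str(mm_idx)]
    for name in sorted(overrides):
        argv += [f"--{name}", str(overrides[name])]
    return argv

cmd = build_profit_seq_command(
    "/home/biols/data/hg38/GRCh38.primary_assembly.genome.fa.mmi",
    dashboard_port=55281,  # example value, chosen only for illustration
)
print(shlex.join(cmd))
```

The printed line can be pasted into the minknow user's shell, or the list can be handed to `subprocess.run(cmd)`.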

NOTE: always use the fast basecalling config when running PROFIT-seq to avoid performance issues.

If everything works, the URL of the dashboard will be printed on your screen.

Step 4. Specify enrichment targets or upload a TOML-formatted config on the web interface.

  • Example of a valid TOML config:
[[jobs]]
name = "Unblock_mt"
time = [0, 240]
ch = [129, 256]
bc = "all"
target = [
    {region = ["chrM", "0", "16569"], action="enrich"},
    {region = "multi", action="unblock"},
    {region = "miss", action="unblock"},
    {region = "unmapped", action="wait"},
]
  • Usage:
- name: User-specified name for each target job.
- time: range of start and end time (minutes) for the job.
- ch: range of start and end channel for the job. (0-512 for MinION).
- bc: barcode for the job.
- target: actions to perform when a read aligns to a specific region.
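
Before uploading a config, it can help to check that each job carries the five fields described above. The following is a minimal sketch, assuming the job has already been parsed into a Python dict. The function `check_job_fields` is hypothetical, and the checks (integer ranges, start <= end) are assumptions drawn from the example config rather than the rules PROFIT-seq itself enforces.

```python
# Hypothetical pre-upload check, not part of PROFIT-seq.
def check_job_fields(job):
    """Return a list of problems found in one parsed [[jobs]] entry."""
    problems = []
    if not isinstance(job.get("name"), str) or not job.get("name"):
        problems.append("name must be a non-empty string")
    # time and ch are both [start, end] pairs in the example config.
    for key in ("time", "ch"):
        value = job.get(key)
        ok = (
            isinstance(value, list)
            and len(value) == 2
            and all(isinstance(v, int) for v in value)
            and value[0] <= value[1]
        )
        if not ok:
            problems.append(f"{key} must be [start, end] with start <= end")
    if not isinstance(job.get("bc"), str):
        problems.append("bc must be a string")
    targets = job.get("target")
    if not isinstance(targets, list) or not targets:
        problems.append("target must be a non-empty list")
    else:
        for i, entry in enumerate(targets):
            if not isinstance(entry, dict) or not {"region", "action"} <= set(entry):
                problems.append(f"target[{i}] needs 'region' and 'action'")
    return problems

# The example job from this page, written as the dict a TOML parser would return.
example_job = {
    "name": "Unblock_mt",
    "time": [0, 240],
    "ch": [129, 256],
    "bc": "all",
    "target": [
        {"region": ["chrM", "0", "16569"], "action": "enrich"},
        {"region": "multi", "action": "unblock"},
        {"region": "miss", "action": "unblock"},
        {"region": "unmapped", "action": "wait"},
    ],
}
print(check_job_fields(example_job))
```

To get the dict from a file, `tomllib.load` (Python 3.11+) or the `tomli` backport on older interpreters returns a mapping whose `"jobs"` key holds the list of job entries.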
  • Available barcode options:
# bc:
- barcode01,barcode02 (comma-separated list of barcode names; only reads with these barcodes will be processed)
- classified (all reads with classified barcodes will be processed)
- unclassified (all reads with unclassified barcodes will be processed)
- all (all reads will be processed)
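
The barcode options above amount to a simple filter on each read's barcode label. The following is a minimal sketch of that logic, not PROFIT-seq's actual code. It assumes each read carries a label such as `barcode01`, with `unclassified` for reads that have no assigned barcode; the function name `barcode_selected` is hypothetical.

```python
# Illustrative sketch of the bc filter; an assumption, not PROFIT-seq's code.
def barcode_selected(bc_option, read_barcode):
    """Return True if a read with this barcode label falls under bc_option."""
    option = bc_option.strip()
    if option == "all":
        return True
    if option == "classified":
        return read_barcode != "unclassified"
    if option == "unclassified":
        return read_barcode == "unclassified"
    # Otherwise treat the option as a comma-separated list of barcode names.
    wanted = {name.strip() for name in option.split(",") if name.strip()}
    return read_barcode in wanted

print(barcode_selected("barcode01,barcode02", "barcode02"))
print(barcode_selected("classified", "unclassified"))
```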
  • Available target region options:
# target:
- chrom:start-end (reads mapped to the specified region will be processed)
- multi (reads that are multi-mapped will be processed)
- mapped (all reads mapped to the reference index will be processed)
- miss (reads mapped to the reference index, but to none of the target regions, will be processed)
- unmapped (all reads that could not be aligned to the reference index will be processed)
- all (all reads will be processed)

Priority: all > unmapped > mapped > multi > region 
  • Available action options:
# action
- stop_receiving (finish sequencing this read) 
- unblock (reject this read)
- wait (wait for decision in the next chunk) 
- balance (balance coverage for all target regions with action `balance`)

At least one of the following combinations is required for a valid job:
1. unmapped + mapped
2. unmapped + regions + miss

Step 5. Start sequencing protocol

  • Start sequencing protocol in MinKNOW.
  • Click 'Run unblock' in the PROFIT-seq control panel to start pore manipulation.