GitHub - cabbi-bio/pyFLANK

Overview

pyFLANK is an open-source, automated Python tool that detects FST outliers using a null distribution inferred from quasi-independent loci. It is inspired by the R package OutFLANK (https://doi.org/10.1086/682949). The tool offers three approaches for identifying loci that follow the null distribution: graph neural network (GNN) inference, linkage disequilibrium (LD)-based inference, and user-defined input. Because pyFLANK uses GNN-based inference of quasi-independent loci, it yields a more accurate null model with less need for user parameter input.

FST calculation is based on Weir and Cockerham (1984).
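
As an illustration of the estimator named above, the following is a minimal sketch of the single-locus Weir and Cockerham (1984) theta for one biallelic locus, computed from the variance components a, b and c. It is not pyFLANK's own code: the function name wc_fst and the input layout (one list of alternate-allele counts per population) are assumptions made for this example.

```python
import math


def wc_fst(genotypes_by_pop):
    """Weir & Cockerham (1984) theta for one biallelic locus.

    genotypes_by_pop: one list per population holding the alternate-allele
    count (0, 1 or 2) of each diploid individual (hypothetical layout).
    """
    r = len(genotypes_by_pop)
    if r < 2:
        raise ValueError("at least two populations are needed")
    sizes = [len(g) for g in genotypes_by_pop]
    # Alternate-allele frequency and observed heterozygosity per population.
    freqs = [sum(g) / (2 * len(g)) for g in genotypes_by_pop]
    hets = [sum(1 for x in g if x == 1) / len(g) for g in genotypes_by_pop]

    n_total = sum(sizes)
    n_bar = n_total / r
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (r - 1)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / n_total
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / n_total

    pq = p_bar * (1 - p_bar)
    # Variance components: among populations (a), among individuals within
    # populations (b), and between alleles within individuals (c).
    a = (n_bar / n_c) * (s2 - (pq - (r - 1) / r * s2 - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (
        pq - (r - 1) / r * s2 - (2 * n_bar - 1) / (4 * n_bar) * h_bar
    )
    c = h_bar / 2
    denom = a + b + c
    return a / denom if denom != 0 else math.nan


# Two populations fixed for different alleles.
fixed = wc_fst([[2] * 10, [0] * 10])  # 1.0
# Two populations with identical genotype counts give a negative estimate.
identical = wc_fst([[0, 1, 2, 1], [0, 1, 2, 1]])
```

The second call shows why the estimator can return negative values for weakly differentiated loci, which is the case the -knf option addresses.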

Key Features

Graph-based representation of the local dependency structure among loci
GNN-based null distribution inference
Compatible with standard FST-based workflows
Designed to complement, not replace, LD pruning and clumping

Data Format

pyFLANK requires two input datasets (the "Import" model additionally requires a prior set of neutral loci in VCF format):

VCF (v4) file

Only diploid, biallelic data are supported in the current version. Both compressed (vcf.gz) and uncompressed VCF files are supported, as are both phased and unphased genotypes.

Population information

A population file is also required, in the following format:

id pop
sample1 pop1
sample2 pop1
sample3 pop2
sample4 pop3
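
For illustration, a file in this layout can be read into a sample-to-population mapping as sketched below. This is an assumption-laden example rather than pyFLANK's parser: it assumes whitespace-delimited columns and a single header line, and the function name read_population_file is hypothetical.

```python
import io
from collections import defaultdict


def read_population_file(handle):
    """Return {sample_id: population} from a two-column table.

    Assumes whitespace-delimited columns and one header line ("id pop").
    """
    sample_to_pop = {}
    header_seen = False
    for line_number, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if not header_seen:
            header_seen = True  # first non-blank line is the header
            continue
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 columns, got {len(fields)}")
        sample_id, population = fields
        sample_to_pop[sample_id] = population
    return sample_to_pop


# In-memory stand-in for a population file on disk.
example = io.StringIO(
    "id pop\n"
    "sample1 pop1\n"
    "sample2 pop1\n"
    "sample3 pop2\n"
    "sample4 pop3\n"
)
sample_to_pop = read_population_file(example)

# Group samples by population, e.g. to compute per-population allele counts.
pop_to_samples = defaultdict(list)
for sample_id, population in sample_to_pop.items():
    pop_to_samples[population].append(sample_id)
```

With a real file, the same function can be called on the handle returned by open().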

Usage: python pyFLANK.py -h

-h, --help                          show this help message and exit
-vcf, --vcffile                     input VCF file
-pop, --population                  population file
-nim, --neutral_infer_method        neutral inference method; available options: "Import", "LD", "GNN"
-neu, --neutral_file                input neutral file; required when "-nim Import" is set
-ldwin, --ld_window_bp              LD window distance (bp); used when "-nim LD" is set, default is 100kb
-ldcutoff, --ld_cutoff              LD cutoff for LD pruning; used when "-nim LD" is set, default is 0.1
-gnnwin, --gnn_window_bp            GNN window distance (bp); used when "-nim GNN" is set, default is 200kb
-loscutoff, --loss_threshold        loss threshold at which GNN training terminates; used when "-nim GNN" is set, default is 0.1
-m, --multiTestCorrectionMethod     multiple test correction method; choose 'bonferroni' or 'fdr', default is 'bonferroni'
-knf, --keep-negative-FST           keep negative FST values when required (for example, for an accurate estimate of an IBD slope)
-o, --outputFilePrefix              output file prefix
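
To make the two choices of the -m option concrete, the sketch below adjusts a list of p-values with the Bonferroni correction and with the Benjamini-Hochberg procedure. It assumes that 'fdr' refers to Benjamini-Hochberg, which this README does not confirm, and the function names are hypothetical rather than taken from pyFLANK.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed meaning of 'fdr')."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


pvalues = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(pvalues)
bh = benjamini_hochberg(pvalues)
```

Bonferroni controls the family-wise error rate and is the stricter of the two. Benjamini-Hochberg controls the false discovery rate and never yields a larger adjusted p-value than Bonferroni for the same input.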

Usage for the "Import" model

# The "Import" model requires a prior set of neutral variants to import.
# Usage:

python pyFLANK.py -vcf sim1a.vcf.gz -pop pop1.txt -nim Import -neu which_pruned.vcf.gz -o sim1a_import [-knf]

The "-neu" option imports the prior neutral variants, which must be in VCF format like the "-vcf" input.

Usage for the "LD" model

# Use the "LD" model to infer the neutral variants.

python pyFLANK.py -vcf sim1a.vcf.gz -pop pop1.txt -nim LD [-ldwin 1000 -ldcutoff 0.1 -m bonferroni] -o sim1a_ld [-knf]

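
The roles of the window (-ldwin) and cutoff (-ldcutoff) parameters can be illustrated with a generic greedy LD-pruning sketch. It is not pyFLANK's implementation. It assumes that LD is measured as the squared Pearson correlation (r²) of genotype dosages, that all loci lie on one chromosome, and that a locus is dropped when it exceeds the cutoff against any already-kept locus inside the window. The names r_squared and ld_prune are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic locus carries no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def ld_prune(positions, genotypes, window_bp=100_000, r2_cutoff=0.1):
    """Greedy pruning: return indices of loci kept as quasi-independent.

    positions: base-pair coordinate of each locus (single chromosome assumed).
    genotypes: per-locus list of alternate-allele counts (0, 1 or 2).
    """
    kept = []
    for i in sorted(range(len(positions)), key=lambda k: positions[k]):
        in_ld = any(
            positions[i] - positions[j] <= window_bp
            and r_squared(genotypes[i], genotypes[j]) > r2_cutoff
            for j in kept
        )
        if not in_ld:
            kept.append(i)
    return kept


dosages = [0, 1, 2, 0, 1, 2]
# Loci 0 and 1 are 100 bp apart and perfectly correlated; locus 2 carries
# the same genotypes but lies outside the 100 kb window.
kept = ld_prune([100, 200, 500_000], [dosages, dosages, dosages])
```

Here locus 1 is pruned because it is in complete LD with locus 0 inside the window, while locus 2 is kept because it falls outside the window.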
Usage for the "GNN" model

# Use the "GNN" model to infer the neutral variants.

python pyFLANK.py -vcf sim1a.vcf.gz -pop pop1.txt -nim GNN [-gnnwin 1000 -loscutoff 0.1 -m bonferroni] -o sim1a_gnn [-knf]

The -loscutoff option sets the loss threshold at which the deep learning training epochs terminate.
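
The idea of ending training once the loss falls below a threshold can be sketched on a toy problem. The example uses plain gradient descent on a one-parameter quadratic loss, not a GNN, and every name in it (train_until, max_epochs, learning_rate) is hypothetical. It only illustrates the stopping rule, not pyFLANK's training loop.

```python
def train_until(loss_threshold, max_epochs=1000, learning_rate=0.1):
    """Run gradient descent until the loss drops below loss_threshold.

    Toy objective: loss(w) = (w - 3)^2, a stand-in for a model's training loss.
    Returns (epochs_run, final_loss).
    """
    target = 3.0
    w = 0.0
    loss = (w - target) ** 2
    epochs_run = 0
    while loss >= loss_threshold and epochs_run < max_epochs:
        gradient = 2.0 * (w - target)
        w -= learning_rate * gradient
        loss = (w - target) ** 2
        epochs_run += 1
    return epochs_run, loss


loose_epochs, loose_loss = train_until(loss_threshold=0.1)
strict_epochs, strict_loss = train_until(loss_threshold=0.001)
```

A smaller threshold makes training run for more epochs before stopping, which is the trade-off such a threshold controls.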

Interpretation of Results

Users should interpret pyFLANK results conservatively: the method aims to improve calibration and reduce redundant outlier clustering, rather than dramatically increasing power relative to established pipelines.

This repository is actively maintained, and bug reports and feature requests are welcome via GitHub Issues.

Citation

"pyFLANK, a graph neural network based null distribution inference model for FST outlier detection" (under review)
