diff --git a/README.md b/README.md
index 81b3b10..0012a76 100644
--- a/README.md
+++ b/README.md
@@ -13,15 +13,17 @@ Manuscript: https://doi.org/10.1093/bioinformatics/btz033
 ABRA2 requires Java 8.
 
 We recommend running from a pre-compiled release.
-Go to the Releases tab to download a recent version.
+Go to the [Releases tab](https://github.com/mozack/abra2/releases) to download a recent version.
 
 ### DNA
 
 Sample command for DNA:
 
-```java -Xmx16G -jar abra2.jar --in normal.bam,tumor.bam --out normal.abra.bam,tumor.abra.bam --ref hg38.fa --threads 8 --targets targets.bed --tmpdir /your/tmpdir > abra.log```
+``` bash
+java -Xmx16G -jar abra2.jar --in normal.bam,tumor.bam --out normal.abra.bam,tumor.abra.bam --ref hg38.fa --threads 8 --targets targets.bed --tmpdir /your/tmpdir > abra.log
+```
 
-The above accepts normal.bam and tumor.bam as input and outputs sorted realigned BAM files named normal.abra.bam and tumor.abra.bam
+The above accepts `normal.bam` and `tumor.bam` as input and outputs sorted realigned BAM files named `normal.abra.bam` and `tumor.abra.bam`
 
 * Input files must be sorted by coordinate and index
 * Output files are sorted
@@ -30,19 +32,21 @@ The above accepts normal.bam and tumor.bam as input and outputs sorted realigned
 
 ### RNA
 
-ABRA2 is capable of utilizing junction information to aid in assembly and realignment. It has been tested only on STAR output to date.
+ABRA2 is capable of utilizing junction information to aid in assembly and realignment. It has been tested only on [STAR](https://github.com/alexdobin/STAR) output to date.
 
 Sample command for RNA:
 
-```java -Xmx16G -jar abra2.jar --in star.bam --out star.abra.bam --ref hg38.fa --junctions bam --threads 8 --gtf gencode.v26.annotation.gtf --dist 500000 --sua --tmpdir /your/tmpdir > abra2.log 2>&1```
+``` bash
+java -Xmx16G -jar abra2.jar --in star.bam --out star.abra.bam --ref hg38.fa --junctions bam --threads 8 --gtf gencode.v26.annotation.gtf --dist 500000 --sua --tmpdir /your/tmpdir > abra2.log 2>&1
+```
 
-Here, star.bam is the input bam file and star.abra.bam is the output bam file.
+Here, `star.bam` is the input bam file and `star.abra.bam` is the output bam file.
 
-Junctions observed during alignment can be passed in using the ```--junctions``` param. The input file format is similar to the SJ.out.tab file output by STAR. If ```bam``` is specified, ABRA2 will dynamically identify splice junctions from the BAM file on the fly. Note that the SJ.out.tab file contains only junctions deemed "high quality" by STAR. The complete set of all splice junctions can be identified using the program ```abra.cadabra.SpliceJunctionCounter```
+Junctions observed during alignment can be passed in using the ```--junctions``` param. The input file format is similar to the `SJ.out.tab` file output by STAR. If ```bam``` is specified, ABRA2 will dynamically identify splice junctions from the BAM file on the fly. Note that the `SJ.out.tab` file contains only junctions deemed "high quality" by STAR. The complete set of all splice junctions can be identified using the program ```abra.cadabra.SpliceJunctionCounter```
 
-Annotated junctions can be passed in using the ```--gtf``` param. See: https://www.gencodegenes.org/releases/current.html
-It is beneficial to use both of the junction related options.
+Annotated junctions can be passed in using the ```--gtf``` param. For human annotations, see: https://www.gencodegenes.org/human/releases.html
 
-Known indels can be passed in using the --in-vcf argument. Unannotated junctions originally identified as splices by the aligner may be converted to deletions if a known deletion is matched. Consider this option if you have indels detected from DNA for the same sample / subject. It is not recommended to use large datasets when using this option (i.e. don't pass in dbSNP).
+It is beneficial to use both of the junction related options.
+Known indels can be passed in using the ```--in-vcf``` argument. Unannotated junctions originally identified as splices by the aligner may be converted to deletions if a known deletion is matched. Consider this option if you have indels detected from DNA for the same sample / subject. It is not recommended to use large datasets when using this option (i.e. don't pass in dbSNP).