Description
Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.9/docs/FAQ.md: yes
Describe the issue:
We performed variant calling on reads obtained through Oxford Nanopore Technologies sequencing.
In the resulting VCF, some sites that are clearly homozygous for the variant allele were called as heterozygous.
This occurs at different sites in all three samples (HG002, HG003, and HG004); the example images are from HG002: CNBP, chr3:129171507, and NEB, chr2:151671058.
A similar issue was reported in Issue #592, which was determined to be caused by the behavior of the WES model.
In this case as well, could the ONT_R104 model be responsible? If so, what would be the recommended way to address this?
Example
Variant calling results of HG002
- CNBP; chr3:129171507
- NEB; chr2:151671058
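To list other sites that show the same pattern, a rough filter over the output VCF can help. The sketch below is only an illustration: `flag_high_vaf_hets` is a hypothetical helper name, the 0.8 threshold is an arbitrary assumption, and it assumes a single-sample VCF whose FORMAT column contains `GT` and `VAF`, as DeepVariant output normally does. It prints heterozygous calls whose alternate allele fraction looks homozygous.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DeepVariant): read an uncompressed
# single-sample VCF on stdin and print heterozygous calls whose VAF is at
# or above a threshold. The default of 0.8 is an illustrative assumption.
flag_high_vaf_hets() {
  local min_vaf="${1:-0.8}"
  awk -F'\t' -v min_vaf="$min_vaf" '
    /^#/ { next }                      # skip header lines
    {
      # Locate GT and VAF by name in the FORMAT column (field 9)
      n = split($9, keys, ":")
      split($10, vals, ":")
      gt = ""; vaf = ""
      for (i = 1; i <= n; i++) {
        if (keys[i] == "GT")  gt  = vals[i]
        if (keys[i] == "VAF") vaf = vals[i]
      }
      # Only single-ALT heterozygous genotypes are considered here
      is_het = (gt == "0/1" || gt == "0|1" || gt == "1|0")
      if (is_het && vaf != "" && vaf + 0 >= min_vaf)
        printf "%s\t%s\t%s\t%s\t%s\t%s\n", $1, $2, $4, $5, gt, vaf
    }'
}

# Example usage with the VCF produced by the command in this report:
#   gzip -dc output.vcf.gz | flag_high_vaf_hets 0.8
```

Each output line contains CHROM, POS, REF, ALT, GT, and VAF, so the flagged positions can be compared against the alignments in a genome browser.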
Setup
- Operating system: Rocky Linux 8.6 (Green Obsidian)
- DeepVariant version: google/deepvariant:1.8.0
- Installation method (Docker, built from source, etc.): Docker
- Type of data: (sequencing instrument, reference genome, anything special that is unlike the case studies?)
Oxford Nanopore R10.4 flow cell, PromethION, adaptive sampling, ref: GRCh38_GIABv3_no_alt_analysis_set_maskedGRC_decoys_MAP2K3_KMT2C_KCNJ18.fasta
Steps to reproduce:
Executed with Nextflow (v.22.04.5.5708)
↓ Read filtering with Chopper (v.0.11.0) (QV ≥ 10, length ≥ 1000, crop 20)
↓ Mapping with minimap2 (v.2.28) (-ax map-ont)
↓ MAPQ filtering (≥ 10) with samtools (v.1.19)
↓ Variant calling with DeepVariant (v.1.8.0)
- Command:
run_deepvariant \
--model_type=ONT_R104 --disable_small_model=true \
--ref=${refDir}/GRCh38_GIABv3_no_alt_analysis_set_maskedGRC_decoys_MAP2K3_KMT2C_KCNJ18.fasta \
--reads=${bam} --regions=${bed} \
--call_variants_extra_args="allow_empty_examples=true" \
--output_vcf=output.vcf.gz --output_gvcf=output.g.vcf.gz \
--sample_name=${sample_id}_sample --num_shards=3 \
--vcf_stats_report=true --runtime_report=true --logging_dir=logs \
--intermediate_results_dir=intermediate_results
- Error trace: (if applicable)