Channel: vcf — GATK-Forum
Analysis Pipeline Discrepancy in SNP Calling and Coverage

Hi, All,

I am new to GATK, so please bear with me. I have developed a Unix script to analyze the FASTQ sequencing output for a novel targeting technique. I am targeting only 27 SNPs with a small amplicon size, and the coverage is much higher than with traditional sequencing methods. I want to report the genotype and coverage at each location (including the homozygous-reference sites).

One major issue: at a given SNP, IGV shows approximately 20,000X coverage with perfectly paired (paired-end) reads at MQ 60, but the VCF produced by my analysis pipeline reports 10X. I cannot figure out what I am doing wrong! Also, at other SNP sites, the VCF reports a balanced allele depth (AD), which is correct for a heterozygous genotype, but the genotype call (GT) is 0/0 (homozygous reference).

Below is the general script, a screenshot of IGV for the SNP with 20,000X coverage, and a screenshot of IGV for the SNP that is reported as homozygous reference when it is truly heterozygous. Please help!

Thank you all! You are great!
Rachel



#!/bin/bash

fastqDir='DirectorywithfastqR1andR2'
refGenome='referencegenome.fa'

# keep the glob outside the quotes so the shell expands it to one R1 file per iteration
for i in "$fastqDir"/*R1*.fastq

do
sample=$(basename "$i" .fastq)
file2="$fastqDir/${sample/_R1_/_R2_}.fastq"

# look for a read-pair
if [ ! -f "$file2" ] # none detected
then
file2="" # fall back to single-end alignment
fi

# ${file2:+"$file2"} expands to nothing when file2 is empty; outputs are named per sample so they are not overwritten
bwa mem -R "@RG\tID:$sample\tSM:$sample\tPL:ILLUMINA\tLB:$sample" -t 10 "$refGenome" "$i" ${file2:+"$file2"} | samtools view -bSh - | samtools sort -m 10G -o "${sample}.bam" -T Temp -
samtools index "${sample}.bam"

gatk HaplotypeCaller --arguments_file gatkArgumentsFile.txt --reference "$refGenome" --input "${sample}.bam" --output "${sample}.vcf" --intervals SNPCoordinates.bed --emit-ref-confidence BP_RESOLUTION

#gatk HaplotypeCaller --arguments_file gatkArgumentsFile.txt --input "${sample}.bam" --output "${sample}.2.vcf" --intervals SNPCoordinates.bed --disable-tool-default-read-filters true --emit-ref-confidence BP_RESOLUTION
#bcftools mpileup -q 5 -d 9999999 -f reference.fasta "${sample}.bam" | bcftools call -g 10 -a FORMAT/DP -f GQ,GP -m -T SNPCoordinates.bed -o "${sample}.vcf"

done
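
To compare what the VCF reports against what IGV shows, it may help to list GT, DP and AD side by side for every targeted site. The following is only a minimal sketch, not part of the pipeline above: the function name `vcf_site_summary` is hypothetical, and it assumes an uncompressed, single-sample VCF whose FORMAT column carries GT, DP and AD (missing keys are printed as `.`).

```shell
#!/bin/bash
# Hypothetical helper: print CHROM, POS, GT, DP, AD for each record
# of an uncompressed single-sample VCF (assumption: sample is column 10).
vcf_site_summary() {
    awk -F'\t' '
        /^#/ { next }                      # skip meta and header lines
        {
            n = split($9, keys, ":")       # FORMAT keys, e.g. GT:AD:DP
            split($10, vals, ":")          # matching values for the sample
            gt = "."; dp = "."; ad = "."
            # look each key up by name, since FORMAT order can vary
            for (k = 1; k <= n; k++) {
                if (keys[k] == "GT") gt = vals[k]
                if (keys[k] == "DP") dp = vals[k]
                if (keys[k] == "AD") ad = vals[k]
            }
            print $1, $2, gt, dp, ad
        }
    ' OFS='\t' "$1"
}
```

Calling `vcf_site_summary sample.vcf` prints one tab-separated line per site, which makes it easier to spot sites where DP is far below the IGV depth or where AD looks balanced but GT is 0/0.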
