Best way to get additional annotations?
Broad recommends using snpEff to add annotations to VCF files created by GATK. This gives annotations about the effect of a given variant: is it in a coding region? Does it cause a frameshift? What...
Force CombineVariants to treat all vcfs as single sample
I am trying to merge two vcfs (SNVs and INDELs) from the same sample. The problem appears to be that the INDEL vcf defines "combined_sample_name" but the SNV vcf does not. So when I merge I get two...
Different annotations for same co-ordinates in vcf
Please look at lines 1 and 2 taken from a vcf file, which have the same chromosome and position. One of the ALT alleles is the same in both lines, but they have different allele counts and different rsIDs. 1 1229111...
Benefits of running UnifiedGenotyper on multiple samples at the same time
The best practice guide states to call variants across all samples simultaneously. Besides the ease of working with one multi-sample VCF, what advantages are there to calling the variants at the same...
Using the GATK API to annotate my VCF
I just quickly wrote a set of tools to annotate my VCFs ( http://plindenbaum.blogspot.fr/2013/02/4-tools-i-wrote-today-to-annotate-vcf.html ). For example, one of those tools uses a BED/XML file indexed...
Disagreement between HaplotypeCaller, VariantAnnotator, and ValidateVariants...
I ran HaplotypeCaller, VariantAnnotator, and ValidateVariants on chr3 locations from a human tumor sample. The HaplotypeCaller command line is:...
Can SelectVariants be used to limit VCF files by interval list
Hello, I would like to subset a VCF file to save only a few specific regions of the whole genome. I know some of your tools allow an interval list to be used to subset the region analyzed. Do you...
Depth Reporting in DP and AD changes when VariantAnnotator run
Hello, I am trying to filter some of my high-coverage samples based on a minimum depth and have found that the value stored in the DP INFO field and the AD genotype tag changes depending on whether or...
GATK: basic VCF indel output question
Hi, I'm having problems understanding a GATK output VCF. I have read the VCF standard, but I'm obviously missing something. I /think/ I understand how SNPs and short indels are represented, but clearly...
Variant Annotator of a multisample vcf, how to set the bam files in the args
Hi, I have a vcf containing multiple samples. I would like to also provide the bam files as input for the Variant Annotator, but how does the Variant Annotator know which bam is for which column in the vcf?...
wrong QD value in a vcf file
Hello, Here is part of a vcf generated by GATK Unified Genotyper : chr4 106196323 . TCAGA T 32729.73 LowQD...
Filtering VCF files
I have used the UnifiedGenotyper to call variants on a set of ~2400 genes (TruSeq Illumina data) from 28 different samples mapped against a preliminary draft genome. I do not have a defined set of SNPs...
VCF contrasts - workflow
I am just starting with GATK, but even though I have looked and looked, I cannot find a simple walkthrough of having many VCFs and running a range of contrasts based on sample data. I guess this must be...
UnifiedGenotyper doesn't generate 1 vcf per sample when bams from multiple...
We are running tests trying to get UG to produce 1 vcf per sample when inputting bams from multiple subjects. Our situation is complicated slightly by the fact that each sample has 3 bams. When we...
INFO column in Mills dataset
Hi all, I have been looking for documentation of the INFO column in the VCF of the Mills indels in the GATK resource bundle (Mills_and_1000G_gold_standard.indels.b37.sites.vcf.gz), but to no avail....
UnifiedGenotyper failure
I have twice run UnifiedGenotyper and the resultant .vcf file contains only part of chromosome 20. I do not see what I am doing wrong. Neither do the other two people in the lab who have extensive...
VariantAnnotator not annotating genotype columns
Hi, Could you tell me how to encourage GATK to annotate my genotype columns (i.e. add annotations to the FORMAT and PANC_R columns in the following file): #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT...
DP and chromosome filtering
Hi team, This is two separate questions: Starting with a vcf file, plotting the depth (DP) distribution gives a nice, slightly asymmetrical bell-shaped curve. Given that SNPs with very high and very...
VCF Writer creating stale index
I've noticed on some occasions that the .vcf.ind file that is created alongside the vcf is older than the vcf itself (not by much, a second or so). I've seen this happening in small (highly scattered)...
False positives in variant calls?
Hello all, We've just started using GATK in order to perform variant calling in a non-model teleost fish. The fish genome is highly repetitive (>65%), and also suffers from the whole genome...