Recommended protocol for bootstrapping HaplotypeCaller and BaseRecalibrator...
I am identifying new sequence variants/genotypes from RNA-Seq data. The species I am working with is not well studied, and there are no available datasets of reliable SNP and INDEL variants. For...
Missing sites in vcf file
Hi, I used GenotypeGVCFs with 3 input gvcf files (3 individuals) to create a vcf file, and this seems to work, but when I examine the sites in the final vcf file, there are sites that are missing. I am...
CombineVariants incorrectly(?) complains about badly formed variant when...
Hi GATK team, I am attempting to combine a HaplotypeCaller generated VCF with some indels called using pindel using the following arguments (GATK v3.3-0-g37228af): -R...
Workflow for fungal WGS
Hi all, I'm in a bit of a daze going through all the documentation and I wanted to do a sanity check on my workflow with the experts. I have ~120 WGS of a ~24Mb fungal pathogen. The end-product of my...
What is the best practice for calling/combining variants across multiple...
Hi, I am working with RNA-Seq data from 6 different samples. Part of my research is to identify novel polymorphisms. I have generated a filtered vcf file for each sample. I would like to now combine...
VariantRecalibrator - "N" reference allele only in .recal files
Hi, I ran VariantRecalibrator and ApplyRecalibration, and everything seems to have worked fine. I just have one question: if there are no reference alleles besides "N" in my recalibrate_SNP.recal and...
Long REF but not an INDEL?
I'm getting the following output from GATK:
ChrSy 198904 . C . 44.99 LowQual AN=2;DP=8;MQ=39.35;MQ0=0 GT:DP 0/0:8
ChrSy 198904 . CGTCCGATATTTGCGAAATATCG . Infinity . DP=8;MQ=39.35;MQ0=0 GT ./.
ChrSy 464065 . A ....
Use CombineVariants with chromosome-specific vcf files
Hi, I have generated vcf files using GenotypeGVCFs; each file contains variants corresponding to a different chromosome. I would like to use VQSR to perform the recalibration on all these data combined...
Overlapping positions
Hi I'm wondering why some positions in VCF overlap. Can GATK skip emitting positions that are already part of an indel/concatenated ref? We need to use EMIT_ALL_SITES but it's quite confusing if a...
Position of Indel event based on the REF
Hi, how can I tell from the REF and ALT columns of a VCF file the actual position where an indel event happened? I usually see an event that occurs at the 2nd base. For example, REF ALT GGCGTGGCGT...
How to add TI into INFO field in vcf
Dear all, I need to generate a vcf file with GATK, and I need to have TI (transcript information) in the INFO field and VF and GQX in the FORMAT field. Could you please help me with the arguments? My arguments are to...
Inconsistent DP within a site in vcf
Hi, I'm running UnifiedGenotyper (Version=3.3-0-g37228af) on a pooled sample. The ploidy is set to 32. I'm trying to get allele frequency information. I'm trying to filter sites based on depth...
CalculateGenotypePosteriors - supporting file
Hi, I used CalculateGenotypePosteriors with the supporting file called ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf, obtained from 1000 Genomes. It contains both indels and SNPs,...
VariantsToTable for multi-sample VCF's?
Hi- Will VariantsToTable work for multi-sample VCF files? I've tried the following command, but it only outputs the column headers: java -jar ~/tools/GenomeAnalysisTK.jar -R chr1.fa -T VariantsToTable...
Unusual calls after using HaplotypeCaller - filtered with VQSR and refinement...
Hi, I have discovered some unusual calls in my VCF file after using HaplotypeCaller. I am using version 3.3 of GATK. I applied VQSR as well as the genotype refinement workflow...
What do sites labeled "." in FILTER field mean (after running VQSR)?
Hi, After using VQSR, I have a vcf output that contains sites labeled "." in the FILTER field. When I look at the vcf documentation (1000 genomes), it says that those are sites where filters have not...
MSA FASTA > VCF
Hello GATK Team, Is there a tool within GATK that takes a multiple sequence alignment in FASTA format and converts to VCF? If not, could anyone point me to a tool that could do this task? Many thanks,...
SnpEff implemented in GATK?
Hi, I have used several tools from the GATK and now I am wondering what my next step should be. It would be great if you could give me some help. I had raw reads coming from a metagenomic...
Errors about VCF files that are not properly sorted
This is not as common as the "wrong reference build" problem, but it still pops up every now and then: a collaborator gives you a VCF that's derived from the correct reference, but for whatever reason...
UnifiedGenotyper doesn't generate 1 vcf per sample when bams from multiple...
We are running tests trying to get UG to produce 1 vcf per sample when inputting bams from multiple subjects. Our situation is complicated slightly by the fact that each sample has 3 bams. When we...