Channel: vcf — GATK-Forum

Generating a vcf with the information of specific genome positions (hotspots)

Hello, I'm developing a pipeline that needs to take into account the information about variants that are present on a list of hotspots on the genome, because my final analysis uses the information...

Merging population vcf files without gvcf

Hi everyone, I have two separate raw VCF datasets processed with GATK version 3.5 (one from a population of ~2600 and one from a population of ~160). Since the upstream data cleaning and processing...

HaplotypeCaller Incompatible Contigs DNASeq

I'm using GATK 4.0.11 and I'm getting the following error message when I run HaplotypeCaller on DNAseq data: 10:19:17.089 INFO HaplotypeCaller -...

Mutect on mm10

Hello, I am trying to run mutect on mouse, and getting the following error ERROR MESSAGE: Unable to parse header with error: Your input file has a malformed header: VCFv4.2 is not a supported version,...

Analysis Pipeline Discrepancy in SNP Calling and Coverage

Hi all, I am new to GATK, so please bear with me... Essentially, I have developed a Unix script to analyze the FASTQ sequencing output for a novel targeting technique. I am only targeting 27 SNPs...

Removing "chr" from CHROM field

Hello! I intend to use a training resource VCF that contains "chr" in the CHROM field (reference obtained from UCSC), which is incompatible with my raw call set (reference obtained from Ensembl). I check...
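
The renaming this question asks about could be sketched in plain Python as follows. This is only a minimal illustration, not a GATK feature: the function name `strip_chr_prefix` is hypothetical, and it assumes the two references differ solely by the "chr" prefix. Contigs that are named differently between sources (for example, the mitochondrial contig) would need an explicit mapping instead.

```python
import re


def strip_chr_prefix(lines):
    """Yield VCF lines with a leading 'chr' removed from contig names.

    Hypothetical helper: assumes the only naming difference between the
    two references is the 'chr' prefix.
    """
    for line in lines:
        if line.startswith("##contig="):
            # Header: ##contig=<ID=chr1,...> becomes ##contig=<ID=1,...>
            yield re.sub(r"(<ID=)chr", r"\1", line, count=1)
        elif line.startswith("#"):
            # All other header lines pass through unchanged.
            yield line
        else:
            # Data record: CHROM is the first tab-separated column.
            chrom, sep, rest = line.partition("\t")
            if chrom.startswith("chr"):
                chrom = chrom[3:]
            yield chrom + sep + rest
```

The generator can be fed an open file handle and its output written line by line to a new file. A compressed and indexed VCF would have to be re-compressed and re-indexed afterwards.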

SelectVariants Starts Traversal but Does not Progress, High CPU Usage

Hi, I am using the GATK tool SelectVariants to only select variants that have passed FilterMutectCalls. Both FilterMutectCalls and Mutect2 were run in multi-sample mode, so the VCF being input to...

GATK v4.1.0.0 ValidateVariants, gVCF mode, error; none in v4.0.11.0

GATK v4.0.11.0 & v4.1.0.0, Linux server, bash. Hi, I was running the following code: ${GATK4} --java-options '-Xmx10g -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -XX:ConcGCThreads=1...

VCF generation for somatic SNVs without "Normal Sample"

Hi there, I am trying to follow the GATK tutorial for somatic mutations for GATK4, but my data does not quite match what the example is doing. Article name: (How to) Call somatic mutations using GATK4...

How to merge the sample_X_genotyped_intervals.vcf files created by...

How to merge the sample_X_genotyped_intervals.vcf files created by PostprocessGermlineCNVCalls to a multi-sample VCF file? The files all have the same bins/records, so it should be easy to create a...

GATK 4.1.0.0, HaplotypeCaller VCF has 0/1 and 0|1 genotypes. How to distinguish...

In the previous version (GATK-4.0.3.0), there were only 0/1 genotypes, no 0|1. In the latest version (GATK-4.1.0.0), there were 0/1 and 0|1 genotypes. What is the difference between "/" and "|", why is...
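
For reference, the VCF specification uses "/" to separate unphased alleles and "|" to separate phased alleles in the GT field. A minimal sketch of telling the two apart in Python follows. The function name `parse_gt` is hypothetical, and the sketch says nothing about why the two GATK versions differ in what they emit.

```python
def parse_gt(gt):
    """Split a VCF GT string into allele indices and a phased flag.

    Hypothetical helper: '|' marks a phased genotype, '/' an unphased one.
    Missing alleles ('.') are returned as None.
    """
    phased = "|" in gt
    sep = "|" if phased else "/"
    alleles = [None if a == "." else int(a) for a in gt.split(sep)]
    return alleles, phased
```

Both `0/1` and `0|1` describe a heterozygous call with one reference and one alternate allele. The phased form additionally records which haplotype carries which allele.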

Asterisk in some lines of my VCF file

Dear all, I ran HaplotypeCaller in order to find germline variants in my samples (808 samples). But in the ALT column I found "*" in some lines, and I don't know what it means.... (I follow...

Oncotator for build hg38

The current version of Oncotator on the Broad servers is v1.9.0.0 and indicates that only hg19 is supported. Are there plans to extend the tool to hg38? If not, are there other recommended tools to...

Annotation problem: not all variants are taken into account

Hello, I use GATK version 4.1 to annotate a VCF with the following command: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true...

ERROR MESSAGE: Your input file has a malformed header

Hello, I want to annotate my file through gatk 3.8 but I get this error: MESSAGE: Your input file has a malformed header: there are not enough columns present in the header line: #CHROM POS ID REF ALT...

Unable to merge gvcf files

I used the following command to merge gVCF files with GATK: java -jar /opt/apps/gatk/3.7/GenomeAnalysisTK.jar -T CombineGVCFs -R...

Mapping a locus to a chromosome in a genome

Hello, I am currently working on maize mutant and wild-type data. I have the DNA-Seq data for these samples and I am currently using the GATK pipeline to analyze the data. I am trying to map the...

Transform VCF file changing only variant ids

Hello, I want to read a VCF file and write out another VCF file that is equivalent to the first one except that variant ids have been changed. How would I do that? A VCFFileReader gives me an iterator...

Correct GATK4 tools to use for combining scattered gVCFs and VCFs from...

I am running GATK on non-human data and am trying to follow the best practices as much as possible. I've now hit two separate roadblocks, both addressing similar issues: 1) Combining scattered gVCFs...

How to set GVCF genotypes to ./. based on the GQ score

Hi, I have a reasonably large non-human multi-VCF dataset containing ~280 samples and ~70M variants. I want to filter low quality genotype calls (but not variants as a whole). This does not seem to be...
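
The per-genotype filtering described here could be sketched in plain Python as below. This is an illustration under assumptions, not a GATK command: the function name `mask_low_gq` and the default threshold of 20 are hypothetical, the no-call is written as a diploid `./.`, and other FORMAT values (and INFO fields such as AC/AN) are left untouched rather than recomputed.

```python
def mask_low_gq(record, min_gq=20):
    """Return a VCF record with GT set to './.' where GQ < min_gq.

    Hypothetical helper: min_gq is an arbitrary example threshold, and
    the no-call assumes diploid genotypes.
    """
    if record.startswith("#"):
        return record  # header lines pass through unchanged
    fields = record.rstrip("\n").split("\t")
    if len(fields) < 10:
        return record  # no FORMAT/sample columns present
    keys = fields[8].split(":")
    if "GT" not in keys or "GQ" not in keys:
        return record
    gt_i, gq_i = keys.index("GT"), keys.index("GQ")
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        if len(values) <= gq_i:
            continue  # trailing fields were omitted for this sample
        gq = values[gq_i]
        if gq != "." and int(gq) < min_gq:
            values[gt_i] = "./."
            fields[i] = ":".join(values)
    return "\t".join(fields) + "\n"
```

Applied line by line, this streams through a multi-sample VCF without loading it into memory, which matters for a dataset of the size mentioned above.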

Spurious insertions being called by HaplotypeCaller?

I just called variants from the same bwa-generated bam file using 1) samtools/bcftools and 2) HaplotypeCaller. Downstream analysis indicated that a particular locus looked interesting, but only in the...

Graphical (GUI) and interactive exploration tool for large genotype matrixes...

Dear GATK development team and GATK users, What is currently the best visual(GUI) and interactive genotype matrix exploration tool (a browser) for large genotype matrixes, say the 1000 human genomes...

GATK4: Can't get the right CalculateContamination result

Question regarding CalculateContamination (GATK/4.1.2.0): With CalculateContamination in tumor matched mode, I get: contamination error NaN 1.0 When I look at the tumor.table and normal.table files...

Use mutect2 or UnifiedGenotyper?

Hi, I am currently working on finding the somatic mutations in a tumor (using single-cell sequencing results). My final goal is to use these somatic mutations from different regions to build...

AD allele depth interpretation

Hello, I have a query on the interpretation of the AD variable in a vcf generated by calling about 800 samples together. The header defines it as: ##FORMAT= and the forum further elaborates: AD is the...

Variant filtration by allelic balance bias

Hi everyone, I'm trying to find a way to filter some heterozygous genotypes that might have been misassigned due to PCR or sequencing errors and result in a very unrealistic allelic balance bias like...

SelectVariants - java.lang.IllegalStateException: Allele in genotype not in...

Hi everyone, I'm trying to select variants with SelectVariants but for some reason it stops, saying that Allele in genotype CT* not in the variant context [CT*, C]. I tried to find a CT* in the VCF...

Understanding HaplotypeCaller output VCF format

Hi there, I am using GATK version 4.1.0.0 on germline paired-end Illumina WGS data with the following command: ``` gatk4.1.0.0 --java-options '-Xmx5G' HaplotypeCaller -R...

Problem with a BED file and the flag -alleles (HaplotypeCaller)

I'm trying to pass the flag -gt_mode GENOTYPE_GIVEN_ALLELES by giving a list of SNPs in a BED file. This BED file has 4 columns (chromosome, initial position, final position, allele, allele). However...

SelectVariants by sample names file

I need to subset a list of samples from a large vcf.gz file. The sample names were saved in a plain text file, one name per line. I used -RF -sf my.sample.names.txt but kept getting an error. Any...

GATK4: RMSMappingQuality results differ between v4.0.0.0 and v4.1.1.0

Good morning everybody, and thanks in advance for your advice and your help. I checked for this problem before submitting this question. I hope this is not a duplicate. We are working with whole genome...

Why are the variant callers' (GATK 3.8 and GATK 4.0) results different?

Hello, I am a beginner. I used two different tools to analyze my data but got two different results. Why?

GenomicsDBImport too slow!!!

Dear all, I'm running GenomicsDBImport for 30 samples. It takes so much time, and after 3 days the job was killed for exceeding the walltime limit. I want to ask if there is a way to make it faster. I...

The bamout file results are inconsistent with the VCF file results

Hi, I ran an analysis with GATK4 and found that, at one site, the "bamout" file shows multiple mutations: a G-to-T mutation supported by 14 reads, and from G...

HaplotypeCaller output modes EMIT_ALL_CONFIDENT_SITES and EMIT_ALL_SITES not...

Dear GATK-Team, First of all, thank you for your great support and constant development of GATK! I was very pleased to see that the output mode options EMIT_ALL_CONFIDENT_SITES and EMIT_ALL_SITES were...

Using GATK SelectVariants to filter based on calculated allele frequency

Many of the variant callers I use, such as Pindel, do not include the AF or allele frequency value in the vcf output. However I still need to filter the vcf based on the allele frequencies of the Tumor...

How to select samples that are polymorphic on a specific locus from a joint...

Hi, I am trying to select the samples that are polymorphic on a specific locus from a joint genotyped vcf file using SelectVariants tool and JEXL expressions with no success. The command I am trying to...

Do not understand the value in VCF file

Hi, I am not sure if it is right to ask the question here. I got the VCF file and need some help to understand the meaning. In the last two columns, there are some rows like: GT:CNADJ 0|1:2, I know 0|1...

How to get a smaller list of de novo SNPs between 3 genotypes

Hello. I am currently working on a maize whole-genome dataset and I have 3 samples: WT, MT and B73. I obtained the VCF files for all 3 datasets using HaplotypeCaller. However, the list of SNPs that...

ASEReadCounter not accepting VCF file as input

I'm trying to run ASEReadCounter, but it's not accepting a VCF file as input. I'm getting the following error: ##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command...

Mutect2 - java.lang.IllegalArgumentException: Cannot construct fragment from...

Hi, trying the latest version of Mutect2 4.1.4.0 java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false...

Non-reference allele wasn't called into VCF by HC

Hi GATK team, I ran HC joint calling and found out that some non-reference alleles didn't appear in the VCF. Here is what these sites look like: Most of these alleles have VAF = 1 and reside on the...

How to diagnose missing MQRankSum annotations (when BaseQRankSum is available)

We wish to discover short variants in a cohort of 60 plant whole-genome-samples. We're blocked on VariantRecalibrator. We have a VCF truth set (aka resource) of SNPs which has been computed beforehand...

Extracting MQ and QUAL values for invariant sites in VCF files

I'm having problems getting mapping quality (MQ) values and PHRED called site quality scores (QUAL) for invariant sites in the VCF files generated by GATK, even when I specify that all sites should be...

How to identify duplicated genes in VCF file obtained after GATK pipeline?

I am working to find which gene type is more duplicated. I have mapped and annotated my VCF file with the GATK pipeline. Please guide me on how to proceed now.

GT and AD

If my VCF indicates the GT is 1/1 and the AD=14,4: what does the 14,4 indicate? 14 reads of the ALT and 4 that were not? Or something else? Thank you.
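
As background for this question: in the VCF specification, AD holds one read depth per allele, with the REF allele first and the ALT alleles following in the order they appear in the ALT column. A minimal sketch of pairing the values with their alleles is shown below. The function name `allele_depths` and the example alleles A and G are hypothetical, since the post does not state them.

```python
def allele_depths(ref, alts, ad):
    """Map each allele to its depth from a comma-separated AD string.

    Hypothetical helper: AD is ordered REF first, then each ALT allele.
    """
    alleles = [ref] + list(alts)
    depths = [int(d) for d in ad.split(",")]
    return dict(zip(alleles, depths))


# With made-up alleles REF=A and ALT=G, AD=14,4 reads as
# 14 reads supporting A (REF) and 4 reads supporting G (ALT).
example = allele_depths("A", ["G"], "14,4")
```

Under that ordering, the first number (14) belongs to the reference allele, not the alternate.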

Where can I find dbsnp_144.hg38.vcf.gz

I'm installing an application that uses files from the GATK resource bundle. I found all of the needed files at...

Are there issues with using reads coming from different technologies and...

Hello! We are analyzing WGS data from 60 samples (6 groups, 10 samples/group) produced by HiSeq4000. The mean coverage per sample is 25x (lowest sample is 15x). Now we realized we need to sequence more...

How to add sample names in VCF?

I am using GATK best practices for germline SNPs and indels, version 4.1.2.0. After mapping and recalibration, I run HaplotypeCaller in GVCF mode. I am combining all VCF files (output from HaplotypeCaller)...

Missing PS field in the VCF file produced by GenotypeGVCFs

Hello, I followed GATK best practices to produce a VCF file for 20 individuals. GATK version is 4.1.0.0. The BAM files were all verified by ValidateSamFile, no errors or warnings were detected....
