Channel: vcf — GATK-Forum

Some questions about VQSR.


I have three questions about VQSR:
1. I have two VCF files from two different sources, each containing several samples. The sequencing process (library prep, sequencing kit, etc.) was essentially the same for both data sets, but the VQSLOD cutoffs for the tranches differ between them (e.g. Tranche99.00to99.90 is VQSLOD -100 <= x < -0.8 for file1 and -7 <= x < -1.5 for file2). My guess is that they used different training data, but I'm not sure that is correct. In other words, if VQSR were run in exactly the same way on both (same training set and parameters), would the two data sets get the same VQSLOD cutoffs for each tranche?
2. Does the testing data (i.e. the variant calls from a given experiment) affect the VQSR results? For example, if I run VQSR again on one data set after filtering it down to the “PASS” variant loci, using the same training variants, will those loci get a different VQSR evaluation than they did the first time?
3. I read that indels and SNPs should be evaluated separately by VQSR, and some VCF files do have TrancheINDEL99.00to99.90, TrancheSNP99.00to99.90, etc. However, other VCF files only have Tranche99.00to99.90, etc., with no indication of indel or SNP. Does this mean that indels and SNPs were evaluated together? Is that a good thing?
Sorry for so many questions. I have only just learned about VQSR and am still confused about some points. Thanks in advance!
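
For reference, here is a minimal sketch of how the tranche cutoffs mentioned in question 1 could be pulled out of each file's header for a side-by-side comparison. It assumes the cutoffs appear in `##FILTER` header lines whose description ends with a range of the form `lo <= x < hi`. The sample header lines and the function name `tranche_cutoffs` are made up for illustration, using the numbers quoted above, and the exact description wording in a real file may differ.

```python
import re

# Assumed header shape (hypothetical, for illustration only):
#   ##FILTER=<ID=Tranche99.00to99.90,Description="... -100 <= x < -0.8">
TRANCHE_RE = re.compile(
    r'^##FILTER=<ID=(?P<id>[^,>]*Tranche[^,>]*),'
    r'Description="[^"]*?'
    r'(?P<lo>-?\d+(?:\.\d+)?)\s*<=\s*x\s*<\s*(?P<hi>-?\d+(?:\.\d+)?)\s*"'
)


def tranche_cutoffs(header_lines):
    """Map each tranche filter ID to its (lower, upper) cutoff pair."""
    cutoffs = {}
    for line in header_lines:
        match = TRANCHE_RE.match(line.strip())
        if match:
            cutoffs[match.group("id")] = (
                float(match.group("lo")),
                float(match.group("hi")),
            )
    return cutoffs


# Made-up header lines that reuse the cutoffs quoted in question 1.
file1_header = [
    "##fileformat=VCFv4.1",
    '##FILTER=<ID=Tranche99.00to99.90,'
    'Description="Tranche level at VQS Lod: -100 <= x < -0.8">',
]
file2_header = [
    "##fileformat=VCFv4.1",
    '##FILTER=<ID=Tranche99.00to99.90,'
    'Description="Tranche level at VQS Lod: -7 <= x < -1.5">',
]

for name, header in (("file1", file1_header), ("file2", file2_header)):
    for tranche_id, (lo, hi) in tranche_cutoffs(header).items():
        print(f"{name}\t{tranche_id}\t{lo} <= x < {hi}")
```

With real files, the header lines would be read from the top of each VCF (every line starting with `##`) instead of the hard-coded lists.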

