I have a long-standing, stable pipeline that I need to tweak, and any change has to be explained and justified.
What is the difference between the old file of known indels (from the Mills paper(s)) - Homo_sapiens_assembly19.known_indels.vcf
and the current version of known indels with 1000G added - Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.gz?
I know the newer file is smaller since it contains just sites. It also has significantly fewer records (by line count). What else was removed or added between the old version and the current one?
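
To make the question concrete, here is a minimal sketch of how the two files could be compared site by site. The function names (open_vcf, read_sites, compare_sites) are hypothetical, and it rests on two assumptions: that the hg19 file names contigs "chr1" while the assembly19 file names them "1" (so a leading "chr" is stripped), and that an identical indel is written with the same POS/REF/ALT in both files. No left-alignment or other normalization is done, so a site represented differently in the two files would be counted as different.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_sites(path):
    """Return the set of (chrom, pos, ref, alt) keys found in a VCF.

    Assumption: a leading "chr" is stripped so that "chr1" and "1" compare
    equal. Mitochondrial naming (chrM vs MT) is NOT reconciled here.
    """
    sites = set()
    with open_vcf(path) as handle:
        for line in handle:
            # Skip meta-information lines, the column header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\r\n").split("\t")[:5]
            if chrom.startswith("chr"):
                chrom = chrom[3:]
            # Multi-allelic records contribute one key per ALT allele.
            for allele in alt.split(","):
                sites.add((chrom, int(pos), ref, allele))
    return sites


def compare_sites(old_path, new_path):
    """Count sites shared by both files and sites unique to each."""
    old_sites = read_sites(old_path)
    new_sites = read_sites(new_path)
    return {
        "shared": len(old_sites & new_sites),
        "only_old": len(old_sites - new_sites),
        "only_new": len(new_sites - old_sites),
    }
```

Calling compare_sites with the paths to the two files above would give the overlap and the counts unique to each, but it would not tell me why records were dropped or added, which is what I am really asking.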
Thanks in advance.