liftOverVCF.pl
Introduction
This script converts a VCF file from one reference build to another. It runs the three steps needed to lift over a VCF, in this order:
1. The LiftoverVariants walker, to lift the records over to the new reference build
2. sortByRef.pl, to sort the lifted-over file
3. A filtering step, to remove records whose REF field no longer matches the new reference
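Steps 2 and 3 can be illustrated with a small in-memory sketch. This is not the code of sortByRef.pl or of the toolkit's filtering step. The record layout (tuples of contig, 1-based position, REF allele), the function names, and the toy reference sequences are all assumptions made for illustration.

```python
# Illustrative sketch only: mimics the idea behind steps 2 and 3 on in-memory data.
# Records are hypothetical (contig, 1-based position, ref_allele) tuples.


def sort_by_reference(records, contig_order):
    """Sort records by the contig order of the new reference, then by position."""
    rank = {contig: i for i, contig in enumerate(contig_order)}
    return sorted(records, key=lambda r: (rank[r[0]], r[1]))


def ref_matches(record, reference):
    """Return True if the record's REF allele equals the reference bases at its position."""
    contig, pos, ref_allele = record
    start = pos - 1  # VCF positions are 1-based
    bases = reference[contig][start:start + len(ref_allele)]
    return bases.upper() == ref_allele.upper()


def filter_mismatched_ref(records, reference):
    """Keep only the records whose REF allele still matches the new reference."""
    return [r for r in records if ref_matches(r, reference)]


# Toy "new reference": contig name -> sequence (assumed data, not a real genome).
new_reference = {"1": "ACGTACGT", "2": "TTGCA"}
contig_order = ["1", "2"]

# Lifted-over records in arbitrary order; ("1", 2, "T") no longer matches the reference.
lifted = [("2", 3, "G"), ("1", 5, "A"), ("1", 2, "T")]

sorted_records = sort_by_reference(lifted, contig_order)
kept = filter_mismatched_ref(sorted_records, new_reference)
print(kept)
```

Sorting before filtering mirrors the order of the steps listed above. The real script works on VCF files and FASTA references rather than on Python lists and dictionaries.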
Obtaining the Script
The liftOverVCF.pl script is available in our public source repository under the 'perl' directory. Instructions for pulling down our source are available here.
Example
./liftOverVCF.pl -vcf calls.b36.vcf \
    -chain b36ToHg19.broad.over.chain \
    -out calls.hg19.vcf \
    -gatk /humgen/gsa-scr1/ebanks/Sting_dev \
    -newRef /seq/references/Homo_sapiens_assembly19/v0/Homo_sapiens_assembly19 \
    -oldRef /humgen/1kg/reference/human_b36_both \
    -tmp /broad/shptmp
If -tmp is omitted, it defaults to /tmp.
Usage
Running the script with no arguments will show the usage:
Usage: liftOverVCF.pl
    -vcf <input vcf>
    -gatk <path to gatk trunk>
    -chain <chain file>
    -newRef <path to new reference prefix; we will need newRef.dict, .fasta, and .fasta.fai>
    -oldRef <path to old reference prefix; we will need oldRef.fasta>
    -out <output vcf>
    -tmp <temp file location; defaults to /tmp>
- The 'tmp' argument is optional. It specifies the location to write the temporary file from step 1 of the process.
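The usage message implies that four files must exist before the script can run: newRef.dict, newRef.fasta, newRef.fasta.fai, and oldRef.fasta. A pre-flight check along those lines might look like the sketch below. The function names are hypothetical and are not part of liftOverVCF.pl; the prefixes in the usage lines are the ones from the example above.

```python
import os


def required_reference_files(new_ref, old_ref):
    """List the files the usage message says are needed for the given prefixes."""
    return [
        new_ref + ".dict",
        new_ref + ".fasta",
        new_ref + ".fasta.fai",
        old_ref + ".fasta",
    ]


def missing_reference_files(new_ref, old_ref):
    """Return the required files that do not exist on disk."""
    return [p for p in required_reference_files(new_ref, old_ref)
            if not os.path.isfile(p)]


# Prefixes taken from the example invocation; adjust them to your own setup.
new_ref = "/seq/references/Homo_sapiens_assembly19/v0/Homo_sapiens_assembly19"
old_ref = "/humgen/1kg/reference/human_b36_both"

for path in missing_reference_files(new_ref, old_ref):
    print("missing:", path)
```

Running such a check first gives a clearer error than a failure partway through the liftover.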
Chain files
Chain files from b36/hg18 to hg19 are located here within the Broad:
/humgen/gsa-hpprojects/GATK/data/Liftover_Chain_Files/
External users can get them off our ftp site:
location: ftp.broadinstitute.org
username: gsapubftp-anonymous
path: Liftover_Chain_Files
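For external users who prefer to script the download, the following is a minimal sketch using Python's standard ftplib module. It assumes the FTP site is still reachable with the details listed above and that the account accepts a blank password; neither is stated on this page. The function names are hypothetical.

```python
from ftplib import FTP

HOST = "ftp.broadinstitute.org"
USER = "gsapubftp-anonymous"
REMOTE_DIR = "Liftover_Chain_Files"


def list_chain_files(ftp, remote_dir=REMOTE_DIR):
    """List file names in the chain-file directory of a logged-in FTP session."""
    ftp.cwd(remote_dir)
    return sorted(ftp.nlst())


def download_file(ftp, name, local_path):
    """Download one file from the session's current remote directory."""
    with open(local_path, "wb") as handle:
        ftp.retrbinary("RETR " + name, handle.write)


def fetch_chain_file(name, local_path, password=""):
    """Connect, change to the chain-file directory, and download one file.

    Assumption: the account accepts a blank password.
    """
    ftp = FTP(HOST)
    try:
        ftp.login(USER, password)
        ftp.cwd(REMOTE_DIR)
        download_file(ftp, name, local_path)
    finally:
        ftp.close()
```

For example, fetch_chain_file("b36ToHg19.broad.over.chain", "b36ToHg19.broad.over.chain") would try to fetch the chain file used in the example above, if a file with that name is present in the directory.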