Most genomes to date have been sequenced without accounting for the diploid nature of the genome. However, the distribution of variants on each individual chromosome can (1) significantly affect gene regulation and protein function, (2) have important implications for analyses of population history and medical genetics, and (3) be of great value for the accurate interpretation of medically relevant genetic variation. Here, we describe a comprehensive and detailed protocol for an ultrafast (<3 h library preparation), cost-effective, and scalable haplotyping method named Contiguity Preserving Transposition sequencing, or CPT-seq (Amini et al., Nat Genet 46(12):1343-1349, 2014). CPT-seq accurately phases >95% of the whole human genome in megabase-scale phasing blocks. Additionally, the same workflow can be used to aid de novo assembly (Adey et al., Genome Res 24(12):2041-2049, 2014), detect structural variants, and perform single-cell ATAC-seq analysis (Cusanovich et al., Science 348(6237):910-914, 2015).
Keywords: Assembly; CPT-seq; Combinatorial indexing; Contiguity preserving transposition; Haplotyping; Human genome; Phasing; Single cell ATAC-seq.