Transposable element polymorphisms recapitulate human evolution

Mob DNA. 2015 Nov 16;6:21. doi: 10.1186/s13100-015-0052-6. eCollection 2015.

Abstract

Background: The human genome contains several active families of transposable elements (TEs): Alu, L1 and SVA. Germline transposition of these elements can lead to polymorphic TE (polyTE) loci that differ between individuals with respect to the presence/absence of TE insertions. Limited sets of such polyTE loci have proven to be useful as markers of ancestry in human population genetic studies, but until now it has not been possible to analyze the full genomic complement of TE polymorphisms in this way.

Results: Here we present the first human population genetic analysis based on a genome-wide polyTE data set, consisting of 16,192 loci genotyped in 2,504 individuals across 26 human populations. PolyTEs are found at very low frequencies: > 93 % of loci show < 5 % allele frequency, consistent with the deleteriousness of TE insertions. Nevertheless, polyTEs do show substantial geographic differentiation, with numerous group-specific polymorphic insertions. African populations have the highest numbers of polyTEs and show the highest levels of polyTE genetic diversity; Alu is the most numerous and the most diverse polyTE family. PolyTE genotypes were used to compute allele sharing distances between individuals and to relate them within and between human populations. Populations and continental groups show high coherence based on individuals' polyTE genotypes, and human evolutionary relationships revealed by these genotypes are consistent with those seen for SNP-based genetic distances. The patterns of genetic diversity encoded by TE polymorphisms recapitulate broad patterns of human evolution and migration over the last 60,000-100,000 years. The utility of polyTEs as ancestry informative markers is further underscored by their ability to accurately predict both ancestry and admixture at the continental level. A genome-wide list of polyTE loci, along with their population group-specific allele frequencies and FST values, is provided as a resource for investigators who wish to develop panels of TE-based ancestry markers.
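
Illustration: the abstract does not give the formula used for allele sharing distances, so the following is only a minimal sketch of one common formulation. It assumes each genotype is coded as the number of TE insertion alleles at a locus (0, 1 or 2), takes the per-locus distance as half the absolute difference in insertion counts, and averages over loci genotyped in both individuals. The function name and the toy genotypes are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def allele_sharing_distance(
    g1: Sequence[Optional[int]], g2: Sequence[Optional[int]]
) -> float:
    """Mean per-locus allele sharing distance between two individuals.

    Assumption (not from the source): genotypes are insertion allele
    counts (0, 1 or 2), with None marking a missing genotype. The
    per-locus distance is |a - b| / 2, so it ranges from 0 (identical
    genotypes) to 1 (opposite homozygotes).
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must cover the same loci")
    total = 0.0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip loci not genotyped in both individuals
        total += abs(a - b) / 2
        compared += 1
    if compared == 0:
        raise ValueError("no loci genotyped in both individuals")
    return total / compared


# Hypothetical toy genotypes at four polyTE loci.
ind_a = [0, 1, 2, 0]
ind_b = [0, 1, 0, 0]
ind_c = [2, 1, 0, 2]

print(allele_sharing_distance(ind_a, ind_b))  # 0.25
print(allele_sharing_distance(ind_a, ind_c))  # 0.75
```

Pairwise values of this kind can be collected into a distance matrix and passed to clustering or tree-building methods. Which downstream method the authors used is not stated in this abstract.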

Conclusions: The genetic diversity represented by TE polymorphisms reflects known patterns of human evolution, and ensembles of polyTE loci are suitable for both ancestry and admixture analyses. The patterns of polyTE allelic diversity suggest a possible connection between TE-based genetic divergence and population-specific phenotypic differences.

Keywords: Admixture; Alu; Ancestry informative markers; Human ancestry; L1; Phylogenetics; Polymorphism; Population genetics; SVA; Transposable elements.