Comparing population structure as inferred from genealogical versus genetic information

Eur J Hum Genet. 2009 Dec;17(12):1635-41. doi: 10.1038/ejhg.2009.97. Epub 2009 Jun 24.

Abstract

Algorithms for inferring population structure from genetic data (i.e., population assignment methods) have been shown to recognize genetic clusters in human populations effectively. However, their performance in identifying groups of genealogically related individuals, especially in weakly differentiated populations, has not been tested empirically thus far. For this study, we had access to both genealogical and genetic data from two closely related, isolated villages in southern Italy. We found that nearly all living individuals were included in a single pedigree with multiple inbreeding loops. Although F(st) between the villages was low (0.008), genetic clustering analysis identified two clusters roughly corresponding to the two villages. Average kinship between individuals (estimated from genealogies) increased with increasing values of group membership (estimated from the genetic data), showing that the observed genetic clusters represent individuals who are more closely related to each other than to random members of the population. Further, average kinship within clusters and F(st) between clusters increase as the membership threshold becomes more stringent. We conclude that a limited number of genetic markers is sufficient to detect structuring, and that the results of genetic analyses faithfully mirror the structuring inferred from detailed analyses of population genealogies, even when F(st) values are low, as in the case of the two villages. We then estimate the impact of the observed levels of population structure on association studies using simulated data.
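The comparison of genealogical kinship against genetic cluster membership can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function name `mean_kinship_above_threshold`, the individual labels, the membership coefficients, and the kinship values are all hypothetical toy inputs chosen only to show the calculation, namely the mean pairwise kinship among individuals whose membership in a cluster meets a given threshold.

```python
from itertools import combinations


def mean_kinship_above_threshold(membership, kinship, threshold):
    """Mean pairwise kinship among individuals with membership >= threshold.

    membership: dict mapping individual -> membership coefficient for one
                genetic cluster (e.g. from a population assignment method).
    kinship:    dict mapping frozenset({i, j}) -> pedigree kinship coefficient.
    Returns None when fewer than two individuals pass the threshold.
    """
    selected = sorted(ind for ind, q in membership.items() if q >= threshold)
    pairs = list(combinations(selected, 2))
    if not pairs:
        return None
    return sum(kinship[frozenset(pair)] for pair in pairs) / len(pairs)


# Hypothetical toy data (assumption, not taken from the study).
membership = {"a": 0.95, "b": 0.90, "c": 0.70, "d": 0.55}
kinship = {
    frozenset(("a", "b")): 0.125,
    frozenset(("a", "c")): 0.0625,
    frozenset(("b", "c")): 0.0625,
    frozenset(("a", "d")): 0.0,
    frozenset(("b", "d")): 0.0,
    frozenset(("c", "d")): 0.03125,
}

if __name__ == "__main__":
    # Report mean within-cluster kinship at increasingly stringent thresholds.
    for threshold in (0.5, 0.6, 0.9):
        value = mean_kinship_above_threshold(membership, kinship, threshold)
        print(threshold, value)
```

In this toy input the mean kinship rises as the threshold is tightened, which is the pattern the abstract reports; real data would need kinship coefficients computed from the full pedigree and membership coefficients from an actual clustering run.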

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Databases, Genetic*
  • Family
  • Female
  • Genetic Markers
  • Genetic Predisposition to Disease
  • Genetics, Population*
  • Humans
  • Italy
  • Male
  • Pedigree
  • Phylogeny*
  • Population Dynamics*
  • Reproducibility of Results

Substances

  • Genetic Markers