Statistical object data analysis of taxonomic trees from human microbiome data

PLoS One. 2012;7(11):e48996. doi: 10.1371/journal.pone.0048996. Epub 2012 Nov 9.

Abstract

Human microbiome research characterizes the microbial content of samples from human habitats to learn how interactions between bacteria and their host might impact human health. In this work, a novel parametric statistical inference method based on object-oriented data analysis (OODA) is proposed for analyzing Human Microbiome Project (HMP) data. OODA is an emerging area of statistical inference in which the goal is to apply statistical methods to objects such as functions, images, graphs, or trees. The data objects that pertain to this work are taxonomic trees of bacteria built from analysis of 16S rRNA gene sequences (e.g., using RDP, the Ribosomal Database Project); there is one such object for each biological sample analyzed. Our goal is to model and formally compare a set of trees. The contribution of our work is threefold: first, a weighted tree structure for analyzing RDP data is introduced; second, using a probability measure to model a set of taxonomic trees, we introduce an approximate maximum likelihood estimation (MLE) procedure for estimating model parameters and derive likelihood ratio test (LRT) statistics for comparing the distributions of two metagenomic populations; and third, the Jumpstart HMP data is analyzed using the proposed model, providing novel insights and future directions of analysis.
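
To make the idea of one weighted taxonomic tree per sample concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sample is summarized as read counts per root-to-leaf lineage, and that a node's weight is the total number of reads assigned at or below it; the abstract does not specify the weighting, so this choice, the function name `build_weighted_tree`, and the counts are all hypothetical.

```python
from collections import defaultdict


def build_weighted_tree(lineage_counts):
    """Aggregate read counts up a taxonomy (illustrative assumption only).

    lineage_counts maps a lineage tuple (taxon names from root to leaf)
    to the number of reads assigned to that lineage. The result maps each
    lineage prefix, i.e. each tree node, to the total reads at or below it.
    """
    weights = defaultdict(int)
    for lineage, count in lineage_counts.items():
        # Every prefix of the lineage is an ancestor node (or the leaf itself).
        for depth in range(1, len(lineage) + 1):
            weights[lineage[:depth]] += count
    return dict(weights)


# Hypothetical sample with made-up counts, for illustration only.
sample = {
    ("Bacteria", "Firmicutes", "Bacilli"): 30,
    ("Bacteria", "Firmicutes", "Clostridia"): 50,
    ("Bacteria", "Bacteroidetes", "Bacteroidia"): 20,
}
tree = build_weighted_tree(sample)
```

Under this assumed representation, every sample yields one dictionary keyed by tree node, so a set of samples becomes a set of comparable weighted trees over a shared taxonomy.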

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bacteria / classification*
  • Bacteria / genetics*
  • Host-Pathogen Interactions
  • Humans
  • Metagenome*
  • Metagenomics*
  • Models, Statistical*
  • RNA, Ribosomal, 16S

Substances

  • RNA, Ribosomal, 16S