BOTUX: bayesian-like operational taxonomic unit examiner

Int J Comput Biol Drug Des. 2014;7(2-3):130-45. doi: 10.1504/IJCBDD.2014.061652. Epub 2014 May 28.

Abstract

Bayesian-like operational taxonomic unit examiner (BOTUX) is a new tool for the classification of 16S rRNA gene sequences into operational taxonomic units (OTUs) that addresses the problem of overestimation caused by errors introduced during the PCR amplification and DNA sequencing steps. BOTUX utilises a grammar-based assignment strategy, where Bayesian models are built from each word of a given length (e.g., 8-mers). De novo analysis is possible with BOTUX, as it does not require a training set and updates its probabilistic models as new sequences are recruited to an OTU. In benchmarking tests performed with real and simulated datasets of 16S rDNA sequences, BOTUX accurately identifies OTUs with comparable or better clustering efficiency and lower execution times than the other OTU algorithms tested. BOTUX is the only OTU classifier that allows incremental analysis of large datasets, and it is also adept at clustering both 454 and Illumina datasets in a reasonable timeframe.
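The word-based, incrementally updated assignment described above can be illustrated with a minimal sketch. This is not the BOTUX implementation: the abstract does not give the model, so the smoothed word-probability score, the 0.5 threshold, and all names (`words`, `Otu`, `IncrementalClusterer`) are assumptions chosen only to show the general idea of scoring a sequence against per-OTU word models and updating the model when a sequence is recruited.

```python
from collections import Counter


def words(seq, k):
    """Return the set of overlapping k-mers ("words") in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


class Otu:
    """Hypothetical per-OTU word model: counts of member sequences containing each word."""

    def __init__(self):
        self.word_counts = Counter()
        self.size = 0  # number of sequences recruited so far

    def add(self, seq_words):
        # Update the model with the words of a newly recruited sequence.
        self.word_counts.update(seq_words)
        self.size += 1

    def score(self, seq_words):
        # Assumed score: mean smoothed probability that a word of the query
        # occurs in a member of this OTU, (count + 0.5) / (size + 1).
        probs = [(self.word_counts[w] + 0.5) / (self.size + 1) for w in seq_words]
        return sum(probs) / len(probs)


class IncrementalClusterer:
    """Greedy de novo clustering: no training set, models grow as sequences arrive."""

    def __init__(self, k=8, threshold=0.5):
        self.k = k
        self.threshold = threshold  # assumed cut-off, not taken from the paper
        self.otus = []

    def add(self, seq):
        """Assign one sequence to the best-scoring OTU, or start a new OTU."""
        seq_words = words(seq, self.k)
        if not seq_words:
            raise ValueError("sequence is shorter than the word length k")
        best_index, best_score = None, 0.0
        for index, otu in enumerate(self.otus):
            current = otu.score(seq_words)
            if current > best_score:
                best_index, best_score = index, current
        if best_index is None or best_score < self.threshold:
            self.otus.append(Otu())
            best_index = len(self.otus) - 1
        self.otus[best_index].add(seq_words)
        return best_index
```

Because each call to `add` only scores the new sequence against the existing models and then updates one of them, further sequences can be fed to the same `IncrementalClusterer` instance later without re-clustering the earlier ones, which is the sense of "incremental" assumed in this sketch.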

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Bayes Theorem
  • RNA, Ribosomal, 16S / genetics*
  • Sequence Analysis, DNA*

Substances

  • RNA, Ribosomal, 16S