Microbial Community Analysis with Ribosomal Gene Fragments from Shotgun Metagenomes

Appl Environ Microbiol. 2015 Oct 16;82(1):157-66. doi: 10.1128/AEM.02772-15. Print 2016 Jan 1.

Abstract

Shotgun metagenomic sequencing does not depend on gene-targeted primers or PCR amplification; thus, it is not affected by primer bias or chimeras. However, searching for rRNA genes in large shotgun Illumina data sets is computationally expensive, and no approach exists for unsupervised community analysis of small-subunit (SSU) rRNA gene fragments retrieved from shotgun data. We present a pipeline, SSUsearch, that achieves faster identification of SSU rRNA gene fragments and enables unsupervised community analysis with shotgun data. It also includes classification and copy number correction, and the output can be used by traditional amplicon analysis platforms. Shotgun metagenome data processed with this pipeline yielded higher diversity estimates than amplicon data but retained the grouping of samples in ordination analyses. We applied this pipeline to soil samples with paired shotgun and amplicon data and confirmed bias against Verrucomicrobia in a commonly used V6-V8 primer set; we also discovered likely bias against Actinobacteria and for Verrucomicrobia in a commonly used V4 primer set. This pipeline can utilize all variable regions in SSU rRNA and can also be applied to large-subunit (LSU) rRNA genes for confirmation of community structure. The pipeline can scale to handle large amounts of soil metagenomic data (5 Gb memory and 5 central processing unit hours to process 38 Gb [1 lane] of trimmed Illumina HiSeq2500 data) and is freely available at https://github.com/dib-lab/SSUsearch under a BSD license.
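
The copy number correction mentioned in the abstract can be illustrated with a minimal sketch. This is not SSUsearch's own code: the function names, the taxon labels, and the copy-number values below are hypothetical, and the sketch assumes the simplest form of the idea, in which each taxon's read count is divided by an estimated number of rRNA gene copies per genome before relative abundances are computed.

```python
def copy_number_correct(read_counts, copy_numbers, default_copies=1.0):
    """Divide each taxon's rRNA read count by its estimated gene copy number.

    read_counts:    dict mapping taxon -> number of rRNA gene fragments assigned
    copy_numbers:   dict mapping taxon -> estimated rRNA gene copies per genome
    default_copies: value used for taxa without an estimate (an assumption)
    """
    corrected = {}
    for taxon, count in read_counts.items():
        copies = copy_numbers.get(taxon, default_copies)
        if copies <= 0:
            raise ValueError(f"copy number for {taxon!r} must be positive")
        corrected[taxon] = count / copies
    return corrected


def relative_abundance(abundances):
    """Normalize abundances so that they sum to 1 (all zeros if the total is 0)."""
    total = sum(abundances.values())
    if total == 0:
        return {taxon: 0.0 for taxon in abundances}
    return {taxon: value / total for taxon, value in abundances.items()}


# Hypothetical example: taxon_A has three times as many reads as taxon_B,
# but also three times as many rRNA gene copies per genome.
counts = {"taxon_A": 60, "taxon_B": 20}
copies = {"taxon_A": 6, "taxon_B": 2}

corrected = copy_number_correct(counts, copies)
profile = relative_abundance(corrected)
print(profile)
```

In this made-up case the uncorrected profile would suggest a 3:1 ratio, whereas the corrected abundances are equal, which is the kind of distortion the correction is meant to remove.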

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteria / classification
  • Bacteria / genetics*
  • Bacteria / isolation & purification
  • DNA Primers / genetics
  • Metagenome
  • Metagenomics
  • RNA, Ribosomal / genetics*
  • Ribosomes / genetics*
  • Soil Microbiology

Substances

  • DNA Primers
  • RNA, Ribosomal

Associated data

  • SRA/SRX902929

Grants and funding

This work was funded in part by the U.S. Department of Energy (DOE) Great Lakes Bioenergy Research Center (DOE Office of Science BER DE-FC02-07ER64494) and by DOE Office of Science grants BER DE-FG02-99ER62848 and DE-SC0004601.