Challenges of sequencing human genomes

Daniel C Koboldt; Li Ding; Elaine R Mardis; Richard K Wilson

doi:10.1093/bib/bbq016

Brief Bioinform. 2010 Sep;11(5):484-98. doi: 10.1093/bib/bbq016. Epub 2010 Jun 2.

Authors

Daniel C Koboldt¹, Li Ding, Elaine R Mardis, Richard K Wilson

Affiliation

¹ The Genome Center at Washington University, St. Louis, Missouri 63108, USA. dkoboldt@genome.wustl.edu

Abstract

Massively parallel sequencing technologies continue to alter the study of human genetics. As the cost of sequencing declines, next-generation sequencing (NGS) instruments and datasets will become increasingly accessible to the wider research community. Investigators are understandably eager to harness the power of these new technologies. Sequencing human genomes on these platforms, however, presents numerous production and bioinformatics challenges. Production issues such as sample contamination, library chimaeras and variable run quality have become increasingly problematic in the transition from technology development lab to production floor. Analysis of NGS data, too, remains challenging, particularly given the short read lengths (35-250 bp) and the sheer volume of data. The development of streamlined, highly automated pipelines for data analysis is critical for the transition from technology adoption to accelerated research and publication. This review aims to describe the state of current NGS technologies, as well as the strategies that enable NGS users to characterize the full spectrum of DNA sequence variation in humans.

Publication types

Research Support, N.I.H., Extramural
Review

MeSH terms

Base Sequence
Genetic Variation
Genome, Human*
Humans
Neoplasms / genetics
Sequence Analysis, DNA / instrumentation
Sequence Analysis, DNA / methods*
Software

Grants and funding

HG003079/HG/NHGRI NIH HHS/United States