Computational pan-genome mapping and pairwise SNP-distance improve detection of Mycobacterium tuberculosis transmission clusters

PLoS Comput Biol. 2019 Dec 9;15(12):e1007527. doi: 10.1371/journal.pcbi.1007527. eCollection 2019 Dec.

Abstract

Base-by-base distance measures derived from next-generation sequencing have become an integral complement to epidemiological investigation of infectious disease outbreaks. This study introduces PANPASCO, a computational method based on pan-genome mapping and pairwise distances that is highly sensitive to differences between cases, even when those differences are located in regions of lineage-specific reference genomes. We show that our approach is superior to previously published methods in several datasets and across different Mycobacterium tuberculosis lineages, as its characteristics allow the comparison of a large number of diverse samples in one analysis, a scenario that becomes increasingly likely with the growing use of whole-genome sequencing in transmission surveillance.
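
As a rough illustration of what a pairwise SNP distance can look like in practice, the following minimal sketch counts differing bases between two samples and then groups samples whose distance falls at or below a cutoff. It is not PANPASCO's implementation. The function names, the "N" marker for positions without a confident call, the rule of skipping a position only for the pair in which it is uncalled, the single-linkage grouping, and the threshold values are all assumptions made for this example.

```python
from itertools import combinations

MISSING = "N"  # assumed marker for a position without a confident base call


def pairwise_snp_distance(a, b):
    """Count positions where both samples have a call and the calls differ."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same set of positions")
    return sum(
        1
        for x, y in zip(a, b)
        if x != MISSING and y != MISSING and x != y
    )


def distance_matrix(samples):
    """Return {(name1, name2): distance} for every unordered pair of samples."""
    names = sorted(samples)
    return {
        (s, t): pairwise_snp_distance(samples[s], samples[t])
        for s, t in combinations(names, 2)
    }


def link_clusters(samples, threshold):
    """Single-linkage grouping: join samples whose distance is <= threshold."""
    parent = {name: name for name in samples}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (s, t), d in distance_matrix(samples).items():
        if d <= threshold:
            parent[find(s)] = find(t)

    groups = {}
    for name in samples:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: each string holds the base calls at the same
# five variable positions for one case.
toy_samples = {
    "case1": "ACGTN",
    "case2": "ACGTA",
    "case3": "TCGAA",
}
toy_clusters = link_clusters(toy_samples, threshold=1)
```

Because an uncalled position is excluded only for the pair it affects, adding a poorly covered sample does not shrink the set of positions compared between the remaining samples, which is one plausible reason a pairwise scheme scales to many diverse samples in one analysis.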

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping
  • Computational Biology
  • Computer Simulation
  • DNA, Bacterial / genetics
  • Databases, Genetic / statistics & numerical data
  • Disease Outbreaks / statistics & numerical data
  • Genome, Bacterial
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Molecular Epidemiology / statistics & numerical data
  • Mycobacterium tuberculosis / classification
  • Mycobacterium tuberculosis / genetics*
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA
  • Tuberculosis / epidemiology
  • Tuberculosis / microbiology
  • Tuberculosis / transmission*
  • Whole Genome Sequencing

Substances

  • DNA, Bacterial

Grants and funding

This work was supported by the competitive intramural funding of the Robert Koch Institute (Sonderforschungsmittel 2015 to BYR and WH). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.