Abstract
Base-by-base distance measures derived from next-generation sequencing have become an integral complement to the epidemiological investigation of infectious disease outbreaks. This study introduces PANPASCO, a computational pairwise distance method based on pan-genome mapping that is highly sensitive to differences between cases, even when these are located in regions of lineage-specific reference genomes. We show that our approach is superior to previously published methods in several datasets and across different Mycobacterium tuberculosis lineages, as its characteristics allow the comparison of a large number of diverse samples in one analysis, a scenario that becomes increasingly likely with the growing use of whole-genome sequencing in transmission surveillance.
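To make the idea of a pairwise, base-by-base distance concrete, the following minimal Python sketch counts differing positions between two aligned samples, skipping any position that is uncalled in either sample of the pair. It illustrates the general concept only and is not the PANPASCO implementation; the `N` placeholder, the function names, and the toy sequences are all assumptions made for this example.

```python
from itertools import combinations

MISSING = "N"  # assumed placeholder for an uncalled or low-quality base


def pairwise_distance(a: str, b: str) -> int:
    """Count positions where both samples have a called base and the bases differ."""
    if len(a) != len(b):
        raise ValueError("sequences must share the same coordinate system")
    return sum(
        1
        for x, y in zip(a, b)
        if x != MISSING and y != MISSING and x != y
    )


def distance_matrix(samples: dict[str, str]) -> dict[tuple[str, str], int]:
    """Compute the distance for every unordered pair of sample names."""
    return {
        (s, t): pairwise_distance(samples[s], samples[t])
        for s, t in combinations(sorted(samples), 2)
    }


# Hypothetical toy data: three cases called against the same coordinates.
samples = {
    "case_A": "ACGTACGT",
    "case_B": "ACGTACGA",
    "case_C": "ANGTTCGA",
}
distances = distance_matrix(samples)
```

Because missing positions are excluded per pair rather than across the whole dataset, an uncalled base in `case_C` does not remove that position from the comparison of `case_A` with `case_B`. This is one reason a pairwise formulation can stay sensitive as more, and more diverse, samples are added.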
Publication types
- Research Support, Non-U.S. Gov't
MeSH terms
- Chromosome Mapping
- Computational Biology
- Computer Simulation
- DNA, Bacterial / genetics
- Databases, Genetic / statistics & numerical data
- Disease Outbreaks / statistics & numerical data
- Genome, Bacterial
- High-Throughput Nucleotide Sequencing
- Humans
- Molecular Epidemiology / statistics & numerical data
- Mycobacterium tuberculosis / classification
- Mycobacterium tuberculosis / genetics*
- Polymorphism, Single Nucleotide
- Sequence Analysis, DNA
- Tuberculosis / epidemiology
- Tuberculosis / microbiology
- Tuberculosis / transmission*
- Whole Genome Sequencing
Grants and funding
This work was supported by the competitive intramural funding of the Robert Koch Institute (Sonderforschungsmittel 2015 to BYR and WH). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.