Genome sequencing has identified an extensive repertoire of single-nucleotide polymorphisms among clinical isolates of Mycobacterium tuberculosis, but the extent to which these differences influence phenotypic properties of the bacteria remains to be elucidated. To determine whether these polymorphisms give rise to phenotypic diversity, we integrated genome data sets with RNA sequencing to assess their impact on the comparative transcriptome profiles of strains belonging to M. tuberculosis Lineages 1 and 2. We observed clear correlations between genotype and transcriptional phenotype. These arose by three mechanisms. First, lineage-specific changes in the amino acid sequence of transcriptional regulators were associated with alterations in their ability to control gene expression. Second, changes in nucleotide sequence were associated with altered promoter activity and with the generation of novel transcriptional start sites in intergenic regions and within coding sequences. We show that in some cases this second mechanism is expected to generate functionally active truncated proteins involved in innate immune recognition. Finally, genes showing lineage-specific patterns of differential expression not linked directly to primary mutations were characterized by a striking overrepresentation of toxin-antitoxin pairs. Taken together, these findings advance our understanding of mycobacterial evolution, contribute to a systems-level understanding of this important human pathogen, and more broadly demonstrate the application of state-of-the-art techniques to provide novel insight into the mechanisms by which intergenic and silent mutations contribute to diversity.
Keywords: RNA sequencing; genome evolution; lineage-specific mutations; transcriptional start sites; tuberculosis.