Purpose: The search for small molecules with activity against Mycobacterium tuberculosis (Mtb) increasingly relies on high-throughput screening and computational methods. Several public datasets from the Collaborative Drug Discovery Tuberculosis (CDD TB) database have been evaluated with cheminformatics approaches to validate their utility and to suggest compounds for testing.
Methods: Previously reported Bayesian classification models were used to predict the activity of a set of 283 Novartis compounds tested against Mtb (containing aerobic and anaerobic hits) and to search a database of FDA-approved drugs. The Novartis compounds were also filtered with computational SMARTS alerts to identify potentially undesirable substructures.
Results: Using the Novartis compounds as a test set for the Bayesian models demonstrated a >4.0-fold enrichment over random screening for finding aerobic hits not in the computational models (N = 34). A 10-fold enrichment was observed for finding Mtb active compounds in the FDA drugs database. Of the Novartis compounds, 85.9% failed the Abbott SMARTS alerts, a value substantially higher than that for known TB drugs. Higher failure rates for SMARTS filters from different groups also correlate with the number of Lipinski violations.
Conclusions: These computational approaches may assist in finding desirable leads for tuberculosis drug discovery.
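
The fold enrichment values quoted in the Results compare the hit rate among the top-scoring compounds with the hit rate expected from random screening. The sketch below shows one common way to compute such a figure. It is an illustration only: the function name `fold_enrichment`, the `top_fraction` cutoff, and the toy scores and labels are assumptions made for this example, not the authors' procedure or data.

```python
def fold_enrichment(scores, labels, top_fraction):
    """Hit rate in the top-ranked fraction divided by the overall hit rate.

    scores       -- model scores, higher meaning "more likely active"
    labels       -- 1 for an experimentally active compound, 0 otherwise
    top_fraction -- fraction of the ranked list that would be screened
    (All three names are hypothetical, chosen for this sketch.)
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n_total = len(labels)
    n_active = sum(labels)
    if n_total == 0 or n_active == 0:
        raise ValueError("need at least one compound and one active")

    # Number of compounds in the top-ranked subset (at least one).
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds by descending score and count actives in the subset.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    hit_rate_top = hits_in_top / n_top
    hit_rate_random = n_active / n_total
    return hit_rate_top / hit_rate_random


# Toy data (hypothetical): 10 compounds, 2 actives, both ranked first.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_enrichment = fold_enrichment(toy_scores, toy_labels, top_fraction=0.2)
```

With this definition, a value of 1 means the ranking does no better than random selection, and larger values mean that screening the top of the list recovers proportionally more actives.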