Mutagenic probability estimation of chemical compounds by a novel molecular electrophilicity vector and support vector machine

Bioinformatics. 2006 Sep 1;22(17):2099-106. doi: 10.1093/bioinformatics/btl352. Epub 2006 Jul 12.

Abstract

Motivation: Mutagenicity is among the toxicological end points of highest concern. The accelerated pace of drug discovery has heightened the need for efficient prediction methods. Most currently available tools fall short of the desired accuracy and provide only a binary classification. A more discriminative and informative model for mutagenicity prediction is therefore needed.

Results: To address this problem, we developed a model that predicts mutagenic probability, based on datasets covering a large chemical space. A novel molecular electrophilicity vector (MEV) is first devised to represent the structure profile of chemical compounds. An extended support vector machine (SVM) method is then used to derive a posterior probability estimate of mutagenicity from the MEVs of the training set. The results show that our model outperforms TOPKAT (http://www.accelrys.com) and other previously published methods. In addition, a confidence level can be reported with each prediction, which may support more flexible decisions on chemical ordering or synthesis.
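
The abstract does not say how the SVM is extended to yield posterior probabilities. One common approach, assumed here purely for illustration, is Platt-style sigmoid calibration: a sigmoid P(y=1 | f) = 1 / (1 + exp(A*f + B)) is fitted to the SVM decision values f. The sketch below is not the authors' implementation; the decision values, labels, function names, and the confidence definition are all hypothetical, and no MEV descriptors are computed.

```python
import math

def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit P(y=1 | f) = 1 / (1 + exp(A*f + B)) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # Gradient of the negative log-likelihood w.r.t. A and B.
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def mutagenic_probability(score, a, b):
    """Map an SVM decision value to an estimated probability of mutagenicity."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def confidence(p):
    """Assumed confidence measure: probability of the predicted class."""
    return max(p, 1.0 - p)

# Hypothetical SVM decision values for ten training compounds
# (1 = mutagen, 0 = non-mutagen); invented for this sketch only.
scores = [-2.0, -1.5, -1.0, -0.5, 0.3, -0.2, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
for s in (-1.8, 0.1, 1.8):
    p = mutagenic_probability(s, a, b)
    print(f"score={s:+.1f}  P(mutagen)={p:.2f}  confidence={confidence(p):.2f}")
```

A compound whose decision value lies near the boundary receives a probability close to 0.5 and therefore a low confidence, which is the kind of graded output a binary classifier cannot provide.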

Availability: A binary program for Windows PC (ZGTOX_1.1) based on our model, together with sample input datasets, is available at http://dddc.ac.cn/adme upon request from the authors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Hydrophobic and Hydrophilic Interactions
  • Models, Biological*
  • Models, Chemical*
  • Models, Statistical
  • Mutagenicity Tests / methods*
  • Mutagens / analysis*
  • Mutagens / chemistry*
  • Pattern Recognition, Automated* / methods
  • Quantitative Structure-Activity Relationship*
  • Software

Substances

  • Mutagens