Advancements in biotransformation pathway prediction: enhancements, datasets, and novel functionalities in enviPath

J Cheminform. 2024 Aug 6;16(1):93. doi: 10.1186/s13321-024-00881-6.

Abstract

enviPath is a widely used database and prediction system for microbial biotransformation pathways of primarily xenobiotic compounds. Both the data and the prediction system are freely available via a web interface and a public REST API. Since its initial release in 2016, we have extended the data available in enviPath and improved both the performance of the prediction system and the usability of the overall system. We now provide three diverse data sets covering microbial biotransformation in different environments and under different experimental conditions. These data sets also enabled the development of a pathway prediction model that is applicable to a more diverse set of chemicals. In the prediction engine, we implemented a new evaluation tailored to pathway prediction, which gives a more honest and holistic view of performance. We also implemented a novel applicability domain algorithm, which allows users to estimate how well the model will perform on their data. Finally, we improved the implementation to speed up the overall system and to provide new functionality via a plugin system. SCIENTIFIC CONTRIBUTION: The main scientific contributions are the development of a pathway prediction model applicable to diverse chemicals, a specialized evaluation method for holistic performance assessment, and a novel applicability domain algorithm for user-specific performance estimation. The introduction of two new data sets and the creation of links to EC classes make enviPath a unique resource in microbial biotransformation research.

Keywords: Biodegradation database; Biodegradation pathway prediction; Machine learning; Metabolic pathways.