Motivation: Phosphorylation, a prevalent post-translational modification, plays a crucial role in regulating cellular activities. It encompasses O-phosphorylation (e.g., phosphoserine) and N-phosphorylation (e.g., phospho-lysine (pK), phospho-arginine (pR), and phospho-histidine (pH)). Extensive research on O-phosphorylation has produced various algorithms that predict O-phosphorylation sites with commendable performance, but models for predicting N-phosphorylation sites have been notably absent. This study introduces DeepNphos, an integrated model for predicting N-phosphorylation sites, developed from the analysis of thousands of experimentally identified pK, pR, and pH sites.
Results: We observe that a Convolutional Neural Network (CNN) model using One-Hot encoding features performs favorably compared with other models in predicting pK, pR, and pH sites. Additionally, pK exhibits similarities to other lysine modification types, and integrating the CNN model with a deep-transfer learning (DTL) strategy based on tens of thousands of known lysine modification sites could enhance pK prediction performance. In contrast, pR exhibits little similarity to other arginine modification types, and integrating DTL has minimal impact on pR prediction performance. Furthermore, we did not apply the DTL strategy to pH site prediction, given the scarcity of known histidine modification sites other than pH. The final classifiers for predicting pK, pR, and pH sites achieve AUC values of 0.856, 0.805, and 0.802, respectively, in ten-fold cross-validation. Overall, DeepNphos is the first classifier for predicting N-phosphorylation sites and is accessible at https://github.com/ChangXulinmessi/DeepNPhos.
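As an illustration of the One-Hot encoding feature mentioned above, the following minimal sketch encodes a sequence window centered on a candidate residue as a matrix of indicator vectors, the kind of input a CNN could consume. It is not taken from the DeepNphos repository: the function names, the flank size of 3, the padding symbol, and the example sequence are all assumptions chosen for illustration.

```python
# Hypothetical illustration; names and parameters are not from DeepNphos.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # assumed padding symbol for positions beyond a terminus


def extract_window(sequence, center, flank=3):
    """Return residues within `flank` positions of `center`, padded with PAD."""
    residues = []
    for i in range(center - flank, center + flank + 1):
        residues.append(sequence[i] if 0 <= i < len(sequence) else PAD)
    return "".join(residues)


def one_hot(window):
    """Encode each residue as a 20-dimensional indicator vector.

    PAD and any non-standard residue map to an all-zero vector.
    """
    index = {aa: j for j, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for residue in window:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        matrix.append(row)
    return matrix


# Made-up sequence; the candidate lysine sits at index 1.
window = extract_window("MKRHSTAAK", center=1, flank=3)
encoded = one_hot(window)  # list of 7 rows, each of length 20
```

Each row of `encoded` corresponds to one window position, so the matrix has shape (2 × flank + 1) × 20. The window size actually used by the authors is not stated in this abstract.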
Keywords: Arginine phosphorylation; Deep learning; Deep-transfer learning; Histidine phosphorylation; Lysine phosphorylation; N-Phosphorylation; Post-translational modification; Residual structure.
Copyright © 2024. Published by Elsevier Ltd.