The study of protein subcellular localization (PSL) is a fundamental step toward understanding the mechanisms of protein function. Recently developed mass spectrometry (MS)-based spatial proteomics, which quantifies the distribution of proteins across subcellular fractions, provides a high-throughput approach for predicting unknown PSLs from known PSLs. However, the accuracy of PSL annotation in spatial proteomics is limited by the performance of existing PSL predictors, which are based on traditional machine learning algorithms. In this study, we present DeepSP, a novel deep learning framework for PSL prediction from MS-based spatial proteomics data sets. DeepSP constructs a new feature map in the form of a difference matrix, which captures detailed changes in protein occupancy profiles between subcellular fractions, and uses the convolutional block attention module to improve PSL prediction performance. Compared with current state-of-the-art machine learning predictors, DeepSP achieved significant improvements in accuracy and robustness, both on independent test sets and in the prediction of unknown PSLs. As an efficient and robust framework for PSL prediction, DeepSP is expected to facilitate spatial proteomics studies and contribute to the elucidation of protein functions and the regulation of biological processes.
Keywords: attention mechanism; deep learning; difference matrix; protein subcellular localization; spatial proteomics.