Advancements in peptidomics have revealed numerous small open reading frames with coding potential and revealed that some of these micropeptides are closely related to human cancer. However, the systematic analysis and integration from sequence to structure and function remains largely undeveloped. Here, as a solution, we built a workflow for the collection and analysis of proteomic data, transcriptomic data, and clinical outcomes for cancer-associated micropeptides using publicly available datasets from large cohorts. We initially identified 19 586 novel micropeptides by reanalyzing proteomic profile data from 3753 samples across 8 cancer types. Further quantitative analysis of these micropeptides, along with associated clinical data, identified 3065 that were dysregulated in cancer, with 370 of them showing a strong association with prognosis. Moreover, we employed a deep learning framework to construct a micropeptide-protein interaction network for further bioinformatics analysis, revealing that micropeptides are involved in multiple biological processes as bioactive molecules. Taken together, our atlas provides a benchmark for high-throughput prediction and functional exploration of micropeptides, providing new insights into their biological mechanisms in cancer. The HMPA is freely available at http://hmpa.zju.edu.cn.
Keywords: functional annotation; mass spectrometry; micropeptide database; nonclassical peptidome; structure prediction.
© The Author(s) 2024. Published by Oxford University Press.