TermineR: Extracting information on endogenous proteolytic processing from shotgun proteomics data

Proteomics. 2024 Oct;24(19):e2300491. doi: 10.1002/pmic.202300491. Epub 2024 Aug 10.

Abstract

State-of-the-art mass spectrometers, combined with modern bioinformatics algorithms for peptide-to-spectrum matching (PSM) with robust statistical scoring, allow more variable features (i.e., post-translational modifications) to be reliably identified from (tandem) mass spectrometry data, often without the need for biochemical enrichment. Semi-specific proteome searches, which enforce the theoretical enzymatic digestion rule at only the N- or C-terminal end of a peptide, allow the identification of native protein termini or those arising from endogenous proteolytic activity (also referred to as "neo-N-termini" analysis or "N-terminomics"). Nevertheless, deriving biological meaning from these search outputs can be challenging in terms of data mining and analysis. Thus, we introduce TermineR, a data analysis approach for (1) the annotation of peptides according to their enzymatic cleavage specificity and known protein processing features, (2) differential abundance and enrichment analysis of N-terminal sequence patterns, and (3) visualization of neo-N-termini locations. We illustrate the use of TermineR by applying it to tandem mass tag (TMT)-based proteomics data from a mouse model of polycystic kidney disease, and assess the use of semi-specific searches for the biological interpretation of cleavage events and the variable contribution of proteolytic products to general protein abundance. The TermineR approach and example data are available as an R package at https://github.com/MiguelCos/TermineR.
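To make the idea of cleavage-specificity annotation concrete, the following is a minimal Python sketch and not the TermineR implementation (TermineR itself is an R package). It assumes a simplified trypsin rule (cleavage after K or R, ignoring the proline exception and initiator-methionine removal). The function name `annotate_specificity` and the returned labels are hypothetical and chosen only for illustration.

```python
def annotate_specificity(protein: str, peptide: str, cleave_after: str = "KR") -> str:
    """Label a peptide by how its termini agree with a simple cleavage rule.

    Assumption: the protease cleaves C-terminally of any residue in
    `cleave_after` (simplified trypsin rule). Labels are illustrative only.
    """
    start = protein.find(peptide)  # first occurrence only, for simplicity
    if start == -1:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(peptide)

    # N-terminus is consistent if the peptide starts the protein
    # or the preceding residue is a cleavage site.
    n_ok = start == 0 or protein[start - 1] in cleave_after
    # C-terminus is consistent if the peptide ends the protein
    # or its last residue is a cleavage site.
    c_ok = end == len(protein) or peptide[-1] in cleave_after

    if n_ok and c_ok:
        return "specific"
    if c_ok:
        return "semi_specific_nterm"  # non-enzymatic N-terminus: candidate neo-N-terminus
    if n_ok:
        return "semi_specific_cterm"  # non-enzymatic C-terminus
    return "nonspecific"


# Toy sequence, invented for this example (not from the study data).
protein = "MAGTKLLSPEQRDFVNK"
for pep in ["LLSPEQR", "SPEQR", "LLSPE", "SPE"]:
    print(pep, annotate_specificity(protein, pep))
```

In this toy case, `SPEQR` has an enzymatic C-terminus but an N-terminus that does not follow the rule, which is the kind of peptide a semi-specific search can report as a candidate product of endogenous proteolysis.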

Keywords: data processing; polycystic kidney disease; proteolysis; terminomics.

MeSH terms

  • Algorithms
  • Animals
  • Databases, Protein
  • Mice
  • Peptides / analysis
  • Peptides / chemistry
  • Peptides / metabolism
  • Polycystic Kidney Diseases / metabolism
  • Protein Processing, Post-Translational
  • Proteolysis*
  • Proteome / analysis
  • Proteome / metabolism
  • Proteomics* / methods
  • Software
  • Tandem Mass Spectrometry* / methods

Substances

  • Proteome
  • Peptides