Large-scale Identification of N-linked Intact Glycopeptides in Human Serum using HILIC Enrichment and Spectral Library Search

Mol Cell Proteomics. 2020 Apr;19(4):672-689. doi: 10.1074/mcp.RA119.001791. Epub 2020 Feb 26.

Abstract

Large-scale identification of N-linked intact glycopeptides in human serum by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is challenging because of the wide dynamic range of serum protein abundances, the lack of a complete serum N-glycan database, and the existence of proteoforms. To address these challenges, a spectral library search method is presented for the identification of N-linked intact glycopeptides from N-linked glycoproteins in human serum, with target-decoy and motif-specific false discovery rate (FDR) control. Serum proteins were first separated into low-abundance and high-abundance proteins by acetonitrile (ACN) precipitation. After digestion, the N-linked intact glycopeptides were enriched by hydrophilic interaction liquid chromatography (HILIC), and a portion of the enriched N-linked intact glycopeptides was treated with Peptide-N-Glycosidase F (PNGase F) to generate N-linked deglycopeptides. Both N-linked intact glycopeptides and deglycopeptides were analyzed by LC-MS/MS. From the N-linked deglycopeptide data sets, 764 N-linked glycoproteins, 1699 N-linked glycosites, and 3328 unique N-linked deglycopeptides were identified. Four types of N-linked glycosylation motifs (NXS/T/C/V, X≠P) were used to recognize the N-linked deglycopeptides. The spectra of these N-linked deglycopeptides were used to construct an N-linked deglycopeptide library and to identify N-linked intact glycopeptides. A database containing 739 N-glycan masses was constructed and used during the spectral library search for the identification of N-linked intact glycopeptides. In total, 526 N-linked glycoproteins, 1036 N-linked glycosites, 22,677 N-linked intact glycopeptides, and 738 N-glycan masses were identified at 1% FDR, representing the most in-depth serum N-glycoproteome identified by LC-MS/MS at the N-linked intact glycopeptide level.
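
The motif rule stated above (NXS/T/C/V, X≠P) can be illustrated with a minimal sketch. This is not the authors' software: the function name `find_sequons` and the example peptide are hypothetical, and the sketch only scans a plain amino acid sequence for candidate sites without any handling of modifications or scoring.

```python
import re

# Sequon pattern from the abstract: Asn (N), then any residue except Pro (P),
# then Ser, Thr, Cys, or Val. The lookahead leaves the match on N alone,
# so overlapping candidate sites (e.g. "NNCS") are all reported.
SEQUON = re.compile(r"N(?=[^P][STCV])")


def find_sequons(sequence):
    """Return (1-based position of N, three-residue motif) for each candidate site.

    Hypothetical helper for illustration; not part of the published method.
    """
    seq = sequence.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Made-up peptide, used only to exercise the pattern.
    for position, motif in find_sequons("AKNGTLNPSVNNCSR"):
        print(position, motif)
```

In the made-up peptide, the Asn followed by Pro is skipped because of the X≠P constraint, while the two adjacent Asn residues are both reported because each is followed by a valid two-residue suffix.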

Keywords: Glycomics; N-linked intact glycopeptide; bioinformatics software; co-elution; glycoprotein pathways; glycoproteins; glycoproteomics; human serum; isotopic distribution; mass spectrometry; spectral library search.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Biomarkers / blood
  • Blood Coagulation
  • Blood Proteins / analysis
  • Blood Proteins / chemistry
  • Cell Adhesion Molecules / blood
  • Cell Lineage
  • Complement System Proteins / metabolism
  • Databases, Protein
  • Glycopeptides / blood*
  • Glycopeptides / chemistry
  • Glycoproteins / blood
  • Glycoproteins / chemistry
  • Glycosylation
  • Humans
  • Hydrophobic and Hydrophilic Interactions*
  • Molecular Weight
  • Peptide Library*
  • Polysaccharides / chemistry
  • Reference Standards
  • Reproducibility of Results
  • Software

Substances

  • Biomarkers
  • Blood Proteins
  • Cell Adhesion Molecules
  • Glycopeptides
  • Glycoproteins
  • Peptide Library
  • Polysaccharides
  • Complement System Proteins