Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database

Yukie Akune; Chi-Hung Lin; Jodie L Abrahams; Jingyu Zhang; Nicolle H Packer; Kiyoko F Aoki-Kinoshita; Matthew P Campbell

doi:10.1016/j.carres.2016.05.012

Carbohydr Res. 2016 Aug 5:431:56-63. doi: 10.1016/j.carres.2016.05.012. Epub 2016 May 30.

Authors

Yukie Akune¹, Chi-Hung Lin², Jodie L Abrahams², Jingyu Zhang², Nicolle H Packer², Kiyoko F Aoki-Kinoshita³, Matthew P Campbell⁴

Affiliations

¹ Department of Chemistry and Biomolecular Sciences, Faculty of Science & Engineering, Macquarie University, Balaclava Road, North Ryde, NSW, 2109, Australia; Department of Bioinformatics, Graduate School of Engineering, Soka University, 1-236, Tangi, Hachioji, 192-8577, Tokyo, Japan.
² Department of Chemistry and Biomolecular Sciences, Faculty of Science & Engineering, Macquarie University, Balaclava Road, North Ryde, NSW, 2109, Australia.
³ Department of Bioinformatics, Graduate School of Engineering, Soka University, 1-236, Tangi, Hachioji, 192-8577, Tokyo, Japan.
⁴ Department of Chemistry and Biomolecular Sciences, Faculty of Science & Engineering, Macquarie University, Balaclava Road, North Ryde, NSW, 2109, Australia. Electronic address: matthew.campbell@mq.edu.au.

PMID: 27318307

Abstract

Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict which structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database, called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database that stores experimentally described glycan structures reported in the literature, and demonstrated that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.
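
The general idea of generating a theoretical structure set from enzyme specificities can be illustrated with a minimal sketch. It is not the authors' implementation: the three rules, the enzyme names, the simplified three-residue seed and the tree encoding below are all hypothetical and chosen only for illustration. Each rule says that an enzyme adds a donor residue to an acceptor residue at a given linkage position if that position is free, and a breadth-first search applies the rules repeatedly until a residue limit is reached, recording both the structures and the reactions that connect them.

```python
from collections import deque

# Hypothetical toy rules, NOT the 45-enzyme library from the paper:
# (enzyme name, acceptor residue, linkage position, donor residue)
RULES = [
    ("GlcNAcT", "Man", "b2", "GlcNAc"),
    ("GalT", "GlcNAc", "b4", "Gal"),
    ("SiaT", "Gal", "a3", "Neu5Ac"),
]

def make(residue, children=()):
    """Build a node (residue, children); children are sorted so equal trees compare equal."""
    return (residue, tuple(sorted(children)))

def size(node):
    """Count the monosaccharide residues in a tree."""
    return 1 + sum(size(child) for _, child in node[1])

def extend(node, rule):
    """Yield every tree obtained by applying one rule once, anywhere in the tree."""
    _, acceptor, linkage, donor = rule
    residue, children = node
    # Apply at this node if it is the acceptor and the linkage position is free.
    if residue == acceptor and all(l != linkage for l, _ in children):
        yield make(residue, children + ((linkage, make(donor)),))
    # Otherwise (and additionally) try every position deeper in the tree.
    for i, (l, child) in enumerate(children):
        for new_child in extend(child, rule):
            yield make(residue, children[:i] + ((l, new_child),) + children[i + 1:])

def enumerate_structures(seed, rules, max_residues):
    """Breadth-first enumeration of structures and reactions up to max_residues."""
    seen = {seed}
    reactions = set()  # (substrate, enzyme name, product)
    queue = deque([seed])
    while queue:
        glycan = queue.popleft()
        if size(glycan) >= max_residues:
            continue  # adding another residue would exceed the limit
        for rule in rules:
            for product in extend(glycan, rule):
                reactions.add((glycan, rule[0], product))
                if product not in seen:
                    seen.add(product)
                    queue.append(product)
    return seen, reactions

# Simplified seed (an assumption): a core node carrying two mannose arms.
SEED = make("Core", (("a3", make("Man")), ("a6", make("Man"))))

if __name__ == "__main__":
    structures, reactions = enumerate_structures(SEED, RULES, max_residues=9)
    print(len(structures), "structures,", len(reactions), "reactions")
```

With this toy rule set each arm can take one of four states, so the search terminates quickly; a realistic enzyme library with branching and competing specificities grows combinatorially, which is consistent with the size limit of 15 residues stated in the abstract.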

Keywords: Glycoinformatics; Human glycosyltransferases; N-glycan synthetic pathway.

MeSH terms

Biosynthetic Pathways
Computational Biology / methods*
Databases, Genetic*
Glycosyltransferases / chemistry
Glycosyltransferases / genetics*
Glycosyltransferases / metabolism
Humans
Molecular Structure
Polysaccharides / biosynthesis
Polysaccharides / chemistry*

Substances

Polysaccharides
Glycosyltransferases