Ontology engineering

Nat Biotechnol. 2010 Feb;28(2):128-30. doi: 10.1038/nbt0210-128.

Abstract

Gene Ontology (GO) and similar biomedical ontologies are critical tools of today's genetic research. These ontologies are crafted through a painstaking process of manual editing, and their organization relies on the intuition of human curators. Here we describe a method that uses information theory to automatically organize the structure of GO and optimize the distribution of the information within it. We used this approach to analyze the evolution of GO, and we identified several areas where the information was suboptimally organized. We optimized the structure of GO and used it to analyze 10,117 gene expression signatures. The use of this new version changed the functional interpretations of 97.5% (p < 10⁻³) of the signatures by, on average, 14.6%. As a result of this analysis, several changes will be introduced in the next releases of GO. We expect that these formal methods will become the standard for engineering biomedical ontologies.
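
The abstract does not state which information-theoretic measure the method uses. Purely as an illustration, one widely used way to quantify the information carried by an ontology term is its information content, IC(t) = -log2 p(t), where p(t) is the fraction of annotated genes that fall under the term or any of its descendants. The sketch below computes this on a toy ontology. The term names, gene names, and function names are all hypothetical, and this is not presented as the authors' method.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms (not real GO data).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
}

# Hypothetical direct annotations: gene -> terms it is annotated to.
ANNOTATIONS = {
    "geneA": ["lipid_metabolism"],
    "geneB": ["metabolism"],
    "geneC": ["signaling"],
    "geneD": ["lipid_metabolism"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def information_content():
    """IC(t) = -log2(fraction of genes annotated to t or a descendant of t)."""
    genes_per_term = {term: set() for term in PARENTS}
    for gene, terms in ANNOTATIONS.items():
        for term in terms:
            # An annotation to a term also counts for every ancestor.
            for anc in ancestors(term):
                genes_per_term[anc].add(gene)
    total = len(ANNOTATIONS)
    return {
        term: -math.log2(len(genes) / total)
        for term, genes in genes_per_term.items()
        if genes  # terms with no annotated genes have undefined IC
    }


ic = information_content()
for term in sorted(ic, key=ic.get):
    print(f"{term}: {ic[term]:.3f}")
```

In this toy case the root term covers every gene and so carries no information, while more specific terms score higher. A term that sits deep in the hierarchy yet has low information content, or a shallow term with high information content, is the kind of mismatch an information-based reorganization might flag.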

Publication types

  • Letter
  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Data Mining / methods*
  • Genes / genetics*
  • Genetic Engineering / statistics & numerical data*
  • Proteins / classification*
  • Proteins / genetics*
  • Terminology as Topic*

Substances

  • Proteins