SciData: a data model and ontology for semantic representation of scientific data

J Cheminform. 2016 Oct 14;8:54. doi: 10.1186/s13321-016-0168-9. eCollection 2016.

Abstract

With the move toward global, Internet-enabled science, there is an inherent need to capture, store, aggregate and search scientific data across a large corpus of heterogeneous data silos. As a result, standards development is needed to create an infrastructure capable of representing the diverse nature of scientific data. This paper describes a fundamental data model for scientific data that can be applied to data currently stored in any format, and an associated ontology that affords semantic representation of the structure of scientific data (and its metadata), upon which discipline-specific semantics can be applied. Applications of this data model to experimental and computational chemistry data are presented, implemented using JavaScript Object Notation for Linked Data (JSON-LD). Full examples are available at the project website (Chalk in SciData: a scientific data model. http://stuchalk.github.io/scidata/, 2016).

Keywords: JSON-LD; Ontology; RDF; Science data; Scientific data model; Semantic annotation.
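
Illustrative sketch (not taken from the paper): the abstract states that the data model is implemented in JSON-LD. The Python snippet below shows, in general terms, how a JSON-LD context can attach semantic identifiers to the keys of an ordinary JSON record. The vocabulary IRI, the term names (quantity, value, unit), the record identifier and the helper expand_term are all hypothetical and are not part of the SciData ontology; the helper performs only simple prefix expansion and is not a full JSON-LD processor.

```python
import json

# Hypothetical vocabulary IRI, for illustration only (NOT the SciData ontology).
EX = "https://example.org/vocab#"

# A minimal JSON-LD-style record: "@context" maps short keys to IRIs,
# so plain JSON keys gain a machine-readable meaning.
doc = {
    "@context": {
        "ex": EX,
        "quantity": "ex:quantity",
        "value": "ex:value",
        "unit": "ex:unit",
    },
    "@id": "https://example.org/data/measurement-1",  # hypothetical identifier
    "quantity": "melting point",
    "value": 273.15,
    "unit": "kelvin",
}


def expand_term(term, context):
    """Resolve a short key to a full IRI (simple prefix expansion only)."""
    mapped = context.get(term, term)
    prefix, sep, suffix = mapped.partition(":")
    # Expand "prefix:suffix" only when the prefix is itself defined in the context.
    if sep and prefix in context:
        return context[prefix] + suffix
    return mapped


serialized = json.dumps(doc, indent=2)
print(serialized)
print(expand_term("unit", doc["@context"]))
```

In this sketch the record remains valid JSON for tools that ignore the context, while the context supplies the link to a vocabulary; the complete, authoritative examples are those on the project website cited in the abstract.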