An exploration of ontology-based EMR data abstraction for diabetic kidney disease prediction

AMIA Jt Summits Transl Sci Proc. 2019 May 6:2019:704-713. eCollection 2019.

Abstract

Diabetic kidney disease (DKD) is a critical and morbid complication of diabetes and the leading cause of chronic kidney disease in the developed world. Electronic medical records (EMRs) hold promise for supporting clinical decision-making, given their nationwide adoption and the rich information they contain about patients' health care experience. However, few retrospective studies have fully utilized EMR data to model DKD risk. This study examines the effectiveness of an unbiased, data-driven approach that draws on a broader spectrum of EMR data to identify potential DKD patients 6 months prior to onset. We also evaluate how different levels of data granularity for Medications and Diagnoses observations affect prediction performance and knowledge discovery. The experimental results suggest that data granularity may not necessarily influence prediction accuracy, but it can dramatically change the internal structure of the predictive models.

Keywords: DKD; Data Representation; EMR ontology; Gradient Boosting Machine; Predictive Modeling.
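
To make the notion of "data granularity" in the abstract concrete, the following is a minimal sketch of how raw diagnosis concepts might be rolled up to coarser ancestors in an ontology before being used as model features. It is not the authors' implementation: the `PARENT` hierarchy, the concept names, and the functions `roll_up` and `abstract_features` are all hypothetical, introduced only for illustration.

```python
# Hypothetical toy hierarchy (child -> parent). These concept names are
# illustrative assumptions, not the ontology used in the study.
PARENT = {
    "diabetes_renal_type2_controlled": "diabetes_renal",
    "diabetes_renal_type2_uncontrolled": "diabetes_renal",
    "diabetes_renal": "diabetes_mellitus",
}


def roll_up(code, parent, levels):
    """Climb at most `levels` steps up the hierarchy, stopping at a root."""
    for _ in range(levels):
        if code not in parent:
            break
        code = parent[code]
    return code


def abstract_features(patient_codes, parent, levels):
    """Map each patient's raw concepts to the sorted set of coarser concepts."""
    return {
        pid: sorted({roll_up(c, parent, levels) for c in codes})
        for pid, codes in patient_codes.items()
    }


# Two hypothetical patients with fine-grained diagnosis concepts.
patients = {
    "p1": ["diabetes_renal_type2_controlled", "diabetes_renal_type2_uncontrolled"],
    "p2": ["diabetes_renal_type2_controlled"],
}

fine = abstract_features(patients, PARENT, levels=0)    # raw granularity
coarse = abstract_features(patients, PARENT, levels=1)  # one level coarser
```

In this toy example the two patients have different feature sets at the raw level but identical ones after a single roll-up, which illustrates how a change of granularity alters the features available to a model even when the underlying records are unchanged.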