A study of deep learning approaches for medication and adverse drug event extraction from clinical text

Qiang Wei; Zongcheng Ji; Zhiheng Li; Jingcheng Du; Jingqi Wang; Jun Xu; Yang Xiang; Firat Tiryaki; Stephen Wu; Yaoyun Zhang; Cui Tao; Hua Xu

doi:10.1093/jamia/ocz063

J Am Med Inform Assoc. 2020 Jan 1;27(1):13-21. doi: 10.1093/jamia/ocz063.

Authors

Qiang Wei¹, Zongcheng Ji¹, Zhiheng Li², Jingcheng Du¹, Jingqi Wang¹, Jun Xu¹, Yang Xiang¹, Firat Tiryaki¹, Stephen Wu¹, Yaoyun Zhang¹, Cui Tao¹, Hua Xu¹

Affiliations

¹ School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA.
² School of Computer Science and Technology, Dalian University of Technology, Dalian, China.

Abstract

Objective: This article presents our approaches to extraction of medications and associated adverse drug events (ADEs) from clinical documents, which is the second track of the 2018 National NLP Clinical Challenges (n2c2) shared task.

Materials and methods: The clinical corpus used in this study was drawn from the MIMIC-III database; the organizers annotated 303 documents for training and 202 for testing. Our system consists of 2 components: a named entity recognition (NER) component and a relation classification (RC) component. For each component, we implemented deep learning-based approaches (eg, BI-LSTM-CRF) and compared them with traditional machine learning approaches, namely conditional random fields for NER and support vector machines for RC. In addition, we developed a deep learning-based joint model that recognizes ADEs and their relations to medications in 1 step using a sequence labeling approach. To further improve performance, we also investigated different ensemble approaches that combine the outputs of multiple models.
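The abstract does not say which ensemble strategies were tried. As one common possibility, the sketch below shows majority voting over the entity spans predicted by several NER systems. The function name `majority_vote`, the `(start, end, label)` span representation, and the default threshold are illustrative assumptions, not details taken from the study.

```python
from collections import Counter

def majority_vote(system_outputs, min_votes=None):
    """Keep every span predicted by at least `min_votes` systems.

    system_outputs: list of sets of (start, end, label) tuples,
    one set per system (a hypothetical representation).
    min_votes: defaults to a strict majority of the systems.
    """
    if min_votes is None:
        min_votes = len(system_outputs) // 2 + 1
    # Count how many systems proposed each exact span.
    counts = Counter(span for output in system_outputs for span in set(output))
    return {span for span, votes in counts.items() if votes >= min_votes}

# Toy outputs from three hypothetical systems on one document.
crf = {(0, 9, "Drug"), (15, 19, "ADE")}
bilstm_crf = {(0, 9, "Drug"), (15, 19, "ADE"), (30, 35, "Dosage")}
joint = {(0, 9, "Drug"), (40, 44, "ADE")}

ensemble = majority_vote([crf, bilstm_crf, joint])
```

With three systems the default threshold is 2, so a span proposed by only one system is dropped. Raising `min_votes` trades recall for precision.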

Results: Our best-performing systems achieved F1 scores of 93.45% for NER, 96.30% for RC, and 89.05% for end-to-end evaluation, which ranked #2, #1, and #1 among all participants, respectively. Additional evaluations show that the deep learning-based approaches did outperform traditional machine learning algorithms in both NER and RC. The joint model that simultaneously recognizes ADEs and their relations to medications also achieved the best performance on RC, indicating its promise for relation extraction.
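For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it over sets of annotations using exact matching; the matching criterion and the name `precision_recall_f1` are assumptions for illustration, and the challenge's official scoring may differ.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 over two sets of annotations.

    gold, predicted: sets of hashable annotations, for example
    (start, end, label) tuples. Exact match is assumed here.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: 3 of 4 predictions are correct, and 3 of 4 gold spans are found.
gold = {(0, 9, "Drug"), (15, 19, "ADE"), (30, 35, "Dosage"), (50, 55, "Route")}
predicted = {(0, 9, "Drug"), (15, 19, "ADE"), (30, 35, "Dosage"), (60, 64, "ADE")}

p, r, f1 = precision_recall_f1(gold, predicted)
```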

Conclusion: In this study, we developed deep learning approaches for extracting medications and their attributes such as ADEs, and demonstrated their superior performance compared with traditional machine learning algorithms, indicating their potential for broader NER and RC tasks in the medical domain.

Keywords: adverse drug events; deep learning; electronic health records; named entity recognition; relation extraction.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Deep Learning*
Drug-Related Side Effects and Adverse Reactions*
Electronic Health Records*
Humans
Information Storage and Retrieval / methods*
Machine Learning
Narration
Natural Language Processing*
Pharmaceutical Preparations

Substances

Pharmaceutical Preparations
