Machine learning based natural language processing of radiology reports in orthopaedic trauma

A W Olthof; P Shouche; E M Fennema; F F A IJpma; R H C Koolstra; V M A Stirler; P M A van Ooijen; L J Cornelissen

doi:10.1016/j.cmpb.2021.106304

Comput Methods Programs Biomed. 2021 Sep:208:106304. doi: 10.1016/j.cmpb.2021.106304. Epub 2021 Jul 23.

Authors

A W Olthof¹, P Shouche², E M Fennema³, F F A IJpma³, R H C Koolstra⁴, V M A Stirler³, P M A van Ooijen⁵, L J Cornelissen⁶

Affiliations

¹ Department of Radiology, Treant Health Care Group, Dr. G.H. Amshoffweg 1, Hoogeveen, the Netherlands; Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands. Electronic address: a.olthof@treant.nl.
² Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands.
³ Department of Trauma Surgery, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands.
⁴ Department of Radiology, Treant Health Care Group, Dr. G.H. Amshoffweg 1, Hoogeveen, the Netherlands.
⁵ Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands; Machine Learning Lab, Data Science Center in Health (DASH), University Medical Center Groningen, University of Groningen, L.J. Zielstraweg 2, Groningen, the Netherlands.
⁶ Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, Groningen, the Netherlands; COSMONiO Imaging BV, L.J. Zielstraweg 2, Groningen, the Netherlands.

PMID: 34333208
DOI: 10.1016/j.cmpb.2021.106304

Abstract

Objectives: To compare different Machine Learning (ML) Natural Language Processing (NLP) methods for classifying radiology reports in orthopaedic trauma for the presence of injuries. Assessing NLP performance is a prerequisite for downstream tasks and is therefore important from a clinical perspective (avoiding missed injuries, quality checks, insight into diagnostic yield) as well as from a research perspective (identification of patient cohorts, annotation of radiographs).

Methods: Datasets of Dutch radiology reports of injured extremities (n = 2469, 33% fractures) and chest radiographs (n = 799, 20% pneumothorax) were collected in two different hospitals and labeled by radiologists and trauma surgeons for the presence or absence of injuries. NLP classification was applied and optimized by testing different preprocessing steps and different classifiers (Rule-based, ML, and Bidirectional Encoder Representations from Transformers (BERT)). Performance was assessed by F1-score, AUC, sensitivity, specificity and accuracy.
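The metrics named above can be illustrated with a minimal Python sketch for a binary label (1 = injury present, 0 = absent). This is not the authors' evaluation code: the function names `binary_metrics` and `auc`, and the toy labels and scores in the usage note, are assumptions made for illustration only.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and F1 from binary labels (1 = injury)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),        # recall on injured cases
        "specificity": ratio(tn, tn + fp),        # recall on uninjured cases
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of (positive, negative) pairs
    in which the positive case gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])` counts two true positives, two true negatives, one false positive and one false negative. Note that AUC needs a continuous score per report (for instance a predicted probability), whereas the other four metrics are computed from thresholded labels.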

Results: The deep learning based BERT model outperformed all other classification methods assessed. The model achieved an F1-score of (95 ± 2)% and an accuracy of (96 ± 1)% on a dataset of simple reports (n = 2469), and an F1-score of (83 ± 7)% with an accuracy of (93 ± 2)% on a dataset of complex reports (n = 799).

Conclusion: BERT NLP outperforms traditional ML and rule-based classifiers when applied to Dutch radiology reports in orthopaedic trauma.

Keywords: (MeSH); Informatics; Machine learning; Natural language processing; Orthopaedic trauma; Radiology.

MeSH terms

Humans
Machine Learning
Natural Language Processing
Orthopedics*
Radiography
Radiology*