Raising awareness of potential biases in medical machine learning: Experience from a Datathon

Harry Hochheiser; Jesse Klug; Thomas Mathie; Tom J Pollard; Jesse D Raffa; Stephanie L Ballard; Evamarie A Conrad; Smitha Edakalavan; Allan Joseph; Nader Alnomasy; Sarah Nutman; Veronika Hill; Sumit Kapoor; Eddie Pérez Claudio; Olga V Kravchenko; Ruoting Li; Mehdi Nourelahi; Jenny Diaz; W Michael Taylor; Sydney R Rooney; Maeve Woeltje; Leo Anthony Celi; Christopher M Horvat

doi:10.1101/2024.10.21.24315543

medRxiv [Preprint]. 2024 Nov 2:2024.10.21.24315543. doi: 10.1101/2024.10.21.24315543.

Authors

Harry Hochheiser¹, Jesse Klug², Thomas Mathie³, Tom J Pollard⁴, Jesse D Raffa⁴, Stephanie L Ballard⁵, Evamarie A Conrad⁵, Smitha Edakalavan¹, Allan Joseph⁶,⁷, Nader Alnomasy⁵,⁸, Sarah Nutman³, Veronika Hill⁵, Sumit Kapoor³, Eddie Pérez Claudio¹, Olga V Kravchenko⁹, Ruoting Li³, Mehdi Nourelahi¹, Jenny Diaz⁵, W Michael Taylor³, Sydney R Rooney¹⁰, Maeve Woeltje³, Leo Anthony Celi⁴,¹¹,¹², Christopher M Horvat³

Affiliations

¹ Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
² UPMC Intensive Care Unit Service Center, UPMC, Pittsburgh, PA, USA.
³ Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
⁴ MIT Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
⁵ Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
⁶ Division of Critical Care Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
⁷ Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA.
⁸ College of Nursing, Medical Surgical Department, University of Ha'il, Ha'il, Saudi Arabia.
⁹ Department of Family and Community Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
¹⁰ Division of Cardiology, Department of Pediatrics, Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA.
¹¹ Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.
¹² Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Abstract

Objective: To challenge clinicians and informaticians to learn about potential sources of bias in medical machine learning models through investigation of data and predictions from an open-source severity of illness score.

Methods: Over a two-day period (total elapsed time approximately 28 hours), we conducted a datathon that challenged interdisciplinary teams to investigate potential sources of bias in the Global Open Source Severity of Illness Score (GOSSIS-1). Teams were invited to develop hypotheses, to use tools of their choosing to identify potential sources of bias, and to provide a final report.

Results: Five teams participated, three of which included both informaticians and clinicians. Most (4/5) used Python for analyses; the remaining team used R. Common analysis themes included the relationship of the GOSSIS-1 prediction score with demographics and care-related variables; relationships between demographics and outcomes; calibration and factors related to the context of care; and the impact of missingness. Representativeness of the population, differences in calibration and model performance among groups, and differences in performance across hospital settings were identified as possible sources of bias.
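
Illustrative sketch (not taken from the teams' reports): one simple way to look for differences in calibration among groups is to compare the mean predicted risk with the observed outcome rate within each group. The function name, the record layout, and the toy data below are all hypothetical and synthetic; they are not GOSSIS-1 data or fields, and this is not presented as the method any team used.

```python
from collections import defaultdict


def calibration_by_group(records):
    """Summarize calibration per group.

    `records` is an iterable of (group, predicted_risk, outcome) tuples,
    where outcome is 1 if the event occurred and 0 otherwise.
    (Hypothetical layout, chosen only for this sketch.)
    """
    # Running totals per group: [count, sum of predicted risk, event count]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for group, predicted, outcome in records:
        t = totals[group]
        t[0] += 1
        t[1] += predicted
        t[2] += outcome

    summary = {}
    for group, (n, predicted_sum, events) in totals.items():
        mean_predicted = predicted_sum / n
        observed_rate = events / n
        # Observed-to-expected ratio: values far from 1 suggest the score
        # under- or over-predicts risk for this group.
        ratio = observed_rate / mean_predicted if mean_predicted > 0 else float("nan")
        summary[group] = {
            "n": n,
            "mean_predicted": mean_predicted,
            "observed_rate": observed_rate,
            "observed_to_expected": ratio,
        }
    return summary


# Synthetic toy data: groups "A" and "B" receive identical predicted risks,
# but group "B" has more observed events.
toy_records = [
    ("A", 0.10, 0), ("A", 0.20, 0), ("A", 0.30, 1), ("A", 0.40, 0),
    ("B", 0.10, 0), ("B", 0.20, 1), ("B", 0.30, 1), ("B", 0.40, 0),
]

toy_summary = calibration_by_group(toy_records)
for name in sorted(toy_summary):
    print(name, toy_summary[name])
```

In this toy data the two groups have the same mean predicted risk but different observed rates, which is the kind of gap a subgroup calibration check is meant to surface. A real analysis would also need confidence intervals and adequate group sizes before drawing conclusions.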

Discussion: Datathons are a promising approach for challenging developers and users to explore questions relating to unrecognized biases in medical machine learning algorithms.

Publication types

Preprint
