Objectives: Pressure on healthcare resources makes risk stratification and patient prioritisation central to the investigation of suspected colorectal cancer (CRC) in symptomatic patients. The present study uses machine learning algorithms and decision strategies to improve the appropriate use of colonoscopy.
Design: All symptomatic patients in a single health board (2018-2021) who proceeded to colonoscopy for investigation of suspected CRC were included. Machine learning algorithms (neural network, random forest, logistic regression, naïve Bayes and AdaBoost) were used to risk-stratify patients for CRC using demographics, symptoms, quantitative faecal immunochemical test (qFIT) results and haematological tests. Decision curve analyses were performed to determine the optimal decision strategies.
Results: A total of 3776 patients were included (median age, 65 years; M:F, 0.9:1.0), and CRC was identified in 217 patients (5.7%). qFIT > 400 μg Hb/g was the most important variable (%IncMSE = 78.5). Random forest had the highest area under the curve (0.91) and accuracy (0.80) for CRC. Using decision curve analysis (DCA), 30%, 46% and 54% of colonoscopies were saved at accepted CRC probabilities of 1%, 2% and 3%, respectively. Random forest modelling had superior net clinical benefit compared with default colonoscopy strategies.
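For readers unfamiliar with decision curve analysis, the quantities reported above can be illustrated with a minimal sketch. It uses the standard net benefit definition, NB = TP/N - (FP/N) x pt/(1 - pt), where pt is the accepted CRC probability (threshold). The function names and the ten-patient toy data are hypothetical and are not taken from the study; this is not the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of scoping only patients with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_scope_all(y_true, threshold):
    """Net benefit of the default strategy: colonoscopy for every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def colonoscopies_avoided(y_prob, threshold):
    """Fraction of patients whose predicted risk falls below the threshold."""
    return sum(1 for p in y_prob if p < threshold) / len(y_prob)


# Hypothetical toy cohort: 1 = CRC found at colonoscopy, 0 = no CRC.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical model-predicted CRC probabilities for the same patients.
y_prob = [0.6, 0.2, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005, 0.005, 0.005]

for pt in (0.01, 0.02, 0.03):
    print(
        f"threshold={pt:.2f}",
        f"model NB={net_benefit(y_true, y_prob, pt):.4f}",
        f"scope-all NB={net_benefit_scope_all(y_true, pt):.4f}",
        f"avoided={colonoscopies_avoided(y_prob, pt):.0%}",
    )
```

A model-guided strategy is preferred at a given threshold when its net benefit exceeds that of both default strategies: colonoscopy for all, and colonoscopy for none (net benefit of zero).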
Conclusions: Machine learning-derived decision strategies that account for patient and referrer risk preference reduce colonoscopy demand and carry net clinical benefit compared with default colonoscopy strategies.
Keywords: Colorectal cancer; colonoscopy; decision-curve analysis.