Comparative Analysis of Machine-Learning Model Performance in Image Analysis: The Impact of Dataset Diversity and Size

Anesth Analg. 2024 Aug 8. doi: 10.1213/ANE.0000000000007088. Online ahead of print.

Abstract

Background: This study presents an analysis of machine-learning model performance in image analysis, with a specific focus on videolaryngoscopy procedures. The research aimed to explore how dataset diversity and size affect the performance of machine-learning models, an issue vital to the advancement of clinical artificial intelligence tools.

Methods: A total of 377 videolaryngoscopy videos from YouTube were used to create 6 varied datasets, each differing in patient diversity and image count. Data augmentation techniques were also applied to enhance these datasets further. Two machine-learning models, YOLOv5-Small and YOLOv8-Small, were trained and evaluated on F1 score (the harmonic mean of precision and recall, which combines both into a single metric), precision, recall, mAP@50, and mAP@50-95.
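
As a brief illustration of the F1 score named in the Methods, the following minimal sketch computes precision, recall, and F1 from detection counts. The function name `f1_from_counts` and the example counts are hypothetical and are not taken from the study.

```python
from typing import Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) from true-positive, false-positive,
    and false-negative counts. Each ratio is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = f1_from_counts(tp=8, fp=2, fn=2)
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is commonly reported alongside both.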

Results: The findings indicate that dataset configuration, particularly the balance between diversity and quantity, has a significant impact on model performance. The Multi-25 × 10 dataset, featuring 25 images from 10 different patients, demonstrated superior performance, highlighting the value of a well-balanced dataset. The effects of data augmentation varied across the different types of datasets.
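
To make the idea of a dataset that balances diversity and quantity concrete, the sketch below draws a fixed number of images from each of a fixed number of patients. It is a sketch under stated assumptions, not the study's procedure: the abstract does not describe how images were selected, so the random sampling, the function name `sample_balanced_subset`, the frame labels, and the reading of "Multi-25 × 10" as 25 images from each of 10 patients are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def sample_balanced_subset(
    frames_by_patient: Dict[str, List[str]],
    n_patients: int,
    images_per_patient: int,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Pick n_patients at random, then images_per_patient frames from each.

    Random selection is an assumption; the source does not state how
    images were chosen.
    """
    rng = random.Random(seed)
    # Only patients with enough frames can contribute a full quota.
    eligible = sorted(
        p for p, frames in frames_by_patient.items()
        if len(frames) >= images_per_patient
    )
    if len(eligible) < n_patients:
        raise ValueError("not enough patients with the requested number of frames")
    chosen = rng.sample(eligible, n_patients)
    return {p: rng.sample(frames_by_patient[p], images_per_patient) for p in chosen}


# Hypothetical pool: 12 patients with 30 frame identifiers each.
pool = {
    f"patient_{i:02d}": [f"patient_{i:02d}_frame_{j:03d}" for j in range(30)]
    for i in range(12)
}
subset = sample_balanced_subset(pool, n_patients=10, images_per_patient=25)
total_images = sum(len(frames) for frames in subset.values())
```

Varying `n_patients` and `images_per_patient` while holding their product roughly constant is one way to compare diversity against per-patient quantity at a similar total image count.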

Conclusions: Overall, this study emphasizes the critical role of dataset structure in the performance of machine-learning models in medical image analysis. It underscores the necessity of striking an optimal balance between dataset size and diversity, thereby illuminating the complexities inherent in data-driven machine-learning development.