Item response theory model highlighting rating scale of a rubric and rater-rubric interaction in objective structured clinical examination

Masaki Uto; Jun Tsuruta; Kouji Araki; Maomi Ueno

doi:10.1371/journal.pone.0309887

PLoS One. 2024 Sep 6;19(9):e0309887. doi: 10.1371/journal.pone.0309887. eCollection 2024.

Authors

Masaki Uto¹, Jun Tsuruta², Kouji Araki³, Maomi Ueno¹

Affiliations

¹ Department of Computer and Network Engineering, The University of Electro-Communications, Chofu, Tokyo, Japan.
² Institute of Education, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan.
³ Educational System in Dentistry, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan.

Abstract

Objective structured clinical examinations (OSCEs) are a widely used form of performance assessment for medical and dental students. A common limitation of OSCEs is that the evaluation results depend on the characteristics of the raters and of the scoring rubric. To overcome this limitation, item response theory (IRT) models such as the many-facet Rasch model have been proposed to estimate examinee abilities while taking into account the characteristics of raters and of the evaluation items in a rubric. However, conventional IRT models rest on two impractical assumptions: constant rater severity across all evaluation items in a rubric, and an equal-interval rating scale shared across evaluation items. These assumptions can degrade model fit and the accuracy of ability measurement. To resolve this problem, we propose a new IRT model that introduces two parameters: (1) a rater-item interaction parameter representing the rater severity for each evaluation item and (2) an item-specific step-difficulty parameter representing the difference in rating scales among evaluation items. We demonstrate the effectiveness of the proposed model by applying it to actual data collected from a medical interview test conducted at Tokyo Medical and Dental University as part of a post-clinical clerkship OSCE. The experimental results showed that the proposed model fitted our OSCE data well and measured ability accurately. Furthermore, it provided abundant information on rater and item characteristics that conventional models cannot provide, helping us to better understand rater and item properties.
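
To illustrate how the two parameters described in the abstract could enter a rating-category probability, the following is a minimal sketch and not the authors' exact formulation. It assumes a partial-credit-style many-facet model in which the log-odds of moving from category k-1 to k is theta - beta_item - beta_rater_item - d_k. The function names, argument names, and this parameterization are illustrative assumptions; the published model may include further parameters (for example discrimination or rater consistency) that are omitted here.

```python
import math


def category_probs(theta, beta_item, beta_rater_item, steps):
    """Probabilities of rating categories 0..K-1 for one examinee/item/rater.

    Assumed (hypothetical) parameterization, not taken from the paper:
      theta            examinee ability
      beta_item        overall difficulty of the evaluation item
      beta_rater_item  severity of this rater on this specific item
      steps            item-specific step difficulties d_1..d_{K-1}
    """
    # Cumulative logits: category 0 is the reference with logit 0.
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + (theta - beta_item - beta_rater_item - d))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, beta_item, beta_rater_item, steps):
    """Expected rating under the sketch model above."""
    probs = category_probs(theta, beta_item, beta_rater_item, steps)
    return sum(k * p for k, p in enumerate(probs))


if __name__ == "__main__":
    # Same examinee and item, two raters who differ only in their
    # item-specific severity (values chosen arbitrarily for illustration).
    steps_item = [-1.0, 0.0, 1.0]  # four rating categories: 0, 1, 2, 3
    print(expected_score(0.0, 0.0, -0.5, steps_item))  # more lenient rater
    print(expected_score(0.0, 0.0, 0.5, steps_item))   # more severe rater
```

Under these assumptions, giving each item its own `steps` list lets the spacing between rating categories differ by item, and giving each rater-item pair its own `beta_rater_item` lets a rater be severe on one item and lenient on another, which is the flexibility the abstract says conventional models lack.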

Copyright: © 2024 Uto et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Clinical Competence
Educational Measurement* / methods
Humans
Models, Theoretical
Students, Dental
Students, Medical

Grants and funding

This work was supported by JSPS KAKENHI Grant Number 19H05663. The acronym JSPS stands for “Japan Society for the Promotion of Science.” The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.