At present, there are no standard criteria that have been validated for interim PET reporting in lymphoma. In 2009, an international workshop attended by hematologists and nuclear medicine experts in Deauville, France, proposed to develop simple and reproducible rules for interim PET reporting in lymphoma. Accordingly, an international validation study was undertaken with the primary aim of validating the prognostic role of interim PET using the Deauville 5-point score to evaluate images and with the secondary aim of measuring concordance rates among reviewers using the same 5-point score. This paper focuses on the criteria for interpretation of interim PET and on concordance rates.
Methods: A cohort of advanced-stage Hodgkin lymphoma patients treated with doxorubicin, bleomycin, vinblastine, and dacarbazine (ABVD) were enrolled retrospectively from centers worldwide. Baseline and interim scans were reviewed by an international panel of 6 nuclear medicine experts using the 5-point score.
Results: Complete scan datasets of acceptable diagnostic quality were available for 260 of 440 (59%) enrolled patients. Independent agreement among reviewers was reached on 252 of 260 patients (97%), for whom at least 4 reviewers agreed the findings were negative (score of 1-3) or positive (score of 4-5). After discussion, consensus was reached in all cases. There were 45 of 260 patients (17%) with positive interim PET findings and 215 of 260 patients (83%) with negative interim PET findings. Thirty-three interim PET-positive scans were true-positive, and 12 were false-positive. Two hundred three interim PET-negative scans were true-negative, and 12 were false-negative. Sensitivity, specificity, and accuracy were 0.73, 0.94, and 0.91, respectively. Negative predictive value and positive predictive value were 0.94 and 0.73, respectively. The 3-y failure-free survival was 83%, 28%, and 95% for the entire population and for interim PET-positive and -negative patients, respectively (P < 0.0001). The agreement between pairs of reviewers was good or very good, ranging from 0.69 to 0.84 as measured with the Cohen kappa. Overall agreement was good at 0.76 as measured with the Krippendorf α.
Conclusion: The 5-point score proposed at Deauville for reviewing interim PET scans in advanced Hodgkin lymphoma is accurate and reproducible enough to be accepted as a standard reporting criterion in clinical practice and for clinical trials.
Keywords: Hodgkin lymphoma; clinical trial; concordance rate; interim PET; interpretation criteria.