Validity of Race, Ethnicity, and National Origin in Population-based Cancer Registries and Rapid Case Ascertainment Enhanced With a Spanish Surname List

Lisa C Clarke; Rudolph P Rull; John Z Ayanian; Robert Boer; Dennis Deapen; Dee W West; Katherine L Kahn

doi:10.1097/MLR.0b013e3182a30350

Med Care. 2016 Jan;54(1):e1-8. doi: 10.1097/MLR.0b013e3182a30350.

Authors

Lisa C Clarke¹, Rudolph P Rull, John Z Ayanian, Robert Boer, Dennis Deapen, Dee W West, Katherine L Kahn

Affiliation

¹ *Epidemiology Program, County of Marin, San Rafael, CA; †School of Community Health Sciences, University of Nevada, Reno, Reno, NV; ‡Department of Health Research and Policy, Stanford School of Medicine, Stanford, CA; §Brigham and Women's Hospital; ∥Department of Health Policy, Harvard Medical School, Boston, MA; ¶Department of Public Health, Erasmus Medical Center, Rotterdam, The Netherlands; #Department of Preventive Medicine, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, Los Angeles; **Cancer Prevention Institute of California, Fremont; ††The RAND Corporation, Santa Monica; ‡‡UCLA School of Medicine, Los Angeles, CA.

Abstract

Background: Accurate information regarding race, ethnicity, and national origins is critical for identifying disparities in the cancer burden.

Objectives: To examine the use of a Spanish surname list to improve the quality of race-related information obtained from rapid case ascertainment (RCA) and to estimate the accuracy of race-related information obtained from cancer registry records collected by routine reporting.

Subjects: Self-reported survey responses of 3954 participants from California enrolled in the Cancer Care Outcomes Research and Surveillance Consortium.

Measures: Sensitivity, specificity, positive predictive value, and percent agreement. We used logistic regression to identify predictors of underreporting and overreporting of a race/ethnicity.
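
As an illustration of the four measures named above, the following minimal sketch computes them for a single race/ethnicity group from paired classifications. It assumes self-report is treated as the reference standard and the registry (or RCA) record as the source being evaluated; the function name, the data layout, and the toy values are hypothetical and are not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class ValidityMeasures:
    sensitivity: float
    specificity: float
    ppv: float
    percent_agreement: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined.
    return numerator / denominator if denominator else math.nan


def validity_measures(self_report, registry) -> ValidityMeasures:
    """Compare two parallel sequences of booleans for one group.

    self_report[i] is True if participant i self-identifies with the group
    (assumed reference standard); registry[i] is True if the registry
    record assigns participant i to the group.
    """
    pairs = list(zip(self_report, registry))
    tp = sum(1 for s, r in pairs if s and r)          # both sources agree: in group
    fn = sum(1 for s, r in pairs if s and not r)      # registry misses the group
    fp = sum(1 for s, r in pairs if not s and r)      # registry over-assigns the group
    tn = sum(1 for s, r in pairs if not s and not r)  # both sources agree: not in group

    return ValidityMeasures(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        percent_agreement=100.0 * _ratio(tp + tn, len(pairs)),
    )


# Toy data for illustration only (not study data).
toy_self_report = [True, True, True, True, False, False, False, False, False, False]
toy_registry = [True, True, True, False, True, False, False, False, False, False]
toy_result = validity_measures(toy_self_report, toy_registry)
```

Under these assumptions, a missed case (self-reported member not recorded by the registry) lowers sensitivity, and an over-assigned case (registry assigns a group the participant did not report) lowers specificity and positive predictive value.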

Results: Use of the Spanish surname list increased the sensitivity of RCA for Latino ethnicity from 37% to 83%. Sensitivity for cancer registry records collected by routine reporting was ≥95% for whites, blacks, and Asians, and specificity was high for all groups (86%-100%). However, patterns of misclassification by race/ethnicity were found that could lead to biased cancer statistics for specific race/ethnicities. Discordance between self-reported and registry-reported race/ethnicity was more likely for women, Latinos, and Asians.

Conclusions: Methods to improve race and ethnicity data, such as using Spanish surnames in RCA and instituting data collection guidelines for hospitals, are needed to ensure minorities are accurately represented in clinical and epidemiological research.

Publication types

Validation Study

MeSH terms

California
Data Collection / methods*
Female
Healthcare Disparities*
Hispanic or Latino / statistics & numerical data*
Humans
Male
Neoplasms / epidemiology*
Population Surveillance / methods
Registries / standards*
