Challenges with quality of race and ethnicity data in observational databases

J Am Med Inform Assoc. 2019 Aug 1;26(8-9):730-736. doi: 10.1093/jamia/ocz113.

Abstract

Objective: We sought to assess the quality of race and ethnicity information in observational health databases, including electronic health records (EHRs), and to propose patient self-recording as an improvement strategy.

Materials and methods: We assessed completeness of race and ethnicity information in large observational health databases in the United States (Healthcare Cost and Utilization Project and Optum Labs), and at a single healthcare system in New York City serving a racially and ethnically diverse population. We compared race and ethnicity data collected via administrative processes with data recorded directly by respondents via paper surveys (National Health and Nutrition Examination Survey and Hospital Consumer Assessment of Healthcare Providers and Systems). Respondent-recorded data were considered the gold standard for the collection of race and ethnicity information.
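
As a rough illustration of the two measures described above (completeness of the administrative field and agreement with respondent-recorded values), the following minimal Python sketch may help. It is not the authors' code: the set of values treated as "unknown", the record layout (dictionaries keyed by a patient identifier), and the function names are all assumptions made for this example.

```python
from typing import Dict, Iterable, Optional

# Hypothetical set of values treated as uninformative; the categories the
# study actually counted as "unknown" are not listed in the abstract.
UNKNOWN_VALUES = {"", "unknown", "declined", "not reported"}


def normalize(value: Optional[str]) -> Optional[str]:
    """Return a lowercased, trimmed value, or None if it is uninformative."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return None if cleaned in UNKNOWN_VALUES else cleaned


def unknown_fraction(values: Iterable[Optional[str]]) -> float:
    """Fraction of records whose race/ethnicity value is missing or unknown."""
    values = list(values)
    if not values:
        raise ValueError("no records supplied")
    missing = sum(1 for v in values if normalize(v) is None)
    return missing / len(values)


def discrepancy_rate(
    admin: Dict[str, Optional[str]],
    self_reported: Dict[str, Optional[str]],
) -> float:
    """Among patients present in both sources with an informative
    self-reported value, the fraction whose administrative value differs.
    A missing administrative value counts as discrepant (an assumption)."""
    compared = 0
    discrepant = 0
    for patient_id, reported in self_reported.items():
        reported_norm = normalize(reported)
        if patient_id not in admin or reported_norm is None:
            continue
        compared += 1
        if normalize(admin[patient_id]) != reported_norm:
            discrepant += 1
    if compared == 0:
        raise ValueError("no comparable records")
    return discrepant / compared


# Toy data, invented purely for illustration.
admin_records = {"p1": "White", "p2": None, "p3": "unknown", "p4": "Asian"}
survey_records = {
    "p1": "White",
    "p2": "Black or African American",
    "p3": "Hispanic or Latino",
    "p4": "asian",
    "p5": "declined",
}

print(unknown_fraction(admin_records.values()))
print(discrepancy_rate(admin_records, survey_records))
```

Treating the respondent-recorded value as the reference mirrors the gold-standard assumption stated in the methods; whether a missing administrative value should count as a discrepancy is a design choice that would need to follow the study's own definition.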

Results: Among the 160 million patients from the Healthcare Cost and Utilization Project and Optum Labs datasets, race or ethnicity was unknown for 25%. Among the 2.4 million patients in the single New York City healthcare system's EHR, race or ethnicity was unknown for 57%. However, when patients directly recorded their race and ethnicity, 86% provided clinically meaningful information, and 66% of patients reported information that was discrepant with the EHR.

Discussion: Race and ethnicity data are critical to support precision medicine initiatives and to determine healthcare disparities; however, the quality of this information in observational databases is concerning. Patient self-recording through the use of patient-facing tools can substantially increase the quality of the information while engaging patients in their health.

Conclusions: Patient self-recording may improve the completeness of race and ethnicity information.

Keywords: data quality; electronic health records; ethnic groups; patient-facing tools.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Databases, Factual*
  • Datasets as Topic
  • Electronic Health Records
  • Ethnicity* / statistics & numerical data
  • Health Care Surveys
  • Healthcare Disparities
  • Hospital Information Systems
  • Humans
  • New York City
  • Nutrition Surveys
  • Racial Groups* / statistics & numerical data
  • Retrospective Studies
  • Self Report
  • United States