Evaluating the comparability of patient-level social risk data extracted from electronic health records: A systematic scoping review

Health Informatics J. 2023 Jul-Sep;29(3):14604582231200300. doi: 10.1177/14604582231200300.

Abstract

Objective: To evaluate how and from where social risk data are extracted from EHRs for research purposes, and how observed differences may impact study generalizability.

Methods: Systematic scoping review of peer-reviewed literature that used patient-level EHR data to assess one or more of six social risk domains: housing, transportation, food, utilities, safety, and social support/isolation.

Results: 111/9022 identified articles met inclusion criteria. By domain, social support/isolation was most often included (N = 68/111), predominantly defined by marital/partner status (N = 48/68) and extracted from structured sociodemographic data (N = 45/48). Housing risk was defined primarily by homelessness (N = 39/49). Structured housing data were most often extracted from billing codes and screening tools (N = 15/30 and 13/30, respectively). Across domains, data were predominantly sourced from structured fields (N = 89/111) versus unstructured free text (N = 32/111).

Conclusion: We identified wide variability in how social domains are defined and extracted from EHRs for research. More consistency, particularly in how domains are operationalized, would enable greater insights across studies.

Keywords: data mining; electronic health records; machine learning; social determinants of health; social domains; social risk factors; structured data; unstructured data.

Publication types

  • Systematic Review
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Electronic Health Records*
  • Humans
  • Social Support*