Using Natural Language Processing to improve EHR Structured Data-based Surgical Site Infection Surveillance

Jianlin Shi; Siru Liu; Liese C C Pruitt; Carolyn L Luppens; Jeffrey P Ferraro; Adi V Gundlapalli; Wendy W Chapman; Brian T Bucher

AMIA Annu Symp Proc. 2020 Mar 4;2019:794-803. eCollection 2019.

Authors

Jianlin Shi¹, Siru Liu¹, Liese C C Pruitt¹, Carolyn L Luppens¹, Jeffrey P Ferraro¹,², Adi V Gundlapalli¹,³, Wendy W Chapman¹, Brian T Bucher¹

Affiliations

¹ School of Medicine, University of Utah, Salt Lake City, Utah, US.
² Intermountain Healthcare, Salt Lake City, Utah, US.
³ VA Salt Lake City Healthcare System, IDEAS Center 2.0, Salt Lake City, Utah, US.

PMID: 32308875
PMCID: PMC7153106

Abstract

Surgical site infection (SSI) surveillance in healthcare systems is labor-intensive and plagued by underreporting, as current methodology relies heavily on manual chart review. The rapid adoption of electronic health records (EHRs) has the potential to allow the secondary use of EHR data for quality surveillance programs. This study aims to investigate the effectiveness of integrating natural language processing (NLP) outputs with structured EHR data to build machine learning models for SSI identification using real-world clinical data. We examined a set of models using structured data with and without NLP document-level, mention-level, and keyword features. The top-performing model was a Random Forest classifier enhanced with NLP document-level features, which achieved a sensitivity of 0.58, specificity of 0.97, positive predictive value (PPV) of 0.54, negative predictive value (NPV) of 0.98, and F0.5 score of 0.52. We further interrogated the feature contributions, analyzed the errors, and discussed future directions.
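
For readers less familiar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity, PPV, NPV, and an F-beta score are conventionally derived from confusion-matrix counts. It is not the authors' evaluation code: the function name `surveillance_metrics` and the counts in the usage lines are hypothetical and are not taken from the study. With beta = 0.5, the F-score weights PPV more heavily than sensitivity.

```python
def surveillance_metrics(tp, fp, tn, fn, beta=0.5):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true/false positive/negative counts (illustrative inputs).
    beta: weight in the F-beta score; beta < 1 favors PPV over sensitivity.
    """
    sensitivity = tp / (tp + fn)   # recall: share of true SSIs that were flagged
    specificity = tn / (tn + fp)   # share of non-SSI cases correctly cleared
    ppv = tp / (tp + fp)           # precision: share of flagged cases that are SSIs
    npv = tn / (tn + fn)           # share of cleared cases that are truly non-SSI
    b2 = beta ** 2
    f_beta = (1 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_beta": f_beta,
    }


# Hypothetical counts chosen only to exercise the function.
metrics = surveillance_metrics(tp=30, fp=20, tn=930, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Metrics averaged over cross-validation folds will generally not match values recomputed from pooled or averaged counts, so this sketch should not be expected to reproduce the figures reported above.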

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms
Decision Trees
Electronic Health Records*
Humans
Information Storage and Retrieval / methods*
Logistic Models
Machine Learning*
Natural Language Processing*
Sensitivity and Specificity
Support Vector Machine
Surgical Wound Infection / diagnosis*

Grants and funding

K08 HS025776/HS/AHRQ HHS/United States