Development and Validation of a Deep Learning Model for Detection of Allergic Reactions Using Safety Event Reports Across Hospitals

Jie Yang; Liqin Wang; Neelam A Phadke; Paige G Wickner; Christian M Mancini; Kimberly G Blumenthal; Li Zhou

doi:10.1001/jamanetworkopen.2020.22836

JAMA Netw Open. 2020 Nov 2;3(11):e2022836. doi: 10.1001/jamanetworkopen.2020.22836.

Authors

Jie Yang^{1,2}, Liqin Wang^{1,2}, Neelam A Phadke^{2,3}, Paige G Wickner^{2,4}, Christian M Mancini^{2,3}, Kimberly G Blumenthal^{2,3}, Li Zhou^{1,2}

Affiliations

¹ Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, Massachusetts.
² Harvard Medical School, Boston, Massachusetts.
³ Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston.
⁴ Division of Allergy and Clinical Immunology, Brigham and Women's Hospital, Boston, Massachusetts.

Abstract

Importance: Although critical to patient safety, health care-related allergic reactions are challenging to identify and monitor.

Objective: To develop a deep learning model to identify allergic reactions in the free-text narrative of hospital safety reports and evaluate its generalizability, efficiency, productivity, and interpretability.

Design, setting, and participants: This cross-sectional study analyzed hospital safety reports filed between May 2004 and January 2019 at Brigham and Women's Hospital and between April 2006 and June 2018 at Massachusetts General Hospital in Boston. To train and validate the deep learning model, safety reports from Massachusetts General Hospital were extracted using 101 expert-curated keywords (data set I). The model was then evaluated on 3 data sets: reports without keywords (data set II), reports from a different time frame (data set III), and reports from a different hospital (Brigham and Women's Hospital; data set IV). Statistical analyses were performed between March 1, 2019, and July 18, 2020.
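The keyword-based split between reports with and without keywords can be illustrated with a minimal sketch. This is not the authors' pipeline: the three keywords are only the example words named in the Results (the actual list of 101 expert-curated keywords is not reproduced here), and the function names `build_pattern` and `split_by_keywords` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: illustrative keywords only; the study used 101 expert-curated
# keywords that are not listed in this abstract.
ILLUSTRATIVE_KEYWORDS = ["rash", "hives", "benadryl"]


def build_pattern(keywords: Iterable[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive, whole-word pattern for all keywords."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def split_by_keywords(
    reports: Iterable[str], keywords: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (reports containing any keyword, reports containing none)."""
    pattern = build_pattern(keywords)
    with_kw: List[str] = []
    without_kw: List[str] = []
    for text in reports:
        (with_kw if pattern.search(text) else without_kw).append(text)
    return with_kw, without_kw
```

In this sketch, the first returned list corresponds to a keyword-filtered set such as data set I, and the second to a no-keyword set such as data set II.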

Main outcomes and measures: The area under the receiver operating characteristic curve and area under the precision-recall curve were used on data set I. The precision at top-k was used on data sets II to IV.
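Precision at top-k is the fraction of the k highest-scored reports that manual review confirms as true cases. A minimal sketch of the metric follows, assuming model scores and binary review labels are available as parallel sequences. The function name `precision_at_k` is hypothetical and is not taken from the study's code.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scored items whose label is 1.

    If fewer than k items exist, the fraction is taken over all items.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("scores must not be empty")
    # Rank items from highest to lowest model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)
```

Under this definition, a precision of 0.930 at the top 100 means that 93 of the 100 highest-scored reports were confirmed cases.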

Results: A total of 299 028 safety reports with 172 854 patients were included. Of these patients, 86 544 were women (50.1%) and the median (interquartile range [IQR]) age was 59.7 (43.8-71.6) years. The deep learning model achieved an area under the receiver operating characteristic curve of 0.979 (95% CI, 0.973-0.985) and an area under the precision-recall curve of 0.809 (95% CI, 0.773-0.845). The model achieved precisions at the top 100 model-identified cases of 0.930 in data set II, 0.960 in data set III, and 0.990 in data set IV. Compared with the keyword-search approach, the deep learning model reduced the number of cases for manual review by 63.8% and identified 24.2% more cases of confirmed allergic reactions. The model highlighted important words (eg, rash, hives, and Benadryl) in prediction and extended the list of expert-curated keywords through an attention layer.

Conclusions and relevance: This study showed that a deep learning model can accurately and efficiently identify allergic reactions using free-text narratives written by a variety of health care professionals. This model could be used to improve allergy care, potentially enabling real-time event surveillance and guidance for medical errors and system improvement.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Adult
Aged
Algorithms*
Boston
Cross-Sectional Studies
Deep Learning*
Diagnosis, Computer-Assisted / methods*
Female
Humans
Hypersensitivity / diagnosis*
Male
Middle Aged
Patient Safety / statistics & numerical data*
ROC Curve
Reproducibility of Results
