Natural Language Processing Combined with ICD-9-CM Codes as a Novel Method to Study the Epidemiology of Allergic Drug Reactions

J Allergy Clin Immunol Pract. 2020 Mar;8(3):1032-1038.e1. doi: 10.1016/j.jaip.2019.12.007. Epub 2019 Dec 16.

Abstract

Background: Allergic drug reaction epidemiologic data are sparse because it remains difficult to identify true cases in large data sets using manual chart review.

Objective: To develop and validate a novel informatics method based on natural language processing (NLP) in combination with International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes that identifies allergic drug reactions in the electronic health record.

Methods: Previously studied and high-yield ICD-9-CM codes were used to screen for possible allergic drug reactions among all inpatients admitted in 2007 and 2008. A random sample was selected for manual chart review to identify true cases of allergic drug reactions. A rule-based NLP algorithm was then developed to identify allergic drug reactions using free-text clinical notes and discharge summaries from the filtered cases. Using manual chart review as the reference standard, the performance of ICD-9-CM codes alone was compared with that of ICD-9-CM codes in combination with NLP.
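
To make the two-stage design (code-based screening, then rule-based NLP on the free text) more concrete, the following is a toy sketch in Python. The abstract does not describe the actual rules, so the reaction terms, drug cues, negation cues, and the function name `flags_allergic_drug_reaction` are all hypothetical placeholders and should not be read as the authors' algorithm.

```python
import re

# HYPOTHETICAL rule lists for illustration only; the abstract does not
# specify the terms or rules used by the published algorithm.
REACTION_TERMS = r"\b(rash|hives|urticaria|anaphylaxis|angioedema)\b"
DRUG_CUES = r"\b(drug|medication|antibiotic|penicillin)\b"
NEGATION_CUES = r"\b(no|denies|without|negative for)\b"


def flags_allergic_drug_reaction(note: str) -> bool:
    """Return True if any sentence mentions a reaction term together with
    a drug cue and contains no negation cue (a deliberately crude rule)."""
    # Split the lowercased note into sentences at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", note.lower()):
        has_reaction = re.search(REACTION_TERMS, sentence) is not None
        has_drug = re.search(DRUG_CUES, sentence) is not None
        negated = re.search(NEGATION_CUES, sentence) is not None
        if has_reaction and has_drug and not negated:
            return True
    return False
```

In a pipeline of this shape, only admissions that already carry one of the screening ICD-9-CM codes would be passed to the text classifier, which is why the NLP step can raise precision without having to search every record.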

Results: Of 3907 cases identified by ICD-9-CM codes, 725 (19%) were randomly selected for manual chart review; 335 were confirmed as allergic drug reactions, resulting in a positive predictive value (PPV) of 46% (range: 18%-79%) when using ICD-9-CM codes alone. Our NLP algorithm in combination with ICD-9-CM codes achieved a PPV of 86% (range: 69%-100%). Among the 335 confirmed positive cases, NLP identified 259 true cases, resulting in a recall/sensitivity of 77% (range: 26%-100%). Among the 390 negative cases, NLP achieved a specificity of 89% (range: 69%-100%).
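
The reported percentages can be re-derived from the counts given in the Results. The sketch below is a minimal check in Python, not part of the study. The number of NLP false positives is not stated in the abstract, so it is back-calculated here from the reported 89% specificity over 390 negative cases; that value, and all function and variable names, are assumptions for illustration.

```python
def screening_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute PPV, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "ppv": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts stated in the abstract.
reviewed = 725                      # charts manually reviewed
confirmed = 335                     # confirmed allergic drug reactions
negatives = reviewed - confirmed    # 390 charts without a confirmed reaction
nlp_true_positives = 259            # confirmed cases that NLP identified

# ICD-9-CM codes alone: every screened chart counts as a positive call,
# so PPV is simply confirmed / reviewed.
ppv_codes_alone = confirmed / reviewed

# ASSUMPTION: true negatives are approximated from the reported 89%
# specificity; the abstract does not give this count directly.
tn = round(0.89 * negatives)
fp = negatives - tn
fn = confirmed - nlp_true_positives

nlp_metrics = screening_metrics(nlp_true_positives, fn, fp, tn)

print(f"PPV, codes alone: {ppv_codes_alone:.0%}")
for name, value in nlp_metrics.items():
    print(f"{name}, codes + NLP: {value:.0%}")
```

Under that assumption the derived values round to the figures reported above (46% PPV for codes alone; 86% PPV, 77% sensitivity, and 89% specificity for codes plus NLP), which suggests the abstract's numbers are internally consistent.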

Conclusion: Using NLP with ICD-9-CM codes improved identification of allergic drug reactions. The resulting decrease in manual chart review effort will facilitate large epidemiology studies of this understudied area.

Keywords: Adverse drug reactions; Drug; Drug allergy; Electronic health record; Epidemiology; Natural language processing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Drug Hypersensitivity* / diagnosis
  • Drug Hypersensitivity* / epidemiology
  • Electronic Health Records
  • Humans
  • International Classification of Diseases
  • Natural Language Processing
  • Pharmaceutical Preparations*

Substances

  • Pharmaceutical Preparations