Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems

AMIA Annu Symp Proc. 2021 Jan 25;2020:860-869. eCollection 2020.

Abstract

Bleeding events are a common adverse drug reaction among patients on anticoagulation and factor critically into a clinician's decision to prescribe or continue anticoagulation for atrial fibrillation. However, bleeding events are not uniformly captured in the administrative data of electronic health records (EHRs). Because manual review is prohibitively expensive, we investigate the effectiveness of various natural language processing (NLP) methods for automatically extracting bleeding events. Using 1,079 expert-annotated, de-identified EHR notes, we evaluated state-of-the-art NLP models, including a biLSTM-CRF with language modeling and several BERT variants, on six entity types. On our dataset, the biLSTM-CRF outperformed the other models, achieving a macro F1-score of 0.75, whereas for sentence- and document-level predictions the performance differences were negligible, with best macro F1-scores of 0.84 and 0.96, respectively. Our error analyses suggest that the models' incorrect predictions can be attributed to variability in entity spans, memorization, and missing negation signals.
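
As an illustration of how a macro F1-score can differ between entity-level and document-level evaluation, the following is a minimal sketch and not the authors' evaluation code. It assumes exact span matching at the entity level and treats a document-level prediction as "this label occurs somewhere in the note". The label names (`bleeding_event`, `site`), the tuple layout, and the helper names `macro_f1` and `to_document_level` are hypothetical, since the abstract does not name the six entity types or describe the scoring script.

```python
def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 over `labels`.

    `gold` and `pred` are sets of tuples whose last element is the label.
    A prediction counts as correct only if the whole tuple matches.
    """
    scores = []
    for label in labels:
        g = {item for item in gold if item[-1] == label}
        p = {item for item in pred if item[-1] == label}
        tp = len(g & p)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


def to_document_level(entities):
    """Collapse (doc_id, start, end, label) spans to (doc_id, label) pairs."""
    return {(doc_id, label) for doc_id, _start, _end, label in entities}


# Hypothetical labels and spans, for illustration only.
labels = ["bleeding_event", "site"]
gold = {
    ("note1", 0, 8, "bleeding_event"),
    ("note1", 20, 25, "site"),
    ("note2", 5, 12, "bleeding_event"),
}
pred = {
    ("note1", 0, 8, "bleeding_event"),
    ("note1", 20, 27, "site"),  # right label, span boundary off by two
    ("note2", 5, 12, "bleeding_event"),
}

entity_score = macro_f1(gold, pred, labels)
document_score = macro_f1(to_document_level(gold), to_document_level(pred), labels)
print(entity_score, document_score)
```

In this toy case a single span-boundary mismatch lowers the entity-level score while leaving the document-level score untouched, which is one way coarser granularities can show higher scores and smaller gaps between models.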

MeSH terms

  • Drug-Related Side Effects and Adverse Reactions*
  • Electronic Health Records*
  • Hemorrhage / diagnosis*
  • Humans
  • Language
  • Natural Language Processing