Natural language processing for the surveillance of postoperative venous thromboembolism

Surgery. 2021 Oct;170(4):1175-1182. doi: 10.1016/j.surg.2021.04.027. Epub 2021 Jun 3.

Abstract

Background: The objective of this study was to develop a portable natural language processing approach to aid in the identification of postoperative venous thromboembolism events from free-text clinical notes.

Methods: We abstracted clinical notes from 25,494 operative events from 2 independent health care systems. A venous thromboembolism detected as part of the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) was used as the reference standard. A natural language processing engine, easy clinical information extractor-pulmonary embolism/deep vein thrombosis (EasyCIE-PEDVT), was trained to detect pulmonary embolism and deep vein thrombosis from clinical notes. International Classification of Diseases (ICD) discharge diagnosis codes for venous thromboembolism were used as baseline comparators. The classification performance of EasyCIE-PEDVT was compared with that of International Classification of Diseases codes using sensitivity, specificity, and area under the receiver operating characteristic curve in internal and external validation cohorts.
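
As an illustration of the comparison metrics named in the Methods, the following minimal Python sketch computes sensitivity, specificity, and area under the receiver operating characteristic curve for a binary classifier scored against a reference standard. It is not the study's code. The function names and the toy labels are hypothetical, and the sketch assumes each method yields a yes/no flag per operative event, in which case the curve has a single operating point and the area reduces to the mean of sensitivity and specificity. The study itself may have computed the area differently.

```python
from typing import Sequence, Tuple


def confusion_counts(reference: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels (1 = event, 0 = no event)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif ref and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of reference-standard events that the classifier flags."""
    tp, _, _, fn = confusion_counts(reference, predicted)
    if tp + fn == 0:
        raise ValueError("no positive cases in the reference standard")
    return tp / (tp + fn)


def specificity(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of reference-standard non-events that the classifier leaves unflagged."""
    _, fp, tn, _ = confusion_counts(reference, predicted)
    if tn + fp == 0:
        raise ValueError("no negative cases in the reference standard")
    return tn / (tn + fp)


def binary_auc(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """AUC of a yes/no classifier: with one operating point, the trapezoidal
    area under the ROC curve equals (sensitivity + specificity) / 2."""
    return (sensitivity(reference, predicted) + specificity(reference, predicted)) / 2


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the study.
    reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # reference-standard labels
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # classifier flags
    print(sensitivity(reference, predicted))
    print(specificity(reference, predicted))
    print(binary_auc(reference, predicted))
```

Running the same functions once on the natural language processing flags and once on the diagnosis-code flags, both against the same reference labels, would give the paired values that such a comparison reports. Testing whether two areas differ significantly requires a separate statistical procedure that this sketch does not cover.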

Results: To detect pulmonary embolism, EasyCIE-PEDVT had a sensitivity of 0.714 and 0.815 in internal and external validation, respectively. To detect deep vein thrombosis, EasyCIE-PEDVT had a sensitivity of 0.846 and 0.849 in internal and external validation, respectively. EasyCIE-PEDVT had significantly higher discrimination for deep vein thrombosis compared with International Classification of Diseases codes in internal validation (area under the receiver operating characteristic curve: 0.920 vs 0.761; P < .001) and external validation (area under the receiver operating characteristic curve: 0.921 vs 0.794; P < .001). There was no significant difference in the discrimination for pulmonary embolism between EasyCIE-PEDVT and ICD codes.

Conclusion: Accurate surveillance of postoperative venous thromboembolism may be achieved using natural language processing on clinical notes in 2 independent health care systems. These findings suggest natural language processing may augment manual chart abstraction for large registries such as NSQIP.

Publication types

  • Multicenter Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cohort Studies
  • Female
  • Humans
  • Male
  • Middle Aged
  • Natural Language Processing*
  • Postoperative Complications / diagnosis*
  • Quality Improvement*
  • ROC Curve
  • Retrospective Studies
  • Venous Thrombosis / diagnosis*