A 12-hospital prospective evaluation of a clinical decision support prognostic algorithm based on logistic regression as a form of machine learning to facilitate decision making for patients with suspected COVID-19

PLoS One. 2022 Jan 5;17(1):e0262193. doi: 10.1371/journal.pone.0262193. eCollection 2022.

Abstract

Objective: To prospectively evaluate a logistic regression-based machine learning (ML) prognostic algorithm implemented in real time as a clinical decision support (CDS) system for symptomatic persons under investigation (PUI) for Coronavirus disease 2019 (COVID-19) in the emergency department (ED).

Methods: We developed a model in a 12-hospital system using training and validation, followed by a real-time assessment. LASSO-guided feature selection included demographics, comorbidities, home medications, and vital signs. We constructed a logistic regression-based ML algorithm to predict "severe" COVID-19, defined as requiring intensive care unit (ICU) admission or invasive mechanical ventilation, or dying in or out of hospital. Training data included 1,469 adult patients who tested positive for Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) within 14 days of acute care. We performed: 1) temporal validation in 414 SARS-CoV-2 positive patients, 2) validation in a PUI set of 13,271 symptomatic patients tested for SARS-CoV-2 during an acute care visit, and 3) real-time validation in 2,174 ED patients with a PUI test or a positive SARS-CoV-2 result. Subgroup analysis was conducted across race and gender to assess equity in performance.
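As an illustration of how LASSO-guided selection and logistic regression fit together, the sketch below fits an L1-penalized logistic regression by proximal gradient descent on synthetic data. It is not the study's implementation: the abstract does not state the solver, penalty strength, or final feature set, so the function names, the penalty of 0.1, the step size, and the four synthetic features (two informative, two pure noise) are all assumptions chosen for demonstration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, penalty=0.1, step=0.5, n_iter=200):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is left unpenalized. Hyperparameters are illustrative
    assumptions, not values from the study.
    """
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residual = predicted probability minus observed label.
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, y)
        ]
        grad_intercept = sum(residuals) / n
        grad = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        intercept -= step * grad_intercept
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, grad)
        ]
    return intercept, weights


# Synthetic stand-in data (hypothetical): features 0 and 1 drive the outcome,
# features 2 and 3 are noise that the L1 penalty should zero out.
rng = random.Random(0)
X, y = [], []
for _ in range(1000):
    row = [rng.gauss(0.0, 1.0) for _ in range(4)]
    risk = sigmoid(-1.0 + 1.5 * row[0] - 1.5 * row[1])
    X.append(row)
    y.append(1 if rng.random() < risk else 0)

intercept, weights = fit_l1_logistic(X, y)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("coefficients:", [round(w, 3) for w in weights])
print("selected feature indices:", selected)
```

Features whose coefficients are shrunk exactly to zero are dropped; the remaining ones form the candidate set for the final logistic model. In practice the penalty would typically be tuned by cross-validation, and features would be standardized first.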

Results: The algorithm performed well on pre-implementation validations for predicting COVID-19 severity: 1) the temporal validation had an area under the receiver operating characteristic curve (AUROC) of 0.87 (95%-CI: 0.83, 0.91); 2) validation in the PUI population had an AUROC of 0.82 (95%-CI: 0.81, 0.83). The ED CDS system performed well in real time with an AUROC of 0.85 (95%-CI: 0.83, 0.87). No patients in the lowest quintile developed "severe" COVID-19, whereas 33.2% of patients in the highest quintile did. The models performed without significant differences between genders and among race/ethnicities (all p-values > 0.05).
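The two kinds of summary reported here, an AUROC with a 95% confidence interval and the event rate by risk quintile, can be computed as in the sketch below. The abstract does not say how the intervals were derived, so the percentile bootstrap is an assumption, as are the function names and the synthetic scores; none of the study's data are reproduced.

```python
import random


def auroc(scores, labels):
    """AUROC from rank sums: the chance a random positive outscores a random
    negative, with ties counted as one half."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # tied scores share the mean rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    rank_sum = sum(r for r, label in zip(ranks, labels) if label == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(scores, labels, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if 0 < sum(resampled_labels) < n:  # skip single-class resamples
            stats.append(auroc([scores[i] for i in idx], resampled_labels))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def event_rate_by_quintile(scores, labels):
    """Observed event rate in each fifth of patients, lowest risk first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rates = []
    for q in range(5):
        members = order[q * n // 5:(q + 1) * n // 5]
        rates.append(sum(labels[i] for i in members) / len(members))
    return rates


# Synthetic example (hypothetical data): the outcome probability equals the score.
rng = random.Random(1)
demo_scores = [rng.random() ** 2 for _ in range(500)]
demo_labels = [1 if rng.random() < s else 0 for s in demo_scores]

point = auroc(demo_scores, demo_labels)
lower, upper = bootstrap_ci(demo_scores, demo_labels)
print("AUROC:", round(point, 3), "95% CI:", (round(lower, 3), round(upper, 3)))
print("event rate by quintile:", event_rate_by_quintile(demo_scores, demo_labels))
```

A well-separated quintile table (near-zero events in the lowest fifth, a high rate in the top fifth) is what makes a score usable for triage, independent of the AUROC itself.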

Conclusion: An ML-enabled CDS system based on a logistic regression model can be developed, validated, and implemented with high performance across multiple hospitals while remaining equitable and maintaining performance in real-time validation.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • COVID-19 / diagnosis*
  • COVID-19 / physiopathology
  • Decision Support Systems, Clinical*
  • Emergency Service, Hospital
  • Humans
  • Logistic Models*
  • Machine Learning*
  • ROC Curve
  • Severity of Illness Index
  • Triage / methods*