Optimizing a syndromic surveillance text classifier for influenza-like illness: Does document source matter?

AMIA Annu Symp Proc. 2008 Nov 6;2008:692-6.

Authors

Brett R South¹, Wendy W Chapman, Sylvain Delisle, Shuying Shen, Ericka Kalp, Trish Perl, Matthew H Samore, Adi V Gundlapalli

Affiliation

¹ VA Salt Lake City Health Care System, Department of Internal Medicine, University of Utah School of Medicine, USA.

PMID: 18999051
PMCID: PMC2655960

Abstract

Syndromic surveillance systems that incorporate electronic free-text data have primarily focused on extracting concepts of interest from chief complaint text, emergency department visit notes, and nurse triage notes. Because of availability and access constraints, little work has examined surveillance of the full text of all electronic note documents compared with more specific document sources. This study evaluates the performance of a text classifier for detecting influenza-like illness (ILI) across document sources commonly used for biosurveillance, comparing them with routine visit notes and with a full electronic note corpus approach. Evaluating the performance of an automated text classifier for syndromic surveillance by source document will inform decisions about which electronic textual data sources automated biosurveillance systems might use. Even when a full electronic medical record is available, commonly available surveillance source documents provide acceptable statistical performance for automated ILI surveillance.

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms
Artificial Intelligence
Documentation / methods*
Humans
Influenza, Human / diagnosis*
Influenza, Human / epidemiology
Information Storage and Retrieval / methods*
Medical History Taking / statistics & numerical data*
Medical Records Systems, Computerized / statistics & numerical data*
Natural Language Processing*
Pattern Recognition, Automated / methods*
Population Surveillance / methods*
Reproducibility of Results
Sensitivity and Specificity
Syndrome
United States
Vocabulary, Controlled
