Recognizing Questions and Answers in EMR Templates Using Natural Language Processing

Guy Divita; Shuying Shen; Marjorie E Carter; Andrew Redd; Tyler Forbush; Miland Palmer; Matthew H Samore; Adi V Gundlapalli

Stud Health Technol Inform. 2014:202:149-52.

Authors

Guy Divita¹, Shuying Shen¹, Marjorie E Carter¹, Andrew Redd¹, Tyler Forbush², Miland Palmer¹, Matthew H Samore¹, Adi V Gundlapalli¹

Affiliations

¹ VA Salt Lake City Health Care System, Salt Lake City, Utah, USA.
² University of Utah School of Medicine, Salt Lake City, Utah, USA.

PMID: 25000038

Abstract

Templated boilerplate structures pose challenges to natural language processing (NLP) tools used for information extraction (IE). Routine error analyses performed during an IE task on Veterans Affairs (VA) medical records identified templates as an important cause of false positives. The baseline NLP pipeline (V3NLP) was adapted to recognize negation, questions and answers (QA) in various template types by adding a negation and slot:value identification annotator. The system was trained using a corpus of 975 documents developed as a reference standard for extracting psychosocial concepts. Iterative processing with the baseline tool and with baseline+negation+QA showed a reduction in the number of concepts extracted, with a modest increase in true positives in several concept categories. Similar improvement was noted when the adapted V3NLP was used to process a random sample of 318,000 notes. We demonstrate the feasibility of adapting an NLP pipeline to recognize templates.
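
To illustrate why templates produce false positives and how slot:value recognition can help, the following is a minimal sketch in Python. It is not the V3NLP annotator: the regular expressions, the `NEGATIVE_ANSWERS` vocabulary, and the names `SlotValue`, `parse_template_line`, and `extract_asserted_slots` are all hypothetical and chosen only for illustration. The idea is that a concept mentioned in a question or slot (for example "Homeless:") should count as asserted only when its answer is affirmative.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical answer vocabulary (an assumption, not taken from the paper).
NEGATIVE_ANSWERS = {"no", "none", "denies", "negative", "n/a"}

# Hypothetical template patterns: checkboxes and "slot: value" / "question? answer".
CHECKED = re.compile(r"^\[\s*[xX]\s*\]\s*(.+)$")
UNCHECKED = re.compile(r"^\[\s*\]\s*(.+)$")
SLOT_VALUE = re.compile(r"^([^:?]+[:?])\s*(.*)$")


@dataclass
class SlotValue:
    slot: str             # the question or slot text
    value: Optional[str]  # the answer, or None if the slot was left blank
    asserted: bool        # True only when the answer affirms the slot


def parse_template_line(line: str) -> Optional[SlotValue]:
    """Return a SlotValue for a template-like line, or None for free text."""
    text = line.strip()
    m = CHECKED.match(text)
    if m:
        return SlotValue(m.group(1).strip(), "checked", True)
    m = UNCHECKED.match(text)
    if m:
        return SlotValue(m.group(1).strip(), "unchecked", False)
    m = SLOT_VALUE.match(text)
    if m:
        slot = m.group(1).rstrip(":?").strip()
        value = m.group(2).strip()
        if not value:
            # An unanswered question mentions the concept but asserts nothing.
            return SlotValue(slot, None, False)
        return SlotValue(slot, value, value.lower() not in NEGATIVE_ANSWERS)
    return None


def extract_asserted_slots(note: str) -> List[str]:
    """Keep only slots whose answers affirm them; skip free-text lines."""
    parsed = (parse_template_line(line) for line in note.splitlines())
    return [sv.slot for sv in parsed if sv is not None and sv.asserted]
```

A simple keyword matcher would flag "Homeless: No" as a mention of homelessness; under the assumptions above, `extract_asserted_slots` drops it because the answer negates the slot. A real pipeline would still need concept mapping and handling of multi-line or tabular templates, which this sketch does not attempt.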

MeSH terms

Data Mining / methods*
Electronic Health Records / classification*
Electronic Health Records / organization & administration*
Forms and Records Control / methods*
Machine Learning
Natural Language Processing*
Reproducibility of Results
Semantics
Sensitivity and Specificity
Vocabulary, Controlled*