Natural language processing for automated annotation of medication mentions in primary care visit conversations

JAMIA Open. 2021 Aug 19;4(3):ooab071. doi: 10.1093/jamiaopen/ooab071. eCollection 2021 Jul.

Abstract

Objectives: To build and evaluate a natural language processing approach for identifying medication mentions in primary care visit conversations between patients and physicians.

Materials and methods: Eight clinicians contributed to a data set of 85 clinic visit transcripts, and 10 transcripts were randomly selected from this data set as a development set. Our approach uses Apache cTAKES and the Unified Medical Language System controlled vocabulary to generate a list of medication candidates in the transcribed text, and then applies multiple customized filters to exclude common false positives from this list while adding some common mentions of supplements and immunizations.
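The two-stage design described above (generate candidates, then filter and supplement them) can be illustrated with a minimal sketch. It does not call cTAKES or UMLS: the `VOCABULARY`, `FALSE_POSITIVES`, and `EXTRA_MENTIONS` sets and the `find_terms` and `annotate` functions are hypothetical stand-ins chosen for illustration, and the actual filters used in the study are not specified in this abstract.

```python
import re
from typing import List, Set, Tuple

# Hypothetical stand-in for the candidate vocabulary (the study derives
# candidates with Apache cTAKES and UMLS, which this sketch does not use).
VOCABULARY: Set[str] = {"metformin", "lisinopril", "aspirin", "water"}

# Hypothetical terms that are vocabulary hits but rarely medications in talk.
FALSE_POSITIVES: Set[str] = {"water"}

# Hypothetical supplement/immunization phrases added on top of the vocabulary.
EXTRA_MENTIONS: Set[str] = {"flu shot", "vitamin d"}

Span = Tuple[int, int, str]  # (start offset, end offset, lowercased term)


def find_terms(text: str, terms: Set[str]) -> List[Span]:
    """Return a span for every whole-word, case-insensitive match of a term."""
    hits: List[Span] = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0).lower()))
    return hits


def annotate(text: str) -> List[Span]:
    """Generate candidates, drop known false positives, add extra mentions."""
    candidates = find_terms(text, VOCABULARY)
    kept = [span for span in candidates if span[2] not in FALSE_POSITIVES]
    added = find_terms(text, EXTRA_MENTIONS)
    # De-duplicate and order by position in the transcript.
    return sorted(set(kept + added))
```

For a sentence such as "Keep taking metformin, drink water, and get your flu shot.", this sketch keeps "metformin", discards "water", and adds "flu shot".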

Results: Sixty-five transcripts with 1121 medication mentions were randomly selected as an evaluation set. Our proposed method achieved an F-score of 85.0% for identifying the medication mentions in the evaluation set, significantly outperforming existing medication information extraction systems for medical records, which achieved F-scores ranging from 42.9% to 68.9% on the same set.
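For readers unfamiliar with the metric, the F-score is conventionally the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant, which the abstract does not state explicitly; the function name `f_score` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-score (F1) from mention-level counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = 2 * precision * recall / (precision + recall).
    Returns 0.0 when the score is undefined (no true positives).
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 8 correct mentions, 2 spurious mentions, and 2 missed mentions, precision and recall are both 0.8, so the F-score is also 0.8.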

Discussion: Our medication information extraction approach for primary care visit conversations showed promising results, extracting about 27% more medication mentions from our evaluation set while eliminating many false positives in comparison to existing baseline systems. We made our approach publicly available on the web as open-source software.

Conclusion: Integration of our annotation system with clinical recording applications has the potential to improve patients' understanding and recall of key information from their clinic visits, and, in turn, to positively impact health outcomes.

Keywords: clinic visit recording; medication information extraction; natural language processing.