Food entries in a large allergy data repository

J Am Med Inform Assoc. 2016 Apr;23(e1):e79-87. doi: 10.1093/jamia/ocv128. Epub 2015 Sep 17.

Abstract

Objective: Accurate food adverse sensitivity documentation in electronic health records (EHRs) is crucial to patient safety. This study examined, encoded, and grouped foods that caused any adverse sensitivity in a large allergy repository using natural language processing and standard terminologies.

Methods: Using the Medical Text Extraction, Reasoning, and Mapping System (MTERMS), we processed both structured and free-text entries stored in an enterprise-wide allergy repository (Partners' Enterprise-wide Allergy Repository), normalized diverse food allergen terms into concepts, and encoded these concepts using the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT) and Unique Ingredient Identifiers (UNII) terminologies. We also assessed concept coverage for these two terminologies. We further categorized allergen concepts into groups and calculated the frequencies of these concepts by group. Finally, we conducted an external validation of MTERMS's performance in identifying food allergen terms, using a randomized sample from a different institution.
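The normalize, group, and count steps described above can be illustrated with a minimal Python sketch. It is not MTERMS and does not reproduce the authors' method: the `LEXICON` table, the `normalize` and `group_frequencies` helpers, and the sample entries are all hypothetical, and real free-text entries would need far richer handling (misspellings, negation, multi-word phrases, terminology codes).

```python
from collections import Counter

# Hypothetical lexicon (an assumption, not from the study):
# normalized surface term -> (concept, group).
LEXICON = {
    "shrimp": ("shrimp", "shellfish"),
    "shrimps": ("shrimp", "shellfish"),
    "peanut": ("peanut", "peanuts"),
    "peanuts": ("peanut", "peanuts"),
    "milk": ("milk", "dairy"),
    "nuts": ("nuts", "ambiguous"),
}


def normalize(entry: str) -> str:
    """Lowercase the entry and collapse runs of whitespace."""
    return " ".join(entry.lower().split())


def group_frequencies(entries):
    """Return the share of records falling into each allergen group.

    Entries whose normalized form is missing from LEXICON are counted
    under the group "unmapped".
    """
    counts = Counter()
    for entry in entries:
        match = LEXICON.get(normalize(entry))
        group = match[1] if match else "unmapped"
        counts[group] += 1
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}


# Made-up sample records, for illustration only.
sample_entries = ["Shrimp", "  peanuts ", "MILK", "shrimps", "nuts", "kiwi"]
frequencies = group_frequencies(sample_entries)
```

Counting by group rather than by raw term is what lets variant spellings of one allergen contribute to a single frequency, which is the point of normalizing terms into concepts first.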

Results: We identified 158 552 food allergen records (2140 unique terms) in the Partners repository, corresponding to 672 food allergen concepts. High-frequency groups included shellfish (19.3%), fruits or vegetables (18.4%), dairy (9.0%), peanuts (8.5%), tree nuts (8.5%), eggs (6.0%), grains (5.1%), and additives (4.7%). Ambiguous, generic concepts such as "nuts" and "seafood" accounted for 8.8% of the records. SNOMED-CT covered more concepts than UNII in terms of exact (81.7% vs 68.0%) and partial (14.3% vs 9.7%) matches.
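As a reading aid for the coverage figures, the short Python sketch below adds the exact and partial match rates for each terminology and derives the remainder. It assumes that exact and partial matches are mutually exclusive and that both percentages share the same denominator; the abstract does not state this explicitly, so the derived totals are illustrative rather than reported results. The names `COVERAGE` and `summarize` are hypothetical.

```python
# Exact and partial match rates (percent) as reported in the Results.
COVERAGE = {
    "SNOMED-CT": {"exact": 81.7, "partial": 14.3},
    "UNII": {"exact": 68.0, "partial": 9.7},
}


def summarize(rates):
    """Return (combined, unmatched) percentages for one terminology.

    Assumption: exact and partial matches do not overlap, so they can
    be summed and subtracted from 100.
    """
    combined = round(rates["exact"] + rates["partial"], 1)
    unmatched = round(100.0 - combined, 1)
    return combined, unmatched


summary = {name: summarize(rates) for name, rates in COVERAGE.items()}

for name, (combined, unmatched) in summary.items():
    print(f"{name}: {combined}% exact or partial, {unmatched}% unmatched")
```

Under that assumption, the unmatched share for each terminology is one way to size the coverage gaps noted in the Discussion.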

Discussion: Adverse sensitivities to food are diverse, and existing standard terminologies have gaps in their coverage of the breadth of allergy concepts.

Conclusion: New strategies are needed to represent and standardize food adverse sensitivity concepts, to improve documentation in EHRs.

Keywords: allergy and immunology; electronic health records; food hypersensitivity; natural language processing; systematized nomenclature of medicine; vocabulary, controlled.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Allergens
  • Databases as Topic*
  • Food Hypersensitivity*
  • Humans
  • Natural Language Processing
  • Systematized Nomenclature of Medicine
  • Terminology as Topic*
  • Vocabulary, Controlled

Substances

  • Allergens