Neural network based integration of assays to assess pathogenic potential

Mohammed Eslami; Yi-Pei Chen; Ainsley C Nicholson; Mark Weston; Melissa Bell; John R McQuiston; James Samuel; Erin J van Schaik; Paul de Figueiredo

doi:10.1038/s41598-023-32950-5

Sci Rep. 2023 Apr 13;13(1):6021. doi: 10.1038/s41598-023-32950-5.

Authors

Mohammed Eslami¹, Yi-Pei Chen², Ainsley C Nicholson³, Mark Weston², Melissa Bell³, John R McQuiston³, James Samuel⁴, Erin J van Schaik⁴, Paul de Figueiredo⁵,⁶

Affiliations

¹ Netrias, LLC, 1162 Gateway Drive, Annapolis, MD, 21409, USA. meslami@netrias.com.
² Netrias, LLC, 1162 Gateway Drive, Annapolis, MD, 21409, USA.
³ Special Bacteriology Reference Laboratory, Bacterial Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, Centers for Disease Control and Prevention, Atlanta, GA, 30333, USA.
⁴ Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, Bryan, TX, 77807, USA.
⁵ Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, Bryan, TX, 77807, USA. pjdefigueiredo@tamu.edu.
⁶ Department of Veterinary Pathobiology, Texas A&M University, College Station, TX, 77843, USA. pjdefigueiredo@tamu.edu.

Abstract

Limited data significantly hinder our ability to assess the biothreat posed by novel bacterial strains. Integrating data from additional sources that provide context about a strain can address this challenge. Datasets from different sources, however, are each generated with a specific objective, which makes integration challenging. Here, we developed a deep learning-based approach called the neural network embedding model (NNEM) that integrates data from conventional assays designed to classify species with new assays that interrogate hallmarks of pathogenicity for biothreat assessment. We used a dataset of metabolic characteristics from a de-identified set of known bacterial strains that the Special Bacteriology Reference Laboratory (SBRL) of the Centers for Disease Control and Prevention (CDC) has curated for use in species identification. The NNEM transformed results from SBRL assays into vectors to supplement unrelated pathogenicity assays from de-identified microbes. This enrichment resulted in a significant improvement of 9% in the accuracy of biothreat assessment. Importantly, the dataset used in our analysis is large but noisy. Therefore, the performance of our system is expected to improve as additional types of pathogenicity assays are developed and deployed. The proposed NNEM strategy thus provides a generalizable framework for enriching datasets with previously collected assays indicative of species.
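
The abstract does not specify the NNEM architecture, so the following is only a toy sketch of the general idea it describes: results from species-identification assays are turned into vectors that are appended to pathogenicity-assay features. Everything here is an assumption made for illustration, including the synthetic data, the one-hidden-layer network, the embedding size, and the hypothetical helper names `train_embedder`, `embed`, and `enrich`. It is not the authors' implementation.

```python
import numpy as np

def train_embedder(X, y, n_classes, dim=4, epochs=200, lr=0.5, seed=0):
    """Fit a one-hidden-layer softmax classifier with full-batch gradient
    descent. Returns the hidden-layer weights and the loss history."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.1, (X.shape[1], dim))
    b1 = np.zeros(dim)
    W2 = rng.normal(0, 0.1, (dim, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                      # one-hot species labels
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                  # hidden layer = embedding
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        losses.append(-np.mean(np.log(P[np.arange(len(y)), y] + 1e-12)))
        d_logits = (P - Y) / len(y)               # softmax cross-entropy gradient
        dW2, db2 = H.T @ d_logits, d_logits.sum(axis=0)
        dZ = (d_logits @ W2.T) * (1 - H ** 2)     # backprop through tanh
        dW1, db1 = X.T @ dZ, dZ.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1), losses

def embed(params, X):
    """Map species-assay results to embedding vectors."""
    W1, b1 = params
    return np.tanh(X @ W1 + b1)

def enrich(pathogenicity, species_assays, params):
    """Append species-assay embeddings to pathogenicity-assay features."""
    return np.hstack([pathogenicity, embed(params, species_assays)])

# Synthetic stand-in data (assumed): 3 species, 6 binary metabolic assays,
# with 10% of assay results flipped to mimic noise.
rng = np.random.default_rng(0)
profiles = np.array([[1, 1, 0, 0, 1, 0],
                     [0, 1, 1, 0, 0, 1],
                     [1, 0, 0, 1, 1, 1]])
y = rng.integers(0, 3, size=120)
flips = rng.random((120, 6)) < 0.1
X = np.where(flips, 1 - profiles[y], profiles[y]).astype(float)

params, losses = train_embedder(X, y, n_classes=3)

# Two made-up pathogenicity readouts per strain, enriched with the embedding.
pathogenicity = rng.normal(size=(120, 2))
enriched = enrich(pathogenicity, X, params)
print(enriched.shape)
```

In this sketch the enriched matrix would then feed a downstream biothreat classifier; that step is omitted because the abstract gives no details about it.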

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Bacteria*
Neural Networks, Computer*
United States