Deep neural networks with knockoff features identify nonlinear causal relations and estimate effect sizes in complex biological systems

Zhenjiang Fan; Kate F Kernan; Aditya Sriram; Panayiotis V Benos; Scott W Canna; Joseph A Carcillo; Soyeon Kim; Hyun Jung Park

doi:10.1093/gigascience/giad044

Gigascience. 2022 Dec 28;12:giad044. doi: 10.1093/gigascience/giad044. Epub 2023 Jul 3.

Authors

Zhenjiang Fan¹, Kate F Kernan², Aditya Sriram³, Panayiotis V Benos⁴, Scott W Canna⁵, Joseph A Carcillo², Soyeon Kim⁶,⁷, Hyun Jung Park³

Affiliations

¹ Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15213, USA.
² Division of Pediatric Critical Care Medicine, Department of Critical Care Medicine, Children's Hospital of Pittsburgh, Center for Critical Care Nephrology and Clinical Research Investigation and Systems Modeling of Acute Illness Center, University of Pittsburgh, Pittsburgh, PA 15260, USA.
³ Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA 15213, USA.
⁴ Department of Epidemiology, University of Florida, Gainesville, FL 32610, USA.
⁵ Pediatric Rheumatology, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
⁶ Division of Pediatric Pulmonary Medicine, Children's Hospital of Pittsburgh, Pittsburgh, PA 15224, USA.
⁷ Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15224, USA.

Abstract

Background: Learning the causal structure helps identify risk factors, disease mechanisms, and candidate therapeutics for complex diseases. However, although complex biological systems are characterized by nonlinear associations, existing bioinformatic methods of causal inference cannot identify nonlinear relationships or estimate their effect sizes.

Results: To overcome these limitations, we developed the first computational method that explicitly learns nonlinear causal relations and estimates their effect sizes using a deep neural network approach coupled with the knockoff framework, named causal directed acyclic graphs using deep learning variable selection (DAG-deepVASE). Using simulation data from diverse scenarios, and by identifying known and novel causal relations in molecular and clinical data from various diseases, we demonstrated that DAG-deepVASE consistently outperforms existing methods in identifying true and known causal relations. In these analyses, we also illustrate how identifying nonlinear causal relations and estimating their effect sizes helps in understanding complex disease pathobiology, which is not possible using other methods.

Conclusions: With these advantages, the application of DAG-deepVASE can help identify driver genes and therapeutic agents in biomedical studies and clinical trials.

Keywords: causal inference; deep neural networks; effect size estimation.
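The abstract names the knockoff framework but does not describe how it is applied. As a rough illustration only, the following sketch shows the generic knockoff selection step: each feature's importance is compared with that of its synthetic knockoff copy, and a data-dependent threshold (the standard knockoff+ rule) is chosen to control the false discovery rate at a target level q. The function names, the input scores, and the choice of statistic W_j = |Z_j| - |Z~_j| are assumptions made for this example; this is not the DAG-deepVASE implementation, in which importance scores would presumably come from a trained deep neural network.

```python
def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for statistics w at target FDR level q.

    The threshold is the smallest t > 0 among the observed |w_j| such that
    (1 + #{j: w_j <= -t}) / max(1, #{j: w_j >= t}) <= q.
    Returns infinity when no such t exists (nothing is selected).
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select_features(importance_original, importance_knockoff, q=0.2):
    """Select feature indices whose knockoff statistic clears the threshold.

    importance_original[j] and importance_knockoff[j] are hypothetical
    importance scores for feature j and for its knockoff copy.
    """
    # A large positive W_j means the real feature mattered far more
    # than its knockoff, which is evidence of a genuine association.
    w = [abs(a) - abs(b)
         for a, b in zip(importance_original, importance_knockoff)]
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Made-up scores for seven features and their knockoffs.
    original = [5, 4, 3, 2, 2, 0, 1]
    knockoff = [0, 0, 0, 0, 0, 1, 0]
    print(select_features(original, knockoff, q=0.25))
```

With these made-up scores, the first five features are retained; the sixth, whose knockoff scored higher than the original, counts against the selection and raises the threshold.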

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Causality
Computer Simulation
Neural Networks, Computer*
