A unified powerful set-based test for sequencing data analysis of GxE interactions

Biostatistics. 2017 Jan;18(1):119-131. doi: 10.1093/biostatistics/kxw034. Epub 2016 Jul 28.

Abstract

The development of next-generation sequencing technologies has allowed researchers to comprehensively study the contribution of genetic variation, particularly rare variants, to complex diseases. To date, many sequencing analyses of rare variants have focused on marginal genetic effects and have not explored the potential role environmental factors play in modifying genetic risk. Analysis of gene-environment interaction (GxE) for rare variants poses considerable challenges because of variant rarity and the paucity of subjects who carry the variants while being exposed. To tackle this challenge, we propose a hierarchical model to jointly assess the GxE effects of a set of rare variants, for example, in a gene or regulatory region, leveraging the information across the variants. Under this model, GxE is modeled by two components. The first component incorporates variant functional information as weights to calculate the weighted burden of variant alleles across variants, and then assesses the interaction of this burden with the environmental factor. Because this information is known a priori, this component enters the model as fixed effects. The second component involves residual GxE effects that have not been accounted for by the fixed effects. In this component, the residual GxE effects are postulated to follow an unspecified distribution with mean 0 and variance [Formula: see text]. We develop a novel testing procedure by deriving two independent score statistics, one for the fixed effects and one for the variance component. We propose two data-adaptive approaches for combining these two score statistics and establish their asymptotic distributions. An extensive simulation study shows that the proposed approaches maintain the correct type I error and that their power is comparable to or better than that of existing methods under a wide range of scenarios. Finally, we illustrate the proposed methods with an exome-wide GxE analysis with NSAIDs use in colorectal cancer.
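
The abstract does not say which combination rules the authors use, so the sketch below substitutes a standard technique, Fisher's method, purely to illustrate why the independence of the two score statistics is useful: when two p-values are independent, a combined p-value has a simple closed form. The function name `fisher_combine` and its arguments are hypothetical and do not come from the paper or from any published software.

```python
import math


def fisher_combine(p_fixed, p_variance):
    """Combine two independent p-values with Fisher's method.

    Illustration only: this is NOT claimed to be either of the two
    data-adaptive combinations proposed in the paper.

    Fisher's statistic is T = -2 * (ln p1 + ln p2), which follows a
    chi-square distribution with 4 degrees of freedom under the null
    when the two p-values are independent. For 4 degrees of freedom
    the survival function is exp(-T/2) * (1 + T/2), so the combined
    p-value reduces to p1 * p2 * (1 - ln(p1 * p2)).
    """
    for p in (p_fixed, p_variance):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in the interval (0, 1]")
    # Work on the log scale so very small p-values do not underflow
    # before the logarithm is taken.
    log_product = math.log(p_fixed) + math.log(p_variance)
    return math.exp(log_product) * (1.0 - log_product)


if __name__ == "__main__":
    # Hypothetical p-values for the burden (fixed-effects) test and
    # the variance-component test of one variant set.
    print(fisher_combine(0.05, 0.05))
```

A rule of this kind is valid only because the two statistics are independent; with correlated statistics the chi-square reference distribution would no longer hold.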

Keywords: Burden and variance component tests; Colorectal cancer; Kernel machine; Rare genetic variants; Score test.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene-Environment Interaction*
  • Humans
  • Models, Genetic*
  • Models, Statistical*
  • Sequence Analysis, DNA / statistics & numerical data*