A stochastic second-order generalized estimating equations approach for estimating association parameters

J Comput Graph Stat. 2020;29(3):547-561. doi: 10.1080/10618600.2019.1710156. Epub 2020 Feb 7.

Abstract

Design and analysis of cluster randomized trials must take into account the intraclass correlation coefficient (ICC), which quantifies the correlation among outcomes from the same cluster. Second-order generalized estimating equations (GEE2) provide a statistically robust way to estimate this quantity and other association parameters. However, GEE2 becomes computationally infeasible as cluster sizes grow. This paper proposes a stochastic variant of GEE2 fitting that alleviates reliance on parameter starting values and provides substantially faster speeds and higher convergence rates than the widely used deterministic Newton-Raphson method. We also propose new GEE2-based estimators of the ICC that account for informative missing outcome data by incorporating a "second-order" inverse probability weighting scheme and "second-order" doubly robust (DR) estimating equations that guard against partial model misspecification. Our proposed methods are evaluated through simulations and applied to data from a cluster randomized trial in Bangladesh evaluating the effect of different marketing interventions on the use of hygienic latrines.

Keywords: Clustered data; GEE2; Robbins-Monro; doubly robust.
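
The Robbins-Monro idea named in the keywords can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a simple exchangeable-correlation moment equation, E[z_j z_k - rho] = 0 for standardized outcomes z_j and z_k from the same cluster, and solves it by sampling one within-cluster pair per iteration instead of summing over all pairs. The function names (simulate_clusters, robbins_monro_icc), the simulated data, and the step size 1/t are all assumptions made for illustration.

```python
import numpy as np

def simulate_clusters(n_clusters, cluster_size, icc, rng):
    """Draw unit-variance outcomes that share a random intercept within each cluster."""
    b = rng.normal(0.0, np.sqrt(icc), size=(n_clusters, 1))
    e = rng.normal(0.0, np.sqrt(1.0 - icc), size=(n_clusters, cluster_size))
    return b + e

def robbins_monro_icc(y, n_iter, rng, rho0=0.0):
    """Solve E[z_j * z_k - rho] = 0 by stochastic approximation (toy sketch)."""
    z = (y - y.mean()) / y.std()   # standardize with pooled mean and SD
    n_clusters, m = z.shape
    rho = rho0
    for t in range(1, n_iter + 1):
        i = rng.integers(n_clusters)                 # sample one cluster
        j, k = rng.choice(m, size=2, replace=False)  # sample one distinct pair
        g = z[i, j] * z[i, k] - rho                  # noisy estimating function
        rho += g / t                                 # Robbins-Monro step, a_t = 1/t
    return rho

rng = np.random.default_rng(0)
y = simulate_clusters(n_clusters=200, cluster_size=50, icc=0.2, rng=rng)
rho_hat = robbins_monro_icc(y, n_iter=20000, rng=rng)
print(f"stochastic ICC estimate: {rho_hat:.3f}")
```

Each iteration touches a single pair, so the per-step cost does not grow with cluster size, whereas a deterministic solver evaluates all m(m-1)/2 pairs per cluster at every step. With the step size 1/t the first update overwrites rho0, so this toy version does not depend on the starting value.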