Two-sample test for correlated data under outcome-dependent sampling with an application to self-reported weight loss data

Stat Med. 2019 Nov 10;38(25):4999-5009. doi: 10.1002/sim.8346. Epub 2019 Sep 5.

Abstract

Standard two-sample tests such as the t-test and the Wilcoxon rank sum test may yield incorrect type I error rates when applied to longitudinal or clustered data. Recent alternative two-sample tests for clustered data often require assumptions about the correlation structure and/or noninformative cluster size. In this paper, based on a novel pseudolikelihood for correlated data, we propose a score test that requires neither knowledge of the correlation structure nor the assumption that data are missing at random. The proposed score test can capture differences in the mean and variance between two groups simultaneously. We use projection theory to derive the limiting distribution of the test statistic, in which the covariance matrix can be empirically estimated. We conduct simulation studies to evaluate the proposed test and compare it with existing methods. To illustrate the usefulness of the proposed test, we use it to compare self-reported weight loss data from a friends' referral group with data from the Internet self-joining group.
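The first point, that a standard two-sample test can have a distorted type I error rate on clustered data, can be illustrated with a small simulation. The sketch below is not the proposed score test and is not taken from the paper. It assumes a simple random-intercept model, and the function names, cluster counts, cluster size, and intraclass correlation are all hypothetical choices made for illustration. Both groups are drawn from the same distribution, so every rejection is a false positive. The naive statistic is a Welch-type t statistic compared with the normal critical value 1.96, which is a large-sample approximation.

```python
import math
import random
import statistics


def simulate_clustered_group(n_clusters, cluster_size, icc, rng):
    """Draw one group under the null: marginal mean 0 and variance 1.

    Observations in the same cluster share a random intercept, so they
    are correlated with intraclass correlation `icc` (assumed model).
    """
    between_sd = math.sqrt(icc)
    within_sd = math.sqrt(1.0 - icc)
    values = []
    for _ in range(n_clusters):
        intercept = rng.gauss(0.0, between_sd)
        values.extend(
            intercept + rng.gauss(0.0, within_sd) for _ in range(cluster_size)
        )
    return values


def naive_t_statistic(x, y):
    """Welch-type t statistic that treats every observation as independent."""
    se = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.fmean(x) - statistics.fmean(y)) / se


def naive_rejection_rate(n_sims=500, n_clusters=20, cluster_size=10,
                         icc=0.5, seed=0):
    """Fraction of null simulations in which the naive test rejects at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x = simulate_clustered_group(n_clusters, cluster_size, icc, rng)
        y = simulate_clustered_group(n_clusters, cluster_size, icc, rng)
        # 1.96 is the two-sided 5% normal critical value (approximation).
        if abs(naive_t_statistic(x, y)) > 1.96:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("correlated clusters :", naive_rejection_rate(icc=0.5))
    print("independent data    :", naive_rejection_rate(icc=0.0))
```

With within-cluster correlation the empirical rejection rate should sit well above the nominal 5% level, while with `icc=0.0` it should stay close to it. The exact figures depend on the seed and on the assumed settings.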

Keywords: U-statistics; correlated data; outcome-dependent sampling; pseudolikelihood; two-sample test.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biometry / methods*
  • Cluster Analysis
  • Computer Simulation
  • Humans
  • Internet
  • Longitudinal Studies
  • Self Report*
  • Weight Loss*