Testing the missing at random assumption in generalized linear models in the presence of instrumental variables

Scand Stat Theory Appl. 2024 Mar;51(1):334-354. doi: 10.1111/sjos.12685. Epub 2023 Aug 7.

Abstract

Missing data are common in practice, and many methods have been developed to address the validity and/or efficiency of statistical procedures in their presence. A central and longstanding interest is the mechanism governing missingness: correctly identifying that mechanism is crucial for sound practical analysis. In this paper, we present a new hypothesis testing approach for deciding between the conventional notions of missing at random and missing not at random in generalized linear models in the presence of instrumental variables. The foundational idea is to construct discrepancy measures between estimators whose properties differ significantly only when missing at random does not hold. We show that our testing approach provides an objective, data-driven choice between missing at random and missing not at random. We demonstrate the feasibility, validity, and efficacy of the new test through theoretical analysis, simulation studies, and a real data analysis.

Keywords: Hausman test; hypothesis testing; influence function; instrumental variable; missing not at random; semiparametric inference.
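
The abstract describes a test built on the discrepancy between two estimators that agree under missing at random and diverge otherwise, and the keywords mention the Hausman test. As a rough illustration of that general idea only, the sketch below computes a textbook Hausman-type contrast for a single scalar coefficient. It is not the paper's test statistic: the function name `hausman_scalar`, the choice of a scalar parameter, the assumption that the variance of the difference equals the difference of the variances, and the numeric inputs are all assumptions made for this example.

```python
import math


def hausman_scalar(est_robust, var_robust, est_efficient, var_efficient):
    """Textbook Hausman-type contrast for one scalar parameter.

    Assumptions (illustrative, not taken from the paper):
    - est_efficient is consistent and efficient under the null (here, MAR);
    - est_robust is consistent under both the null and the alternative;
    - under the null, Var(est_robust - est_efficient) equals
      var_robust - var_efficient.
    Returns the statistic and its asymptotic chi-square(1) p-value.
    """
    var_diff = var_robust - var_efficient
    if var_diff <= 0:
        raise ValueError("variance difference must be positive")
    stat = (est_robust - est_efficient) ** 2 / var_diff
    # For one degree of freedom, P(chi2_1 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical inputs chosen only to exercise the function.
stat, p_value = hausman_scalar(1.30, 0.04, 1.00, 0.03)
print(f"statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

A large statistic (small p-value) signals that the two estimators disagree by more than sampling noise would explain, which in this setting would count as evidence against missing at random. A vector-valued version would replace the squared difference with a quadratic form in the inverse of the covariance difference.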