Causal inference with noisy data: Bias analysis and estimation approaches to simultaneously addressing missingness and misclassification in binary outcomes

Stat Med. 2020 Feb 20;39(4):456-468. doi: 10.1002/sim.8419. Epub 2019 Dec 5.

Abstract

Causal inference is widely used across many fields, and numerous methods have been proposed for different settings. However, these methods often break down when applied to noisy data that contain both mismeasured and missing observations. In this paper, we consider the setting in which binary outcomes are subject to both missingness and misclassification, and interest lies in estimating the average treatment effect (ATE). We examine the asymptotic biases caused by ignoring missingness and/or misclassification and establish the intrinsic connections between the effects of missingness and the effects of misclassification on estimation of the ATE. We develop valid weighted estimation methods that simultaneously correct for both effects. To protect against model misspecification, we further propose a doubly robust correction method that yields consistent estimators when either the treatment model or the outcome model, but not both, is misspecified. Simulation studies assess the performance of the proposed methods, and an application to smoking cessation data illustrates their use.

Keywords: causal inference; double robustness; inverse probability weighting; misclassification; missing data.
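
The idea of a weighted correction summarized in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator. It is a minimal textbook-style illustration under strong assumptions that the abstract does not state: the propensity score, the probability of observing the outcome, and the sensitivity and specificity of the recorded outcome are all treated as known, misclassification is non-differential, and missingness depends only on the covariate and the treatment. All function names, variable names, and parameter values are hypothetical. Under these assumptions, misclassification attenuates the ATE by the factor (sensitivity + specificity - 1), so the sketch reweights complete cases by the inverse probability of treatment and of being observed, then divides by that factor.

```python
import math
import random


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, sens, spec, seed=0):
    """Simulate (t, y, y_star, r, ps, p_obs) tuples; every model here is an assumption."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # single confounder
        ps = expit(0.5 * x)                      # true propensity P(T=1 | X)
        t = int(rng.random() < ps)
        y = int(rng.random() < expit(-0.5 + t + 0.8 * x))   # true binary outcome
        # Non-differential misclassification: the recorded outcome depends only on y.
        p_report = sens if y == 1 else 1.0 - spec
        y_star = int(rng.random() < p_report)
        # Missingness depends on X and T only (missing at random, by assumption).
        p_obs = expit(1.0 + 0.5 * x - 0.5 * t)
        r = int(rng.random() < p_obs)            # r = 1 means the outcome is observed
        data.append((t, y, y_star, r, ps, p_obs))
    return data


def treatment_weight(t, ps):
    """Inverse probability of the treatment actually received."""
    return 1.0 / ps if t == 1 else 1.0 / (1.0 - ps)


def weighted_ate(records):
    """Difference of normalized weighted means; records are (t, outcome, weight)."""
    num = [0.0, 0.0]
    den = [0.0, 0.0]
    for t, outcome, w in records:
        num[t] += w * outcome
        den[t] += w
    return num[1] / den[1] - num[0] / den[0]


sens, spec = 0.9, 0.8   # assumed known misclassification parameters
data = simulate(100_000, sens, spec, seed=1)

# Oracle: true outcomes, no missingness (unavailable in practice).
oracle = weighted_ate(
    [(t, y, treatment_weight(t, ps)) for t, y, _, _, ps, _ in data]
)

# Naive: complete cases only, misclassified outcome, treatment weights only.
naive = weighted_ate(
    [(t, ys, treatment_weight(t, ps)) for t, _, ys, r, ps, _ in data if r == 1]
)

# Corrected: also weight by 1 / P(observed), then undo the attenuation
# caused by misclassification.
reweighted = weighted_ate(
    [(t, ys, treatment_weight(t, ps) / p_obs)
     for t, _, ys, r, ps, p_obs in data if r == 1]
)
corrected = reweighted / (sens + spec - 1.0)

print(f"oracle={oracle:.3f}  naive={naive:.3f}  corrected={corrected:.3f}")
```

In this toy setting the naive estimate is pulled toward zero, while the corrected estimate should land close to the oracle value. In practice the weights and the misclassification probabilities would have to be estimated, for example from fitted models or validation data, and that is where protection against model misspecification becomes relevant.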

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias
  • Causality
  • Computer Simulation
  • Humans
  • Models, Statistical*
  • Models, Theoretical*