Evaluating sensitivity to classification uncertainty in latent subgroup effect analyses

Wen Wei Loh; Jee-Seon Kim

doi:10.1186/s12874-022-01720-8

BMC Med Res Methodol. 2022 Sep 24;22(1):247. doi: 10.1186/s12874-022-01720-8.

Authors

Wen Wei Loh¹ ², Jee-Seon Kim³

Affiliations

¹ Department of Data Analysis, Ghent University, Gent, Belgium. wen.wei.loh@emory.edu.
² Department of Quantitative Theory and Methods, Emory University, Atlanta, GA, USA. wen.wei.loh@emory.edu.
³ Department of Educational Psychology, University of Wisconsin-Madison, Madison, Wisconsin, USA.

Abstract

Background: Increasing attention is being given to assessing treatment effect heterogeneity among individuals belonging to qualitatively different latent subgroups. Inference routinely proceeds by first partitioning the individuals into subgroups, then estimating the subgroup-specific average treatment effects. However, because the subgroups are only latently associated with the observed variables, the actual individual subgroup memberships are rarely known with certainty in practice and thus have to be imputed. Ignoring the uncertainty in the imputed memberships leaves misclassification errors unaccounted for, potentially leading to biased results and incorrect conclusions.

Methods: We propose a strategy for assessing the sensitivity of inference to classification uncertainty when using such classify-analyze approaches for subgroup effect analyses. We exploit each individual's typically nonzero predictive or posterior subgroup membership probabilities to gauge the stability of the resulting subgroup-specific average causal effect estimates over different, carefully selected subsets of the individuals. Because the membership probabilities are subject to sampling variability, we propose Monte Carlo confidence intervals that explicitly acknowledge the imprecision in the estimated subgroup memberships via perturbations using a parametric bootstrap. The proposal is widely applicable and avoids the stringent causal or structural assumptions that existing bias-adjustment or bias-correction methods rely on.
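The general idea in the Methods paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the subsets are selected or how the parametric bootstrap is set up, so the choices below are assumptions made only for illustration. The function names (`subgroup_effect`, `sensitivity_curve`, `perturbed_interval`) are hypothetical, subsets are formed by thresholding the posterior probability of the assigned subgroup, the effect estimator is a plain difference in means, and the perturbation step simply redraws memberships from the given posterior probabilities as a simplified stand-in for a full parametric bootstrap.

```python
import numpy as np

def subgroup_effect(y, a, members):
    """Difference in mean outcomes (treated minus control) among `members`."""
    treated = members & (a == 1)
    control = members & (a == 0)
    if treated.sum() == 0 or control.sum() == 0:
        return np.nan  # effect is undefined without both arms
    return y[treated].mean() - y[control].mean()

def sensitivity_curve(y, a, post, k, thresholds):
    """Effect in subgroup k among individuals modally assigned to k whose
    posterior probability of k is at least each threshold (assumed rule)."""
    modal = post.argmax(axis=1)
    return np.array([
        subgroup_effect(y, a, (modal == k) & (post[:, k] >= t))
        for t in thresholds
    ])

def perturbed_interval(y, a, post, k, n_draws=500, level=0.95, seed=0):
    """Percentile interval for the subgroup-k effect when memberships are
    redrawn from the posterior probabilities (simplified perturbation)."""
    rng = np.random.default_rng(seed)
    cum = post.cumsum(axis=1)  # row-wise CDF over subgroups
    last = post.shape[1] - 1
    draws = []
    for _ in range(n_draws):
        u = rng.random(len(y))
        # inverse-CDF sampling: count how many CDF steps u exceeds
        z = np.minimum((u[:, None] > cum).sum(axis=1), last)
        draws.append(subgroup_effect(y, a, z == k))
    tail = (1 - level) / 2
    lower, upper = np.nanquantile(draws, [tail, 1 - tail])
    return lower, upper
```

Here `post` is an n-by-K array of membership probabilities with rows summing to one, `y` is the outcome, and `a` is a 0/1 treatment indicator. A curve from `sensitivity_curve` that stays roughly flat as the threshold rises would suggest that the estimate is not driven by individuals with ambiguous memberships.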

Results: Using two different publicly available real-world datasets, we illustrate how the proposed strategy supplements existing latent subgroup effect analyses to shed light on the potential impact of classification uncertainty on inference. First, individuals are partitioned into latent subgroups based on their medical and health history. Then, within each fixed latent subgroup, the average treatment effect is assessed using an augmented inverse propensity score weighted estimator. Finally, applying the proposed sensitivity analysis reveals different subgroup-specific effects that are mostly insensitive to potential misclassification.
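For readers unfamiliar with the estimator named above, the standard augmented inverse propensity weighted (AIPW) form can be sketched as follows. This is a generic textbook version, not the authors' code: the function name `aipw_ate` is hypothetical, and the nuisance inputs (fitted outcome predictions `m1`, `m0` and propensity scores `e`) are assumed to come from models fitted elsewhere, within one subgroup.

```python
import numpy as np

def aipw_ate(y, a, m1, m0, e):
    """AIPW estimate of the average treatment effect and its standard error.

    y  : observed outcomes
    a  : 0/1 treatment indicator
    m1 : predicted outcome under treatment for every individual
    m0 : predicted outcome under control for every individual
    e  : estimated propensity scores P(A = 1 | X), strictly between 0 and 1
    """
    y, a, m1, m0, e = map(np.asarray, (y, a, m1, m0, e))
    # outcome-model contrast plus inverse-probability-weighted residuals
    phi = (m1 - m0
           + a * (y - m1) / e
           - (1 - a) * (y - m0) / (1 - e))
    estimate = phi.mean()
    # standard error from the empirical variance of the per-person terms
    std_error = phi.std(ddof=1) / np.sqrt(len(phi))
    return estimate, std_error
```

Calling the function on the rows belonging to a single subgroup gives that subgroup's estimate; repeating the call over different subsets of individuals is one way such an estimator could be combined with a sensitivity analysis of the kind described here.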

Conclusions: Our proposed sensitivity analysis is straightforward to implement, provides both graphical and numerical summaries, and readily permits assessing the sensitivity of any machine learning-based causal effect estimator to classification uncertainty. We recommend making such sensitivity analyses more routine in latent subgroup effect analyses.

Keywords: Causal inference; Finite mixture models; Latent class analysis; Parametric bootstrap; Perturbed confidence interval; Sensitivity analysis; Subgroup average treatment effect (ATE).

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bias
Causality
Humans
Monte Carlo Method
Propensity Score
Uncertainty*