Cluster Randomized Trials Designed to Support Generalizable Inferences

Sarah E Robertson; Jon A Steingrimsson; Issa J Dahabreh

doi:10.1177/0193841X231169557

Eval Rev. 2024 Jan 17:193841X231169557. doi: 10.1177/0193841X231169557. Online ahead of print.

Authors

Sarah E Robertson¹ ², Jon A Steingrimsson³, Issa J Dahabreh¹ ² ⁴

Affiliations

¹ CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
² Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
³ Department of Biostatistics, Brown University School of Public Health, Providence, RI, USA.
⁴ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

PMID: 38234059

Abstract

When planning a cluster randomized trial, evaluators often have access to an enumerated cohort representing the target population of clusters. Practicalities of conducting the trial, such as the need to oversample clusters with certain characteristics in order to improve trial economy or support inferences about subgroups of clusters, may preclude simple random sampling from the cohort into the trial, and thus interfere with the goal of producing generalizable inferences about the target population. We describe a nested trial design where the randomized clusters are embedded within a cohort of trial-eligible clusters from the target population and where clusters are selected for inclusion in the trial with known sampling probabilities that may depend on cluster characteristics (e.g., allowing clusters to be chosen to facilitate trial conduct or to examine hypotheses related to their characteristics). We develop and evaluate methods for analyzing data from this design to generalize causal inferences to the target population underlying the cohort. We present identification and estimation results for the expectation of the average potential outcome and for the average treatment effect, in the entire target population of clusters and in its non-randomized subset. In simulation studies, we show that all the estimators have low bias but markedly different precision. Cluster randomized trials where clusters are selected for inclusion with known sampling probabilities that depend on cluster characteristics, combined with efficient estimation methods, can precisely quantify treatment effects in the target population, while addressing objectives of trial conduct that require oversampling clusters on the basis of their characteristics.
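To make the design idea concrete, the following is a minimal sketch of how known sampling probabilities can be used to reweight trial clusters back to the target population. It is an illustration under stated assumptions, not the estimators developed in the article: the data-generating model, the single binary cluster characteristic `x`, the sampling probabilities (0.5 and 0.1), the effect sizes, and all function names are hypothetical. Each cluster contributes one cluster-level outcome, and trial clusters are weighted by the inverse of their known sampling probability.

```python
import random

# Hypothetical target-population effect: 30% of clusters have x = 1 (effect 2.0),
# 70% have x = 0 (effect 1.0). These numbers are assumptions for illustration.
TRUE_ATE = 0.3 * 2.0 + 0.7 * 1.0


def simulate_cohort(n_clusters, seed):
    """Simulate an enumerated cohort of clusters with a nested randomized trial."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n_clusters):
        x = 1 if rng.random() < 0.3 else 0
        # Known sampling probability that depends on x (clusters with x = 1 are oversampled).
        p_sample = 0.5 if x == 1 else 0.1
        in_trial = rng.random() < p_sample
        record = {"x": x, "p_sample": p_sample, "in_trial": in_trial,
                  "treated": None, "y": None}
        if in_trial:
            # Treatment is randomized only among sampled clusters.
            treated = rng.random() < 0.5
            effect = 2.0 if x == 1 else 1.0
            record["treated"] = treated
            # Cluster-level outcome, observed only for trial clusters.
            record["y"] = 1.0 + 0.5 * x + effect * treated + rng.gauss(0.0, 1.0)
        cohort.append(record)
    return cohort


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ate_estimates(cohort):
    """Return (unweighted, sampling-weighted) differences in arm means."""
    arms = {True: [], False: []}
    for c in cohort:
        if c["in_trial"]:
            arms[c["treated"]].append(c)

    def arm_mean(arm, use_weights):
        # Weight each trial cluster by 1 / (known sampling probability).
        weights = [1.0 / c["p_sample"] if use_weights else 1.0 for c in arm]
        return weighted_mean([c["y"] for c in arm], weights)

    unweighted = arm_mean(arms[True], False) - arm_mean(arms[False], False)
    weighted = arm_mean(arms[True], True) - arm_mean(arms[False], True)
    return unweighted, weighted


if __name__ == "__main__":
    unweighted, weighted = ate_estimates(simulate_cohort(20000, seed=1))
    print("target ATE:", TRUE_ATE)
    print("unweighted trial estimate:", unweighted)
    print("sampling-weighted estimate:", weighted)
```

In this toy setup the unweighted contrast reflects the oversampled trial composition, whereas the weighted contrast targets the whole cohort. The normalized (ratio-type) weighting used here is one simple choice; the abstract notes that estimators for this design can differ markedly in precision.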

Keywords: causal inference; cluster randomized trials; design; generalizability; interference; transportability.