Practitioners are interested not only in the average causal effect of the treatment on the outcome but also in the underlying causal mechanism in the presence of an intermediate variable between the treatment and outcome. However, in many cases we cannot randomize the intermediate variable, resulting in sample selection problems even in randomized experiments. Therefore, we view randomized experiments with intermediate variables as semi-observational studies. In parallel with the analysis of observational studies, we provide a theoretical foundation for conducting objective causal inference with an intermediate variable under the principal stratification framework, with principal strata defined as the joint potential values of the intermediate variable. Our strategy constructs weighted samples based on principal scores, defined as the conditional probabilities of the latent principal strata given covariates, without access to any outcome data. This principal stratification analysis yields robust causal inference without relying on any model assumptions on the outcome distributions. We also propose approaches to conducting sensitivity analysis for violations of the ignorability and monotonicity assumptions, the crucial but untestable identification assumptions in our theory. When the assumptions required by the classical instrumental variable analysis cannot be justified by background knowledge or cannot be made because of the scientific questions of interest, our strategy serves as a useful alternative tool for dealing with intermediate variables. We illustrate our methodologies with two real data examples and reach scientifically meaningful conclusions.
(Angrist et al. 1996). When the intermediate variable is the indicator for survival status, the only sensible subgroup causal effect on the outcome is the one for units who would survive under both treatment and control (Rubin 2006).
When the intermediate variable is a surrogate for the outcome, we want to predict the causal effect on the outcome from the causal effect on the surrogate. An ideal surrogate must satisfy causal necessity, meaning that a zero effect on the surrogate implies a zero effect on the outcome (Frangakis and Rubin 2002), and causal sufficiency, meaning that a positive effect on the surrogate implies a positive effect on the outcome (Gilbert and Hudgens 2008). Because both requirements are statements about units grouped by their joint potential values of the surrogate, we can assess them by conducting a principal stratification analysis.
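As an illustration, the two surrogate requirements can be written in potential-outcome notation. The notation here is an assumption introduced for this sketch and is not taken from the text above: $Z$ denotes the binary treatment, $S(z)$ and $Y(z)$ the potential values of a binary surrogate and of the outcome under treatment level $z$, and the conditions are stated in an average-effect form, which is only one of several possible formalizations.

```latex
% Causal necessity (average form): no effect on the surrogate
% implies no average effect on the outcome.
\begin{equation*}
  E\{ Y(1) - Y(0) \mid S(1) = S(0) = s \} = 0, \qquad s = 0, 1 .
\end{equation*}

% Causal sufficiency (average form): a positive effect on the surrogate
% implies a positive average effect on the outcome.
\begin{equation*}
  E\{ Y(1) - Y(0) \mid S(1) = 1, \, S(0) = 0 \} > 0 .
\end{equation*}
```

Both conditions condition on the pair $\{S(1), S(0)\}$, that is, on a principal stratum, which is why a principal stratification analysis is the natural way to check them.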
Principal stratification clarifies causal inference with intermediate variables, but it also results in inferential difficulties because of the missingness of the principal stratification variable and the consequent mixture distributions of the observed data. We can sharpen inference about causal effects within principal strata only if we impose some of the following structural or modeling assumptions: (1) monotonicity, that is, the treatment has a nonnegative effect on the intermediate variable for each unit (e.g., Angrist et al. 1996; Gilbert and Hudgens 2008); (2) et ...