Randomized experiments are an excellent tool for estimating internally valid causal effects in the sample at hand, but their external validity is frequently questioned. Classical results on the estimation of Population Average Treatment Effects (PATE) implicitly assume random selection into experiments, an assumption that typically fails in medical, social-scientific, and industry experiments. When the experimental sample differs from the target sample along observable or unobservable dimensions (termed covariate shift in the causal learning literature), experimental estimates may be of limited use for policy decisions. We cast this as a sample selection problem and propose methods to re-weight the doubly robust scores from experimental subjects to estimate treatment effects in the overall sample (=: generalization) or in an alternate target sample (=: transportation). We implement these estimators in the open-source package causalTransportR, illustrate their performance in a simulation study, and discuss diagnostics to evaluate that performance.
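To make the selection problem concrete, the following is a minimal simulation sketch of the re-weighting idea. It is not the causalTransportR API. It assumes a single binary covariate, a binary treatment randomized with probability 0.5, and *known* selection probabilities; in practice these would be estimated. All function names and the data-generating process are hypothetical and chosen purely for illustration.

```python
import random


def simulate(n, seed=0):
    """Hypothetical DGP: binary X, selection depends on X, effect is 1 + 2X."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_sel = 0.8 if x == 1 else 0.2           # assumed Pr(S = 1 | X)
        s = 1 if rng.random() < p_sel else 0
        if s == 1:
            a = 1 if rng.random() < 0.5 else 0   # randomized treatment
            y = a * (1.0 + 2.0 * x) + rng.gauss(0.0, 1.0)
        else:
            a, y = None, None                    # A and Y unobserved when S = 0
        rows.append((x, s, a, y))
    return rows


def cell_means(study):
    """Saturated outcome model: mean of Y within each (x, a) cell."""
    sums = {}
    for x, _, a, y in study:
        total, count = sums.get((x, a), (0.0, 0))
        sums[(x, a)] = (total + y, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


def dr_scores(study, mu, e=0.5):
    """Doubly robust (AIPW) score for the contrast a = 1 vs a = 0."""
    scores = []
    for x, _, a, y in study:
        m1, m0 = mu[(x, 1)], mu[(x, 0)]
        if a == 1:
            correction = (y - m1) / e
        else:
            correction = -(y - m0) / (1.0 - e)
        scores.append(m1 - m0 + correction)
    return scores


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


rows = simulate(100_000, seed=1)
study = [r for r in rows if r[1] == 1]
psi = dr_scores(study, cell_means(study))

p_sel = {0: 0.2, 1: 0.8}                         # oracle selection probabilities
w_gen = [1.0 / p_sel[r[0]] for r in study]                      # inverse selection probability
w_trans = [(1.0 - p_sel[r[0]]) / p_sel[r[0]] for r in study]    # inverse selection odds

naive = sum(psi) / len(psi)                      # population target: 2.6 (study sample)
generalized = weighted_mean(psi, w_gen)          # population target: 2.0 (overall sample)
transported = weighted_mean(psi, w_trans)        # population target: 1.4 (external sample)

print(f"naive={naive:.3f} generalized={generalized:.3f} transported={transported:.3f}")
```

Under this data-generating process the unweighted study-sample estimate targets 2.6, while the overall-sample and external-sample effects are 2.0 and 1.4, so the gap between the three estimates is entirely due to selection on the covariate.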
Methods

We observe $n$ iid copies of $(X_i, S_i, S_i A_i, S_i Y_i)_{i=1}^{n}$, where covariates $X_i \in \mathbb{R}^p$, treatment $A_i \in \mathcal{A} := \{0, \dots, K\}$, outcome $Y_i \in \mathbb{R}$, and the selection indicator $S_i \in \{0, 1\}$ is a function of pre-treatment variables and is not affected by treatment. In other words, we observe $(X_i, A_i, Y_i)_{i=1}^{N_1}$ for observations with $S_i = 1$ (henceforth the study sample $\mathcal{S}_1$), and only $(X_i)_{i=N_1+1}^{N}$ for observations with $S_i = 0$ (henceforth the external sample $\mathcal{S}_0$). The overall sample is $\mathcal{S} := \mathcal{S}_1 \cup \mathcal{S}_0$.

Estimands. We write counterfactual means as $\phi = \mathbb{E}[Y^{a, S=1}]$ for generalizability and $\mathbb{E}[Y^{a} \mid S = 0]$ for transportability; contrasts between such counterfactual means under any two treatment levels $a, a'$ represent the average treatment effects (ATE). 'Standard' estimation of effects in the study sample under unconfoundedness is a well-studied and largely resolved problem (see [10] for a review). We study the generalization and transportation problems in the present paper. To this end, we make the following assumptions:

(1) Consistency / SUTVA:
(2) Ignorability of Treatment: $Y^{0}, \dots, Y^{a} \perp\!\!\!\perp A \mid X = x, S = 1$
(3) Overlap
(a) Treatment overlap: $0 < \Pr(A = a \mid X = x, S = 1) < 1$
(b) Selection overlap: 0 < Pr