We propose a new criterion for confounder selection when the underlying causal structure is unknown and only limited knowledge is available. We assume all covariates being considered are pretreatment variables and that for each covariate it is known (i) whether the covariate is a cause of treatment, and (ii) whether the covariate is a cause of the outcome. The causal relationships the covariates have with one another are assumed unknown. We propose that control be made for any covariate that is either a cause of treatment or of the outcome or both. We show that irrespective of the actual underlying causal structure, if any subset of the observed covariates suffices to control for confounding then the set of covariates chosen by our criterion will also suffice. We show that other, commonly used, criteria for confounding control do not have this property. We use formal theory concerning causal diagrams to prove our result but the application of the result does not rely on familiarity with causal diagrams. An investigator need only ask, “Is the covariate a cause of the treatment?” and “Is the covariate a cause of the outcome?” If the answer to either question is “yes” then the covariate is included for confounder control. We discuss some additional covariate selection results that preserve unconfoundedness and that may be of interest when used with our criterion.
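The selection rule described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the `Covariate` class, the `select_confounders` function, and the example covariate names are hypothetical and do not come from the paper; the two boolean flags stand in for the investigator's answers to the two questions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Covariate:
    """A pretreatment covariate with the investigator's two yes/no answers."""
    name: str
    causes_treatment: bool  # "Is the covariate a cause of the treatment?"
    causes_outcome: bool    # "Is the covariate a cause of the outcome?"


def select_confounders(covariates):
    """Keep every covariate that causes the treatment, the outcome, or both."""
    return [c.name for c in covariates if c.causes_treatment or c.causes_outcome]


# Hypothetical covariates, invented purely for illustration.
covariates = [
    Covariate("age", causes_treatment=True, causes_outcome=True),
    Covariate("insurance_status", causes_treatment=True, causes_outcome=False),
    Covariate("baseline_severity", causes_treatment=False, causes_outcome=True),
    Covariate("record_id_parity", causes_treatment=False, causes_outcome=False),
]

selected = select_confounders(covariates)
print(selected)
```

Only the last covariate is dropped, since the answer to both questions is "no" for it. The sketch presumes, as the abstract does, that every input covariate is already known to be pretreatment.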
Whilst estimation of the marginal (total) causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years investigators have also become increasingly interested in mediation analysis. Specifically, upon evaluating the total effect of the exposure, investigators routinely wish to make inferences about the direct and indirect pathways of that effect, that is, the effect of the exposure not through, or through, a mediator variable that occurs after the exposure and before the outcome. Although powerful semiparametric methodologies that produce double robust and highly efficient estimates of the marginal total causal effect have been developed to analyze observational studies, similar methods for mediation analysis are currently lacking. Thus, this paper develops a general semiparametric framework for obtaining inferences about so-called marginal natural direct and indirect causal effects, while appropriately accounting for a large number of pre-exposure confounding factors for the exposure and the mediator variables. Our analytic framework is particularly appealing because it gives new insights on issues of efficiency and robustness in the context of mediation analysis. In particular, we propose new multiply robust locally efficient estimators of the marginal natural indirect and direct causal effects, and develop a novel double robust sensitivity analysis framework for the assumption of ignorability of the mediator variable.
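For orientation, the marginal natural direct and indirect effects named in this abstract are usually written in counterfactual notation as below. The symbols are assumptions introduced here, not taken from the abstract: $A$ is the exposure with levels $a$ and $a^{*}$, $M(a)$ is the mediator under exposure $a$, and $Y(a, m)$ is the outcome under exposure $a$ and mediator value $m$. This is the standard textbook decomposition and may differ in notation from the paper's own.

```latex
\begin{align*}
\text{Total effect:} \quad
  & E\{Y(a, M(a))\} - E\{Y(a^{*}, M(a^{*}))\} \\
\text{Natural direct effect:} \quad
  & E\{Y(a, M(a^{*}))\} - E\{Y(a^{*}, M(a^{*}))\} \\
\text{Natural indirect effect:} \quad
  & E\{Y(a, M(a))\} - E\{Y(a, M(a^{*}))\}
\end{align*}
```

Adding the direct and indirect effects cancels the cross term $E\{Y(a, M(a^{*}))\}$ and recovers the total effect.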
Questions concerning mediated causal effects are of great interest in psychology, cognitive science, medicine, social science, public health, and many other disciplines. For instance, about 60% of recent papers published in leading journals in social psychology contain at least one mediation test (Rucker, Preacher, Tormala, & Petty, 2011). Standard parametric approaches to mediation analysis employ regression models, and either the "difference method" (Judd & Kenny, 1981), more common in epidemiology, or the "product method" (Baron & Kenny, 1986), more common in the social sciences. In this article, we first discuss a known, but perhaps often unappreciated, fact that these parametric approaches are a special case of a general counterfactual framework for reasoning about causality first described by Neyman (1923) and Rubin (1974) and linked to causal graphical models by Robins (1986) and Pearl (2000). We then show a number of advantages of this framework. First, it makes the strong assumptions underlying mediation analysis explicit. Second, it avoids a number of problems present in the product and difference methods, such as biased estimates of effects in certain cases. Finally, we show the generality of this framework by proving a novel result which allows mediation analysis to be applied to longitudinal settings with unobserved confounders.
Keywords: Causal inference; Counterfactuals; Mediation analysis; Longitudinal studies; Direct and indirect effects; Path-specific effects; Graphical models
The aim of empirical research in many disciplines is establishing the presence of effects by means of either randomized trials or observational studies if randomization is not possible. For example, a celebrated success of empirical research in epidemiology is the discovery of a causal connection between smoking and lung cancer (Doll & Hill, 1950). Once the presence of an effect is established, the precise mechanism of the effect becomes a topic of interest as well.
A particularly popular type of mechanism analysis
Correspondence should be sent to Ilya Shpitser, Mathematical Sciences,
Summary: The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The causal inference literature has not, however, produced a clear formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether under the candidate definition control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate C for which there exists a set of other covariates X such that the effect of the exposure on the outcome is unconfounded conditional on (X, C) but such that for no proper subset of (X, C) is the effect of the exposure on the outcome unconfounded given the subset. We propose referring to a variable that helps reduce but not eliminate bias as a “surrogate confounder.”
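The proposed definition can be restated symbolically. The notation is an assumption introduced here rather than taken from the abstract: $A$ is the exposure, $Y_a$ is the counterfactual outcome under exposure level $a$, and "unconfounded conditional on a set $S$" is read as $Y_a \perp A \mid S$ for every $a$.

```latex
\begin{align*}
& C \text{ is a confounder if there exists a set of pre-exposure covariates } X \text{ such that} \\
& \quad \text{(1)}\;\; Y_a \perp A \mid (X, C) \quad \text{for all } a, \text{ and} \\
& \quad \text{(2)}\;\; \text{there is no proper subset } T \subsetneq (X, C)
    \text{ with } Y_a \perp A \mid T \text{ for all } a.
\end{align*}
```

Condition (2) is the minimality requirement: no member of $(X, C)$, including $C$ itself, can be dropped without losing unconfoundedness.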
Methods for inferring average causal effects have traditionally relied on two key assumptions: (i) the intervention received by one unit cannot causally influence the outcome of another; and (ii) units can be organized into non-overlapping groups such that outcomes of units in separate groups are independent. In this paper, we develop new statistical methods for causal inference based on a single realization of a network of connected units for which neither assumption (i) nor (ii) holds. The proposed approach allows both for arbitrary forms of interference, whereby the outcome of a unit may depend on interventions received by other units with whom a network path through connected units exists; and for long-range dependence, whereby outcomes for any two units likewise connected by a path in the network may be dependent. Under network versions of consistency and no unobserved confounding, inference is made tractable by an assumption that the network's outcome, treatment and covariate vectors are a single realization of a certain chain graph model. This assumption allows inferences about various network causal effects via the auto-g-computation algorithm, a network generalization of Robins' well-known g-computation algorithm previously described for causal inference under assumptions (i) and (ii).
In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are "sensitive," in the sense of having the potential to create discrimination. We argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view that generalizes Pearl (2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.