An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, Invariant Causal Prediction (ICP) (Peters et al., 2016) has been proposed, which learns a causal model by exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure the "invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure, and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modeling, which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates. arXiv:1706.08576v2 [stat.ME] 19 Sep 2018. Structural causal models: Assume an underlying structural causal model (also called structural equation model) (e.g. Pearl, 2009)
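As a minimal sketch of the procedure described above (fit one nonlinear model on the pooled data, then test whether the residual distributions differ across environments), the following Python example may help. It is an illustration under stated assumptions, not the authors' implementation: the k-nearest-neighbour smoother, the one-vs-rest Kolmogorov-Smirnov statistic, the permutation p-value, and all function names are choices made here for the sake of a self-contained example.

```python
import math
import random
import statistics


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def fit_pooled_knn(rows, y, k=20):
    """Fitted values of a k-NN smoother on the pooled data (assumed stand-in model)."""
    if len(rows[0]) == 0:  # empty predictor set: predict the pooled mean
        mean_y = statistics.fmean(y)
        return [mean_y] * len(y)
    fitted = []
    for r in rows:
        order = sorted(
            range(len(rows)),
            key=lambda i: sum((u - v) ** 2 for u, v in zip(rows[i], r)),
        )
        fitted.append(statistics.fmean([y[i] for i in order[:k]]))
    return fitted


def invariant_residual_test(rows, y, env, n_perm=199, k=20, seed=0):
    """Permutation p-value for 'residual distribution is the same in all environments'."""
    fitted = fit_pooled_knn(rows, y, k)
    resid = [yi - fi for yi, fi in zip(y, fitted)]

    def stat(labels):
        # Largest one-vs-rest KS distance over the environments.
        best = 0.0
        for e in set(labels):
            inside = [r for r, l in zip(resid, labels) if l == e]
            outside = [r for r, l in zip(resid, labels) if l != e]
            best = max(best, ks_statistic(inside, outside))
        return best

    rng = random.Random(seed)
    observed = stat(env)
    labels = list(env)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if stat(labels) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Toy data: X is intervened on in environment 1; Y depends on X in the same way everywhere.
rng = random.Random(1)
env = [0] * 200 + [1] * 200
x = [rng.gauss(0.0 if e == 0 else 2.5, 1.0) for e in env]
y = [2.0 * math.tanh(xi) + rng.gauss(0.0, 0.3) for xi in x]

p_parent = invariant_residual_test([(xi,) for xi in x], y, env)  # candidate set {X}
p_empty = invariant_residual_test([() for _ in x], y, env)       # candidate set {}
print(p_parent, p_empty)
```

In this toy setting the empty set should be rejected, because the marginal distribution of Y shifts with the intervention on X. In an ICP-style procedure, sets that are not rejected would be accepted and the estimated parents taken as the intersection of accepted sets.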
Graphical models can represent a multivariate distribution in a convenient and accessible form as a graph. Causal models can be viewed as a special class of graphical models that not only represent the distribution of the observed system but also the distributions under external interventions. They hence enable predictions under hypothetical interventions, which is important for decision making. The challenging task of learning causal models from data always relies on some underlying assumptions. We discuss several recently proposed structure learning algorithms and their assumptions, and compare their empirical performance under various scenarios.
When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) 'core' or 'conditionally invariant' features $$X^{\mathrm{core}}$$ whose distribution $$X^{\mathrm{core}}\vert Y$$, conditional on the class Y, does not change substantially across domains and (ii) 'style' features $$X^{\mathrm{style}}$$ whose distribution $$X^{\mathrm{style}}\vert Y$$ can change substantially across domains. Examples for style features include position, rotation, image quality or brightness but also more complex ones like hair color, image quality or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or "ID variable". In some applications we know, for example, that two images show the same person, and ID then refers to the identity of the person. The proposed method requires only a small fraction of images to have ID information. We group observations if they share the same class and identifier (Y, ID) = (y, id) and penalize the conditional variance of the prediction or the loss if we condition on (Y, ID). Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color while we also look at more complex changes such as changes in movement and posture.
When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) ‘core’ or ‘conditionally invariant’ features $$C$$ whose distribution $$C\vert Y$$, conditional on the class Y, does not change substantially across domains and (ii) ‘style’ features $$S$$ whose distribution $$S\vert Y$$ can change substantially across domains. Examples for style features include position, rotation, image quality or brightness but also more complex ones like hair color, image quality or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or “$$\mathrm {ID}$$ variable”. In some applications we know, for example, that two images show the same person, and $$\mathrm {ID}$$ then refers to the identity of the person. The proposed method requires only a small fraction of images to have $$\mathrm {ID}$$ information. We group observations if they share the same class and identifier $$(Y,\mathrm {ID})=(y,\mathrm {id})$$ and penalize the conditional variance of the prediction or the loss if we condition on $$(Y,\mathrm {ID})$$. Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables in a partially linear structural equation model. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color while we also look at more complex changes such as changes in movement and posture.
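The grouping step and the conditional variance penalty described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the use of the population variance, the choice to skip groups with a single member, and the equal weighting of groups are assumptions made here, not details taken from the abstract.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def core_penalty(predictions, labels, ids):
    """Average within-group variance of predictions, grouping by (class, ID).

    Observations whose ID is None (no identifier observed) are ignored,
    reflecting that only a fraction of the data carries ID information.
    """
    groups = defaultdict(list)
    for pred, label, ident in zip(predictions, labels, ids):
        if ident is not None:
            groups[(label, ident)].append(pred)
    # Assumption: singleton groups carry no variance information and are skipped.
    variances = [pvariance(g) for g in groups.values() if len(g) > 1]
    return fmean(variances) if variances else 0.0


def core_objective(losses, predictions, labels, ids, lam):
    """Mean loss plus lam times the conditional variance penalty."""
    return fmean(losses) + lam * core_penalty(predictions, labels, ids)


# Two images of person "a" (class 1) receive different predictions and are penalized.
preds = [0.2, 0.4, 0.9, 0.5]
labels = [1, 1, 0, 0]
ids = ["a", "a", "b", None]
print(core_penalty(preds, labels, ids))
print(core_objective([1.0, 1.0, 1.0, 1.0], preds, labels, ids, lam=10.0))
```

In a neural network setting the same quantity would be computed on mini-batches and differentiated with respect to the network weights; the pure-Python version here only shows which observations are grouped and what is penalized.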
Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods to a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagreed with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.
A fundamental difficulty of causal learning is that causal models can generally not be fully identified based on observational data only. Interventional data, that is, data originating from different experimental environments, improves identifiability. However, the improvement depends critically on the target and nature of the interventions carried out in each experiment. Since in real applications experiments tend to be costly, there is a need to perform the right interventions such that as few as possible are required. In this work we propose a new active learning (i.e. experiment selection) framework (A-ICP) based on Invariant Causal Prediction (ICP) [27]. For general structural causal models, we characterize the effect of interventions on so-called stable sets, a notion introduced by [30]. We leverage these results to propose several intervention selection policies for A-ICP which quickly reveal the direct causes of a response variable in the causal graph while maintaining the error control inherent in ICP. Empirically, we analyze the performance of the proposed policies in both population and finite-regime experiments.
Abstract. A key challenge in climate science is to quantify the forced response in impact-relevant variables such as precipitation against the background of internal variability, both in models and observations. Dynamical adjustment techniques aim to remove unforced variability from a target variable by identifying patterns associated with circulation, thus effectively acting as a filter for dynamically induced variability. The forced contributions are interpreted as the variation that is unexplained by circulation. However, dynamical adjustment of precipitation at local scales remains challenging because of large natural variability and the complex, nonlinear relationship between precipitation and circulation, particularly in heterogeneous terrain. Building on variational autoencoders, we introduce a novel statistical model – the Latent Linear Adjustment Autoencoder (LLAAE) – that enables estimation of the contribution of a coarse-scale atmospheric circulation proxy to daily precipitation at high resolution and in a spatially coherent manner. To predict circulation-induced precipitation, the Latent Linear Adjustment Autoencoder combines a linear component, which models the relationship between circulation and the latent space of an autoencoder, with the autoencoder's nonlinear decoder. The combination is achieved by imposing an additional penalty in the cost function that encourages linearity between the circulation field and the autoencoder's latent space, hence leveraging robustness advantages of linear models as well as the flexibility of deep neural networks. We show that our model predicts realistic daily winter precipitation fields at high resolution based on a 50-member ensemble of the Canadian Regional Climate Model at 12 km resolution over Europe, capturing, for instance, key orographic features and geographical gradients.
When the Latent Linear Adjustment Autoencoder is used to remove the dynamic component of precipitation variability, forced thermodynamic components are expected to remain in the residual, which enables the uncovering of forced precipitation patterns of change from just a few ensemble members. We extend this to quantify the forced pattern of change conditional on specific circulation regimes. Future applications could include, for instance, weather generators emulating climate model simulations of regional precipitation, detection and attribution at subcontinental scales, or statistical downscaling and transfer learning between models and observations to exploit the typically much larger sample size in models compared to observations.
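One way to picture the penalized cost function described in this abstract is the following schematic form. It is a sketch only: the symbols and the squared-error form of the penalty are assumptions introduced here for illustration and are not taken from the abstract.

$$\mathcal{L}(\theta,\phi,A) \;=\; \mathcal{L}_{\mathrm{AE}}(x;\theta,\phi) \;+\; \lambda\,\bigl\lVert z_{\phi}(x) - A\,c \bigr\rVert_2^2, \qquad \hat{x}_{\mathrm{circ}} \;=\; g_{\theta}(A\,c)$$

Here $$x$$ stands for a daily precipitation field, $$c$$ for the coarse-scale circulation proxy, $$z_{\phi}(x)$$ for the encoder's latent representation, $$g_{\theta}$$ for the nonlinear decoder, $$A$$ for the linear map from circulation to the latent space, $$\mathcal{L}_{\mathrm{AE}}$$ for the usual (variational) autoencoder objective, and $$\lambda$$ for the weight of the linearity penalty. Under this reading, the circulation-induced precipitation is the decoded linear prediction $$\hat{x}_{\mathrm{circ}}$$, and the residual $$x - \hat{x}_{\mathrm{circ}}$$ is where forced thermodynamic components would be expected to remain.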