Main approaches for learning Bayesian networks can be classified as constraint-based, score-based or hybrid methods. Although high-dimensional consistency results are available for constraint-based methods like the PC algorithm, such results have not been proved for score-based or hybrid methods, and most of the hybrid methods have not even been shown to be consistent in the classical setting where the number of variables remains fixed and the sample size tends to infinity. In this paper, we show that consistency of hybrid methods based on greedy equivalence search (GES) can be achieved in the classical setting with adaptive restrictions on the search space that depend on the current state of the algorithm. Moreover, we prove consistency of GES and adaptively restricted GES (ARGES) in several sparse high-dimensional settings. ARGES scales well to sparse graphs with thousands of variables and our simulation study indicates that both GES and ARGES generally outperform the PC algorithm.
The lung cancer dataset is available from Gene Expression Omnibus (accession GSE43580). The maPredictDSC R package implementing the approach of the best overall team is available at www.bioconductor.org or http://bioinformaticsprb.med.wayne.edu/.
We consider the estimation of joint causal effects from observational data. In particular, we propose new methods to estimate the effect of multiple simultaneous interventions (e.g., multiple gene knockouts), under the assumption that the observational data come from an unknown linear structural equation model with independent errors. We derive asymptotic variances of our estimators when the underlying causal structure is partly known, as well as high-dimensional consistency when the causal structure is fully unknown and the joint distribution is multivariate Gaussian. We also propose a generalization of our methodology to the class of nonparanormal distributions. We evaluate the estimators in simulation studies and also illustrate them on data from the DREAM4 challenge.
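To make the notion of a joint causal effect concrete, the following minimal sketch computes the population-level effect of a joint intervention in a linear structural equation model whose edge weights are assumed to be known. It is not the estimator proposed in the abstract above, which works from observational data with an unknown or partly known structure. The function name `joint_effect`, the weight matrix `B` (with `B[i, j]` the weight of edge i -> j), and the three-variable example are all illustrative assumptions.

```python
import numpy as np


def joint_effect(B, targets, outcome):
    """Effect of do(X_targets = x) on E[X_outcome] in a known linear SEM.

    Assumed model (illustrative): X_j = sum_i B[i, j] * X_i + e_j with
    independent errors, and B the weight matrix of a DAG.
    Returns one coefficient per intervened variable.
    """
    B = np.asarray(B, dtype=float)
    p = B.shape[0]

    # Intervening on a variable removes all edges pointing into it.
    B_do = B.copy()
    B_do[:, targets] = 0.0

    # (I - B_do)^{-1}[i, j] sums the products of edge weights over all
    # directed paths from i to j in the intervened graph.
    path_sums = np.linalg.inv(np.eye(p) - B_do)
    return path_sums[targets, outcome]


# Hypothetical DAG: 0 -> 1 (0.5), 1 -> 2 (2.0), 0 -> 2 (1.0)
B = np.zeros((3, 3))
B[0, 1] = 0.5
B[1, 2] = 2.0
B[0, 2] = 1.0

# Intervening on variable 0 alone: direct path plus the path through 1.
single = joint_effect(B, [0], 2)    # 1.0 + 0.5 * 2.0 = 2.0

# Intervening on 0 and 1 jointly: the path 0 -> 1 -> 2 is blocked,
# so variable 0 contributes only its direct effect.
joint = joint_effect(B, [0, 1], 2)  # (1.0, 2.0)

print(single, joint)
```

In this toy example the coefficient of variable 0 drops from 2.0 to 1.0 once variable 1 is also held fixed, which is why joint effects cannot in general be read off from single-intervention effects.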
Summary
We introduce a new method to estimate the Markov equivalence class of a directed acyclic graph (DAG) in the presence of hidden variables, in settings where the underlying DAG among the observed variables is sparse, and there are a few hidden variables that have a direct effect on many of the observed variables. Building on the so‐called low rank plus sparse framework, we suggest a two‐stage approach which first removes the effect of the hidden variables and then estimates the Markov equivalence class of the underlying DAG under the assumption that there are no remaining hidden variables. This approach is consistent in certain high dimensional regimes and performs favourably when compared with the state of the art, in terms of both graphical structure recovery and total causal effect estimation.