Learning Markov boundaries from data without first learning a Bayesian network can be viewed as a feature subset selection problem, and it has received much attention because of its importance across a wide range of AI applications. Popular constraint-based methods suffer from high computational complexity and are usually unstable in high-dimensional spaces. We propose a new perspective, based on matroid theory, on the discovery of Markov boundaries of the random variables in a domain, and we develop a greedy learning algorithm that is guaranteed to recover the true Markov boundaries. To make the algorithm feasible for large-scale problems, we then use the precision matrix of the original distribution as a measure of independence. This is essentially a Gaussian approximation of the probabilistic relations, and it can identify candidate variables for the Markov boundaries at low computational cost. Experimental results on standard Bayesian networks show that our analysis and approximation can efficiently and accurately identify Markov boundaries in complex networks from data.
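As a rough illustration of the precision-matrix idea, the sketch below relies on a standard fact about Gaussian models: two variables are conditionally independent given all the others exactly when the corresponding entry of the precision matrix is zero, so the nonzero entries in a variable's row mark its candidate Markov boundary. This is not the greedy matroid-based algorithm described above. The function names, the toy chain network, and the threshold of 0.1 are all assumptions made for the example.

```python
import numpy as np


def candidate_markov_boundary(data, target, threshold=0.1):
    """Return indices of variables whose absolute partial correlation with
    `target` exceeds `threshold` (hypothetical helper, Gaussian assumption)."""
    cov = np.cov(data, rowvar=False)      # empirical covariance, columns are variables
    precision = np.linalg.pinv(cov)       # pseudo-inverse tolerates near-singular covariances
    scale = np.sqrt(np.diag(precision))
    # Partial correlation of i and j given all other variables: -P_ij / sqrt(P_ii * P_jj)
    partial_corr = -precision / np.outer(scale, scale)
    scores = np.abs(partial_corr[target])
    scores[target] = 0.0                  # a variable is never in its own boundary
    return [int(j) for j in np.flatnonzero(scores > threshold)]


def sample_chain(n_samples, seed=0):
    """Draw samples from a toy linear-Gaussian chain X0 -> X1 -> X2 -> X3."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n_samples)
    x1 = 0.8 * x0 + rng.normal(size=n_samples)
    x2 = 0.8 * x1 + rng.normal(size=n_samples)
    x3 = 0.8 * x2 + rng.normal(size=n_samples)
    return np.column_stack([x0, x1, x2, x3])


if __name__ == "__main__":
    data = sample_chain(20000)
    for target in range(data.shape[1]):
        print(target, candidate_markov_boundary(data, target))
```

In the chain, each variable's Markov boundary is just its immediate neighbours, which is what the thresholded precision matrix should recover given enough samples. For non-Gaussian data the result is only an approximation, and the threshold would need tuning or replacing with a statistical test.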