We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution P on Σ n and a tree T on n nodes, we say T is an ε-approximate tree for P if there is a T -structured distribution Q such that D(P Q) is at most ε more than the best possible tree-structured distribution for P . We show that if P itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with O(|Σ| 3 nε −1 ) i.i.d. samples outputs an ε-approximate tree for P with constant probability. In contrast, for a general P (which may not be tree-structured), Ω(n 2 ε −2 ) samples are necessary to find an ε-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables X, Y, Z each over Σ, testing if I(X; Y | Z) is 0 or ≥ ε is possible with O(|Σ| 3 /ε) samples. Finally, we show that for a specific tree T , with O(|Σ| 2 nε −1 ) samples from a distribution P over Σ n , one can efficiently learn the closest T -structured distribution in KL divergence by applying the add-1 estimator at each node. * Höffgen's result is slightly different in that he doesn't use the plug-in estimator for mutual information.
Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance of two product distributions over the domain {0,1}^n. In particular, we establish the following results. 1. The problem of exactly computing the TV distance of two product distributions is #P-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals leading to efficient algorithms. 2. There is a fully polynomial-time deterministic approximation scheme (FPTAS) for computing the TV distance of two product distributions P and Q where Q is the uniform distribution. This result is extended to the case where Q has a constant number of distinct marginals. In contrast, we show that when P and Q are Bayes net distributions the relative approximation of their TV distance is NP-hard.
We consider the problem of efficiently inferring interventional distributions in a causal Bayesian network from a finite number of observations. Let P be a causal model on a set V of observable variables on a given causal graph G. For sets X, Y ⊆ V, and setting x to X, let P x (Y) denote the interventional distribution on Y with respect to an intervention x to variables X. Shpitser and Pearl (AAAI 2006), building on the work of Tian and Pearl (AAAI 2001), gave an exact characterization of the class of causal graphs for which the interventional distribution P x (Y) can be uniquely determined.We give the first efficient version of the Shpitser-Pearl algorithm. In particular, under natural assumptions, we give a polynomial-time algorithm that on input a causal graph G on observable variables V, a setting x of a set X ⊆ V of bounded size, outputs succinct descriptions of both an evaluator and a generator for a distribution P that is ε-close (in total variation distance) toWe also show that when Y is an arbitrary set, there is no efficient algorithm that outputs an evaluator of a distribution that is ε-close to P x (Y) unless all problems that have statistical zero-knowledge proofs, including the Graph Isomorphism problem, have efficient randomized algorithms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.