In this paper we investigate methods for learning hybrid Bayesian networks from data. First we utilize a kernel density estimate of the data in order to translate the data into a mixture of truncated basis functions (MoTBF) representation using a convex optimization technique. When utilizing a kernel density representation of the data, the estimation method relies on the specification of a kernel bandwidth. We show that in most cases the method is robust wrt. the choice of bandwidth, but for certain data sets the bandwidth has a strong impact on the result. Based on this observation, we propose an alternative learning method that relies on the cumulative distribution function of the data. Empirical results demonstrate the usefulness of the approaches: Even though the methods produce estimators that are slightly poorer than the state of the art (in terms of log-likelihood), they are significantly faster, and therefore indicate that the MoTBF framework can be used for inference and learning in reasonably sized domains. Furthermore, we show how a particular subclass of MoTBF potentials (learnable by the proposed methods) can be exploited to significantly reduce complexity during inference.
Abstract. Mixtures of Truncated Basis Functions (MoTBFs) have recently been proposed for modelling univariate and joint distributions in hybrid Bayesian networks. In this paper we analyse the problem of learning conditional MoTBF distributions from data. Our approach utilizes a new technique for learning joint MoTBF densities, then propose a method for using these to generate the conditional distributions. The main contribution of this work is conveyed through an empirical investigation into the properties of the new learning procedure, where we also compare the merits of our approach to those obtained by other proposals.
Mixtures of truncated basis functions have been recently proposed as a generalisation of mixtures of truncated exponentials and mixtures of polynomials for modelling univariate and conditional distributions in hybrid Bayesian networks. In this paper we analyse the problem of learning the parameters of marginal and conditional MoTBF densities when both prior knowledge and data are available. Incorporating prior knowledge provide a valuable tool for obtaining useful models, especially in domains of applications where data are costly or scarce, and prior knowledge is available from practitioners. We explore scenarios where the prior knowledge can be expressed as an MoTBF density that is afterwards combined with another MoTBF density estimated from the available data. The resulting model remains within the MoTBF class which is a convenient property from the point of view of inference in hybrid Bayesian networks. The performance of the proposed method is tested in a series of experiments carried out over synthetic and real data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.