The Bayesian approach to inference stands out for naturally allowing the borrowing of information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for clustering probability distributions is the nested Dirichlet process, which, however, has the drawback of grouping distributions into a single cluster when ties are observed across samples. With the goal of achieving a flexible and effective clustering method for both samples and observations, we investigate a nonparametric prior that arises as the composition of two different discrete random structures and derive a closed-form expression for the induced distribution of the random partition, the fundamental tool regulating the clustering behavior of the model. On the one hand, this allows us to gain deeper insight into the theoretical properties of the model; on the other hand, it yields an MCMC algorithm for evaluating Bayesian inferences of interest. Moreover, we single out limitations of this algorithm when working with more than two populations and, consequently, devise an alternative, more efficient sampling scheme which, as a by-product, allows testing homogeneity between different populations. Finally, we perform a comparison with the nested Dirichlet process.
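To make the two-layer clustering structure concrete, the following is a minimal generative sketch in Python of a nested Chinese restaurant process: one CRP partitions the populations into distributional clusters, and a separate CRP then partitions the observations pooled within each distributional cluster. All names and parameters are illustrative, and this mimics the marginal clustering behavior of the nested Dirichlet process rather than the composed prior studied here; note that populations in different distributional clusters can never share observational clusters, which is precisely the rigidity the proposed prior is designed to remove.

    import numpy as np

    rng = np.random.default_rng(0)

    def crp_assign(counts, alpha):
        # Existing table counts plus weight alpha for opening a new table;
        # returning len(counts) means "new table".
        probs = np.append(counts, alpha).astype(float)
        return rng.choice(len(probs), p=probs / probs.sum())

    def nested_crp(sample_sizes, alpha, beta):
        # Layer 1: CRP over populations -> distributional clusters.
        # Layer 2: within each distributional cluster, a CRP over its
        # pooled observations -> observational clusters.
        dist_counts, dist_of, obs_counts, labels = [], [], {}, []
        for n_j in sample_sizes:
            k = crp_assign(dist_counts, alpha)
            if k == len(dist_counts):
                dist_counts.append(0)
                obs_counts[k] = []
            dist_counts[k] += 1
            dist_of.append(k)
            lab = []
            for _ in range(n_j):
                c = crp_assign(obs_counts[k], beta)
                if c == len(obs_counts[k]):
                    obs_counts[k].append(0)
                obs_counts[k][c] += 1
                lab.append((k, c))
            labels.append(lab)
        return dist_of, labels

    print(nested_crp([5, 5, 5], alpha=1.0, beta=1.0))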
Dirichlet process mixtures are flexible nonparametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size increases, and more specifically focus on consistency for the unknown number of clusters when the observed data are generated from a finite mixture. Crucially, we consider the situation where a prior is placed on the concentration parameter of the underlying Dirichlet process. Previous findings in the literature suggest that Dirichlet process mixtures are typically not consistent for the number of clusters if the concentration parameter is held fixed and data come from a finite mixture. Here we show that consistency for the number of clusters can be achieved if the concentration parameter is adapted in a fully Bayesian way, as commonly done in practice. Our results are derived for data coming from a class of finite mixtures, with mild assumptions on the prior for the concentration parameter and for a variety of choices of likelihood kernels for the mixture.
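A quick numerical illustration of the mechanism at play, using only the standard Dirichlet process identity for the prior mean of the number of clusters K_n (a sketch, not the paper's argument): with the concentration parameter alpha held fixed, the expected number of clusters grows without bound, roughly like alpha*log(n), whereas placing a prior on alpha allows the posterior to shrink alpha as n grows so that the number of clusters can stabilize at the true value.

    import numpy as np

    def expected_clusters(alpha, n):
        # Prior mean of K_n under DP(alpha): sum_{i=1}^n alpha / (alpha + i - 1).
        i = np.arange(1, n + 1)
        return np.sum(alpha / (alpha + i - 1))

    for n in (10**2, 10**3, 10**4, 10**5):
        print(n, round(float(expected_clusters(1.0, n)), 2))
    # The output grows like log(n): under fixed alpha, extra small clusters
    # keep appearing as n increases, which is the intuition behind the
    # fixed-alpha inconsistency results; adapting alpha in a fully Bayesian
    # way removes this forced growth.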
Non-Gaussian state-space models arise in several applications, and within this framework the binary time series setting provides a relevant example. However, unlike Gaussian state-space models, where the filtering, predictive and smoothing distributions are available in closed form, binary state-space models require approximations or sequential Monte Carlo strategies for inference and prediction. This is due to the apparent absence of conjugacy between the Gaussian states and the likelihood induced by the observation equation for the binary data. In this article we prove that the filtering, predictive and smoothing distributions in dynamic probit models with Gaussian state variables are, in fact, available and belong to the class of unified skew-normal (SUN) distributions, whose parameters can be updated recursively in time via analytical expressions. The key functionals of these distributions are also available in principle, but their calculation requires the evaluation of multivariate Gaussian cumulative distribution functions. Leveraging SUN properties, we address this issue via novel Monte Carlo methods based on independent samples from the smoothing distribution, which can easily be adapted to the filtering and predictive cases, thus improving on state-of-the-art approximate and sequential Monte Carlo inference in small-to-moderate dimensional studies. Novel sequential Monte Carlo procedures that exploit SUN properties are also developed to handle online inference in high dimensions. Performance gains over competitors are outlined in a financial application.
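To illustrate the kind of i.i.d. sampling the SUN conjugacy enables, here is a minimal Python sketch for the simplest special case, the static probit with prior beta ~ N(0, nu2*I). The parameter expressions follow the static-probit SUN result of Durante (2019); the dynamic filtering and smoothing recursions update analogous quantities over time. The rejection step for the truncated normal is only viable for small n, and specialized truncated-normal samplers should be used otherwise; all function names here are illustrative.

    import numpy as np

    rng = np.random.default_rng(1)

    def sun_probit_draws(X, y, nu2, n_draws):
        # Posterior of beta ~ N(0, nu2*I) under P(y_i = 1 | beta) = Phi(x_i'beta)
        # is SUN; sample i.i.d. via its additive representation:
        # beta = omega * (V0 + Delta Gamma^{-1} V1), with V1 a truncated normal.
        n, p = X.shape
        D = (2 * y - 1)[:, None] * X                 # rows flipped by class sign
        S = nu2 * (D @ D.T) + np.eye(n)
        s = np.sqrt(np.diag(S))
        Gamma = S / np.outer(s, s)                   # correlation matrix of V1
        Delta = np.sqrt(nu2) * D.T / s               # p x n skewness matrix
        M = np.linalg.solve(Gamma, Delta.T)          # Gamma^{-1} Delta'
        L0 = np.linalg.cholesky(np.eye(p) - Delta @ M)
        draws = np.empty((n_draws, p))
        for t in range(n_draws):
            while True:                              # V1 ~ N(0, Gamma) on {v > 0}
                v1 = rng.multivariate_normal(np.zeros(n), Gamma)
                if np.all(v1 > 0):
                    break
            v0 = L0 @ rng.standard_normal(p)
            draws[t] = np.sqrt(nu2) * (v0 + M.T @ v1)
        return draws

    X = rng.standard_normal((5, 2))                  # toy data, n = 5, p = 2
    y = (X @ np.array([1.0, -0.5]) + rng.standard_normal(5) > 0).astype(int)
    print(sun_probit_draws(X, y, nu2=4.0, n_draws=3))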
We argue for the use of separate exchangeability as a modeling principle in Bayesian inference, especially for nonparametric Bayesian models. While separate and the closely related joint exchangeability are widely used in some areas, such as random graphs, and arise naturally in simple mixed models, they are curiously underused in other applications. We briefly review the definition of separate exchangeability and then discuss two specific models that implement it. The first concerns nested random partitions for a data matrix, defining a partition of columns and, nested within column clusters, partitions of rows. Many recently proposed models for nested partitions implement partially exchangeable models; we argue that inference under such models in some cases ignores important features of the experimental setup. The second concerns setting up separately exchangeable priors for a nonparametric regression model when multiple sets of experimental units are involved.
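For reference, the standard definition being reviewed, written out in LaTeX (notation schematic): a random array is separately exchangeable when its law is invariant under independent permutations of rows and columns.

    % Separate exchangeability of a random array (X_{ij}):
    \[
      (X_{ij})_{i,j \ge 1} \;\overset{d}{=}\; (X_{\sigma(i)\,\tau(j)})_{i,j \ge 1}
      \qquad \text{for all finite permutations } \sigma, \tau,
    \]
    % whereas joint exchangeability requires only invariance under a single
    % permutation acting on both indices,
    % (X_{ij}) \overset{d}{=} (X_{\sigma(i)\,\sigma(j)}).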
The ability to learn novel speech category contrasts is an important skill in second language learning. There is substantial individual variability in the ability to learn perceptual categories. Prior research demonstrates that higher working memory capacity is associated with better initial category acquisition, typically assessed within a single session of learning. There is mixed evidence for the role of working memory in speech category learning and the underlying mechanisms are not well understood. To better understand the role of working memory in speech category learning beyond initial acquisition, we trained participants on non-native Mandarin tone speech categories across three sessions, separated by one and two months, respectively. Examining all participants together, we found that working memory was associated with better speech category learning from initial acquisition to learning months later. However, when considering only participants who performed at above-chance levels in the task, we found that working memory was positively related to performance only in initial sessions and not in later learning. Working memory was positively associated with generalization of category knowledge to novel talkers and was unrelated to maintenance of category knowledge across sessions. Using longitudinal drift diffusion mixed models, we found that higher working memory was associated with more efficient evidence accumulation rates throughout learning and more cautious responding in later learning sessions. These results indicate that better working memory is not a guarantee of enhanced speech category learning, but it may reflect quicker and more efficient initial learning, which may be less effortful for a learner. Similarly, lower working memory does not doom a learner to poor performance but may instead be linked to higher risk of task disengagement and slower initial learning.
Multinomial probit (MNP) models are fundamental and widely applied regression models for categorical data. [1] proved that the class of unified skew-normal distributions is conjugate to several MNP sampling models, a result that enables the development of Monte Carlo samplers and accurate variational methods for Bayesian inference. In this paper, we adapt the above-mentioned results to a popular special case: the discrete-choice MNP model under zero-mean and independent Gaussian priors. This allows us to obtain simplified expressions for the parameters of the posterior distribution and an alternative derivation of the variational algorithm, which offers new insight into the fundamental results in [1] as well as computational advantages in our special setting.
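Schematically (in our notation, not necessarily the paper's), the discrete-choice MNP model and the reason the SUN conjugacy applies can be written as follows.

    % Latent Gaussian utilities with an argmax observation rule:
    \[
      U_{il} = x_{il}^\top \beta + \varepsilon_{il}, \qquad
      (\varepsilon_{i1},\dots,\varepsilon_{iL})^\top \sim \mathrm{N}_L(0,\Sigma), \qquad
      y_i = \arg\max_{l} U_{il},
    \]
    % so each likelihood contribution is a multivariate Gaussian cdf of the
    % utility differences against the chosen class,
    \[
      \Pr(y_i = l \mid \beta) = \Phi_{L-1}\!\big(\bar{X}_{il}\,\beta;\; \bar{\Sigma}_l\big),
    \]
    % where \bar{X}_{il} stacks the covariate differences x_{il} - x_{ik},
    % k \ne l, and \bar{\Sigma}_l is the covariance of the error differences.
    % With a zero-mean Gaussian prior \beta \sim \mathrm{N}_p(0, \nu^2 I_p),
    % the posterior kernel is a Gaussian density times a product of Gaussian
    % cdfs of linear functions of \beta, i.e., a unified skew-normal kernel.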
Recently, [1] provided closed-form expressions for the filtering, predictive and smoothing distributions of multivariate dynamic probit models, leveraging properties of the unified skew-normal distribution. These results make it possible to draw independent and identically distributed samples from such distributions, and to devise sequential Monte Carlo procedures for the filtering and predictive distributions, thereby overcoming computational bottlenecks that may arise for large sample sizes. In this paper, we briefly review the above-mentioned closed-form expressions, focusing mainly on the smoothing distribution of the univariate dynamic probit. We develop a variational Bayes approach, extending the partially factorized mean-field variational approximation introduced by [2] for the static binary probit model to the dynamic setting. Results are shown for a financial application.
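As a point of reference for the variational approach, the following is a minimal Python sketch of the classical fully factorized mean-field CAVI for the static probit, based on the well-known truncated-normal augmentation; this is the baseline that the partially factorized approximation of [2] improves, not that scheme itself, and the function name and defaults are illustrative.

    import numpy as np
    from scipy.stats import norm

    def probit_cavi(X, y, nu2, n_iter=100):
        # Augmentation: z_i = x_i'beta + e_i, e_i ~ N(0, 1), y_i = 1{z_i > 0}.
        # Fully factorized mean field: q(beta, z) = q(beta) * prod_i q(z_i).
        n, p = X.shape
        V = np.linalg.inv(np.eye(p) / nu2 + X.T @ X)  # q(beta) covariance (fixed)
        m = np.zeros(p)
        sgn = 2 * y - 1                               # maps {0,1} -> {-1,+1}
        for _ in range(n_iter):
            eta = X @ m
            # Mean of q(z_i): a N(eta_i, 1) truncated to sgn_i * z_i > 0.
            Ez = eta + sgn * norm.pdf(eta) / norm.cdf(sgn * eta)
            m = V @ (X.T @ Ez)                        # update q(beta) mean
        return m, V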