The paper deals with the problem of determining the number of components in a mixture model. We take a Bayesian non-parametric approach and adopt a hierarchical model with a suitable non-parametric prior for the latent structure. A commonly used model for such a problem is the mixture of Dirichlet process model. Here, we replace the Dirichlet process with a more general non-parametric prior obtained from a generalized gamma process. The basic feature of this model is that it yields a partition structure for the latent variables which is of Gibbs type. This relates to the well-known (exchangeable) product partition models. Compared with the usual mixture of Dirichlet process model, the advantage of the generalization that we are examining lies in the availability of an additional parameter "σ" belonging to the interval (0,1): it is shown that this parameter greatly influences the clustering behaviour of the model. A value of "σ" that is close to 1 generates a large number of clusters, most of which are of small size. A reinforcement mechanism driven by "σ" then acts on the mass allocation by penalizing clusters of small size and favouring the few groups containing a large number of elements. These features turn out to be very useful in the context of mixture modelling. Since it is difficult to specify the reinforcement rate "a priori", it is reasonable to place a prior on "σ". Hence, the strength of the reinforcement mechanism is controlled by the data. Copyright 2007 Royal Statistical Society.
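The clustering effect of "σ" described above can be illustrated with a minimal simulation sketch. The code below uses the two-parameter Pitman–Yor urn scheme, a well-known special case of a Gibbs-type partition with discount parameter σ, rather than the generalized gamma process prior of the paper itself; the function name, the total-mass parameter `theta`, and the sample sizes are illustrative assumptions. Larger σ should yield markedly more clusters.

```python
import random

def sample_partition(n, sigma, theta, rng):
    """Sequentially assign n items to clusters under a Pitman-Yor
    (two-parameter) urn scheme, a special case of a Gibbs-type
    partition with discount parameter sigma in [0, 1).

    NOTE: illustrative sketch, not the generalized gamma process
    prior of the paper."""
    counts = []  # sizes of the clusters formed so far
    for i in range(n):
        k = len(counts)
        # An existing cluster of size c attracts weight c - sigma
        # (small clusters are penalized); a new cluster gets weight
        # theta + k * sigma, which grows with sigma.
        weights = [c - sigma for c in counts] + [theta + k * sigma]
        total = i + theta  # weights always sum to (items seated) + theta
        r = rng.uniform(0.0, total)
        acc = 0.0
        for j, w in enumerate(weights):
            acc += w
            if r <= acc:
                if j == k:
                    counts.append(1)  # open a new cluster
                else:
                    counts[j] += 1   # join an existing cluster
                break
        else:
            counts[-1] += 1  # guard against floating-point round-off
    return counts

rng = random.Random(0)
for sigma in (0.0, 0.3, 0.8):
    ks = [len(sample_partition(500, sigma, 1.0, rng)) for _ in range(20)]
    print(f"sigma={sigma}: mean number of clusters = {sum(ks) / len(ks):.1f}")
```

With σ = 0 the scheme reduces to the Dirichlet process urn (a handful of clusters for n = 500), while σ near 1 produces many clusters, most of them small, as the abstract describes.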
A. Lijoi and I. Prünster were supported in part by the Italian Ministry of University and Research (MIUR) research project "Bayesian Nonparametric Methods and Their Applications." The authors are grateful to two anonymous referees for valuable comments that have led to a substantial improvement in the presentation. Special thanks also to Eugenio Regazzini for some helpful suggestions.

In recent years the Dirichlet process prior has enjoyed great success in the context of Bayesian mixture modeling. The idea of overcoming the discreteness of its realizations by exploiting it in hierarchical models, combined with the development of suitable sampling techniques, represents one of the reasons for its popularity. In this article we propose the normalized inverse-Gaussian (N-IG) process as an alternative to the Dirichlet process for use in Bayesian hierarchical models. The N-IG prior is constructed via its finite-dimensional distributions. This prior, although sharing the discreteness property of the Dirichlet prior, is characterized by a more elaborate and sensible clustering which makes use of all the information contained in the data. Whereas in the Dirichlet case the mass assigned to each observation depends solely on the number of times that it occurred, for the N-IG prior the weight of a single observation depends heavily on the whole number of ties in the sample. Moreover, expressions corresponding to relevant statistical quantities, such as a priori moments and the predictive distributions, are as tractable as those arising from the Dirichlet process. This implies that well-established sampling schemes can be easily extended to cover hierarchical models based on the N-IG process. The mixture of N-IG process and the mixture of Dirichlet process are compared using two examples involving mixtures of normals.
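The contrast drawn above for the Dirichlet case can be made concrete with a short sketch of the standard Blackwell–MacQueen (Pólya urn) predictive rule: under a Dirichlet process prior with total mass `theta` (the parameter name is an illustrative assumption), the predictive weight of an observed value depends only on its own multiplicity, not on the rest of the tie structure in the sample.

```python
from collections import Counter

def dp_predictive_weights(sample, theta):
    """Predictive probabilities under a Dirichlet process prior
    (Blackwell-MacQueen urn): each distinct observed value gets weight
    proportional to its multiplicity alone; a new value gets weight
    proportional to the total mass theta."""
    n = len(sample)
    counts = Counter(sample)
    weights = {v: c / (theta + n) for v, c in counts.items()}
    weights["new"] = theta / (theta + n)
    return weights

# Two samples in which 'a' occurs twice, but the remaining three
# observations form very different tie configurations.
w1 = dp_predictive_weights(["a", "a", "b", "b", "b"], theta=1.0)
w2 = dp_predictive_weights(["a", "a", "b", "c", "d"], theta=1.0)
print(w1["a"], w2["a"])  # both 1/3: only the multiplicity of 'a' matters
```

Under the N-IG prior the analogous predictive weights would instead depend on the whole configuration of ties (here, on whether the other observations are all equal or all distinct), which is the more elaborate clustering behaviour the abstract refers to; its exact expressions are given in the paper and are omitted here.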