Analysis of the Maximal a Posteriori Partition in the Gaussian Dirichlet Process Mixture Model

Rajkowski, Łukasz

doi:10.1214/18-ba1114

Cited by 3 publications

(13 citation statements)

References 13 publications


“…In Rajkowski [2018] it is proved that in the Normal Bayesian Mixture Model with Normal distribution on the component mean and fixed covariance matrix, when the prior on the space of partitions is the Chinese Restaurant Process, the convex hulls of the clusters in the MAP partition are disjoint. Equivalently, the MAP is linearly separable.…”

Section: Results (mentioning)

confidence: 99%

“…In Rajkowski [2018] the linear separability of the MAP partition is crucial for establishing the existence of 'limits' of the MAP partitions when the prior on partitions is the Chinese Restaurant Process and the data is independently and identically distributed with some 'input distribution'. The limit is related to the partitions of the observation space which maximise a given functional ∆ (which depends only on the hyperparameter Σ0 and the input distribution).…”

Section: Discussion Of Potential Applications (mentioning)

confidence: 99%

“…If we assume that the component parameters in the Normal Bayesian Mixture Model are distributed by (2.24) then we assume that the covariance matrix in each component is equal to Σ0 which is known to us. The results of Rajkowski [2018] imply that the misspecification of this hyperparameter may lead to serious inference issues regarding the number of clusters, at least as far as the MAP partition is concerned. In the light of these findings, (2.18) seems to be a safer choice of the prior for the component parameters.…”

Section: Comparison Of the Models (mentioning)

confidence: 99%

“…It only allows a 1-parameter variation of the covariance function, but no restrictions are imposed on the within-group means, unlike the Normal-inverse-Wishart prior. At the same time, allowing the component covariance matrix to scale between clusters can be a remedy to the drawbacks of the fixed covariance prior that were pointed out in Rajkowski [2018].…”

Section: Comparison Of the Models (mentioning)

confidence: 99%

“…In Rajkowski [2018] it is proved that in the Normal Bayesian Mixture Model, when the component covariance matrix is fixed and the prior on the component mean is Normal, the MAP partition is convex, i.e. the convex hulls of clusters are disjoint.…”

Section: Introduction (mentioning)

confidence: 99%


A note on the geometry of the MAP partition in Conjugate Exponential Bayesian Mixture Models

Rajkowski, Noble

2019

Preprint

Self Cite


We investigate the geometry of the maximal a posteriori (MAP) partition in the Bayesian Mixture Model where the component and the base distributions are chosen from conjugate exponential families. We prove that in this case the clusters are separated by the contour lines of a linear functional of the sufficient statistic. As a particular example, we describe Bayesian Mixture of Normals with Normal-inverse-Wishart prior on the component mean and covariance, in which the clusters in any MAP partition are separated by a quadratic surface. In connection with results of Rajkowski [2018], where the linear separability of clusters in the Bayesian Mixture Model with a fixed component covariance matrix was proved, it gives a nice Bayesian analogue of the geometric properties of Fisher Discriminant Analysis (LDA and QDA).


Bayesian cluster analysis

Wade

2023

Phil. Trans. R. Soc. A.


Bayesian cluster analysis offers substantial benefits over algorithmic approaches by providing not only point estimates but also uncertainty in the clustering structure and patterns within each cluster. An overview of Bayesian cluster analysis is provided, including both model-based and loss-based approaches, along with a discussion on the importance of the kernel or loss selected and prior specification. Advantages are demonstrated in an application to cluster cells and discover latent cell types in single-cell RNA sequencing data to study embryonic cellular development. Lastly, we focus on the ongoing debate between finite and infinite mixtures in a model-based approach and robustness to model misspecification. While much of the debate and asymptotic theory focuses on the marginal posterior of the number of clusters, we empirically show that quite a different behaviour is obtained when estimating the full clustering structure. This article is part of the theme issue ‘Bayesian inference: challenges, perspectives, and prospects’.


Analysis of the Maximal a Posteriori Partition in the Gaussian Dirichlet Process Mixture Model

Rajkowski

2019

Bayesian Anal.

Self Cite


Mixture models are a natural choice in many applications, but it can be difficult to place an a priori upper bound on the number of components. To circumvent this, investigators are turning increasingly to Dirichlet process mixture models (DPMMs). It is therefore important to develop an understanding of the strengths and weaknesses of this approach. This work considers the MAP (maximum a posteriori) clustering for the Gaussian DPMM (where the cluster means have Gaussian distribution and, for each cluster, the observations within the cluster have Gaussian distribution). Some desirable properties of the MAP partition are proved: 'almost disjointness' of the convex hulls of clusters (they may have at most one point in common) and (with natural assumptions) the comparability of sizes of those clusters that intersect any fixed ball with the number of observations (as the latter goes to infinity). Consequently, the number of such clusters remains bounded. Furthermore, if the data arises from independent identically distributed sampling from a given distribution with bounded support then the asymptotic MAP partition of the observation space maximises a function which has a straightforward expression, which depends only on the within-group covariance parameter. As the operator norm of this covariance parameter decreases, the number of clusters in the MAP partition becomes arbitrarily large, which may lead to the overestimation of the number of mixture components.

Restriction of this partition to sets $[n]$ for $n \in \mathbb{N}$ is called the Chinese Restaurant Process.

Definition. The Chinese Restaurant Process with parameter $\alpha$ is a sequence of random partitions $(\mathcal{J}_n)_{n \in \mathbb{N}}$, where $\mathcal{J}_n$ is a partition of $[n] = \{1, 2, \dots, n\}$, that satisfies

$$
\mathcal{J}_{n+1} \mid \mathcal{J}_n = \{J_1, \dots, J_k\} \;\sim\;
\begin{cases}
\{J_1, \dots, J_i \cup \{n+1\}, \dots, J_k\} & \text{with probability } \dfrac{|J_i|}{n+\alpha}, \quad i = 1, \dots, k, \\[6pt]
\{J_1, \dots, J_k, \{n+1\}\} & \text{with probability } \dfrac{\alpha}{n+\alpha}.
\end{cases}
$$

We write $\mathcal{J}_n \sim \mathrm{CRP}(\alpha)_n$.

The Dirichlet Process mixture model for $n$ observations is therefore equivalent to […] We will refer to this formulation as the CRP-based model. In this paper we focus our attention on the Gaussian case, in which $\Theta = \mathbb{R}^d$, $\mathcal{X} = \mathbb{R}^d$, $\mathcal{F}$ and $\mathcal{B}$ are σ-fields of Borel sets, $G_0 = N(\mu, T)$ and $F_\theta = N(\theta, \Sigma)$ for $\theta \in \Theta$, where $\mu \in \mathbb{R}^d$ and $T, \Sigma \in \mathbb{R}^{d \times d}$ are the parameters of the model. This will be called the CRP-based Gaussian model. We also limit ourselves to the case where $\mu = 0$; this is not a real restriction, since sampling from the zero-mean Gaussian model and translating the output by the vector $\mu$ is equivalent to sampling from the Gaussian model with mean $\mu$. Therefore all the clustering properties of the model can be investigated with the assumption that $\mu = 0$.

[…] where $F^{G_0}_J$ is a probability distribution on $\mathcal{X}^{|J|}$ with the density […] with respect to the product measure $\nu^{|J|}$. We now compute the exact formula for $f^{G_0}_J$ when $G_0 = N(0, T)$ and $F_\theta = N(\theta, \Sigma)$.
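The sequential rule in the definition of the Chinese Restaurant Process can be illustrated with a short simulation. The sketch below is not taken from the paper: the function names `sample_crp` and `sample_gaussian_crp_1d`, the restriction to one dimension (d = 1, µ = 0), and the parameter values are assumptions made for illustration only. It draws a partition of [n] by adding index n + 1 to an existing cluster J_i with probability |J_i|/(n + α) or to a new cluster with probability α/(n + α), then draws one mean per cluster and the observations around it.

```python
import random


def sample_crp(n, alpha, rng):
    """Sample a partition of {1, ..., n} from CRP(alpha) by sequential assignment."""
    clusters = []  # each cluster is a list of indices
    for m in range(n):  # m indices are already placed; the new index is m + 1
        # Unnormalised weights: |J_i| for each existing cluster, alpha for a new one.
        # Dividing by (m + alpha) gives the probabilities in the definition.
        weights = [len(c) for c in clusters] + [alpha]
        i = rng.choices(range(len(weights)), weights=weights)[0]
        if i == len(clusters):
            clusters.append([m + 1])
        else:
            clusters[i].append(m + 1)
    return clusters


def sample_gaussian_crp_1d(n, alpha, tau, sigma, rng):
    """Hypothetical 1-D illustration of the CRP-based Gaussian model with mu = 0.

    Cluster means are drawn from N(0, tau^2); observations in a cluster
    with mean theta are drawn from N(theta, sigma^2).
    """
    partition = sample_crp(n, alpha, rng)
    data = {}
    for cluster in partition:
        theta = rng.gauss(0.0, tau)
        for j in cluster:
            data[j] = rng.gauss(theta, sigma)
    return partition, data


rng = random.Random(0)  # fixed seed so the run is repeatable
partition, data = sample_gaussian_crp_1d(n=20, alpha=1.0, tau=3.0, sigma=0.5, rng=rng)
print(partition)
```

Adding a constant µ to every value in `data` would give a draw from the corresponding model with prior mean µ, which is the sense in which the assumption µ = 0 loses no generality.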
