Łukasz Rajkowski scite author profile

Łukasz Rajkowski

2 Publications

13 Citation Statements Received

13 Citation Statements Given

Publications

Analysis of the Maximal a Posteriori Partition in the Gaussian Dirichlet Process Mixture Model

Rajkowski¹

2019

Bayesian Anal.

Mixture models are a natural choice in many applications, but it can be difficult to place an a priori upper bound on the number of components. To circumvent this, investigators are turning increasingly to Dirichlet process mixture models (DPMMs). It is therefore important to develop an understanding of the strengths and weaknesses of this approach. This work considers the MAP (maximum a posteriori) clustering for the Gaussian DPMM (where the cluster means have Gaussian distribution and, for each cluster, the observations within the cluster have Gaussian distribution). Some desirable properties of the MAP partition are proved: 'almost disjointness' of the convex hulls of clusters (they may have at most one point in common) and (with natural assumptions) the comparability of sizes of those clusters that intersect any fixed ball with the number of observations (as the latter goes to infinity). Consequently, the number of such clusters remains bounded. Furthermore, if the data arises from independent identically distributed sampling from a given distribution with bounded support then the asymptotic MAP partition of the observation space maximises a function which has a straightforward expression, which depends only on the within-group covariance parameter. As the operator norm of this covariance parameter decreases, the number of clusters in the MAP partition becomes arbitrarily large, which may lead to the overestimation of the number of mixture components.

Restriction of this partition to sets $[n]$ for $n \in \mathbb{N}$ is called the Chinese Restaurant Process.

Definition. The Chinese Restaurant Process with parameter $\alpha$ is a sequence of random partitions $(\mathcal{J}_n)_{n \in \mathbb{N}}$, where $\mathcal{J}_n$ is a partition of $[n] = \{1, 2, \dots, n\}$, that satisfies

$$
\mathcal{J}_{n+1} \mid \mathcal{J}_n = \{J_1, \dots, J_k\} \sim
\begin{cases}
\{J_1, \dots, J_i \cup \{n+1\}, \dots, J_k\} & \text{with probability } \dfrac{|J_i|}{n+\alpha}, \\[1ex]
\{J_1, \dots, J_k, \{n+1\}\} & \text{with probability } \dfrac{\alpha}{n+\alpha}.
\end{cases}
$$

We write $\mathcal{J}_n \sim \mathrm{CRP}(\alpha)_n$.
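The sequential rule in the definition of the Chinese Restaurant Process can be illustrated with a short sampler. This is a minimal sketch rather than code from the paper: the function name `sample_crp` and its `seed` argument are hypothetical, and it assumes `alpha > 0`.

```python
import random


def sample_crp(n, alpha, seed=None):
    """Draw a partition of [n] = {1, ..., n} by the sequential CRP rule.

    Hypothetical helper for illustration only; assumes alpha > 0.
    """
    rng = random.Random(seed)
    clusters = []  # each cluster is a list of the integers assigned to it
    for new_item in range(1, n + 1):
        seated = new_item - 1  # number of items already placed
        # Total weight is seated + alpha: an existing cluster J_i has
        # weight |J_i|, and opening a new cluster has weight alpha.
        u = rng.random() * (seated + alpha)
        cumulative = 0.0
        for cluster in clusters:
            cumulative += len(cluster)
            if u < cumulative:
                cluster.append(new_item)
                break
        else:
            # No existing cluster was chosen, so start a new one.
            clusters.append([new_item])
    return clusters


partition = sample_crp(10, alpha=1.0, seed=0)
print(partition)
```

Larger values of `alpha` make new clusters more likely at every step, so the sampled partitions tend to have more blocks.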
The Dirichlet Process mixture model for n observations is therefore equivalent to […]. We will refer to this formulation as the CRP-based model. In this paper we focus our attention on the Gaussian case, in which $\Theta = \mathbb{R}^d$, $X = \mathbb{R}^d$, $\mathcal{F}$ and $\mathcal{B}$ are σ-fields of Borel sets, $G_0 = \mathcal{N}(\mu, T)$ and $F_\theta = \mathcal{N}(\theta, \Sigma)$ for $\theta \in \Theta$, where $\mu \in \mathbb{R}^d$ and $T, \Sigma \in \mathbb{R}^{d \times d}$ are the parameters of the model. This will be called the CRP-based Gaussian model. We also limit ourselves to the case where $\mu = 0$; however, it may be easily seen that this is not a real restriction: sampling from the zero-mean Gaussian model and translating the output by the vector $\mu$ is equivalent to sampling from the Gaussian model with mean $\mu$. Therefore all the clustering properties of the model can be investigated with the assumption that $\mu = 0$.

[…] where $F^{G_0}_J$ is a probability distribution on $X^{|J|}$ with the density […] with respect to the product measure $\nu^{|J|}$. We now compute the exact formula for $f^{G_0}_J$ when $G_0 = \mathcal{N}(0, T)$ and $F_\theta = \mathcal{N}(\theta, \Sigma)$.
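The excerpt stops before the formula itself. As a hedged sketch of what such a computation involves, assume that $f^{G_0}_J$ denotes the marginal density of the observations in a cluster $J$ with the cluster mean integrated out (the usual reading, but an assumption here, since the defining equation is missing from the excerpt), and that $\Sigma$ and $T$ are positive definite. Writing $m = |J|$ and $\varphi(\,\cdot\,; a, C)$ for the $\mathcal{N}(a, C)$ density, this gives

$$
f^{G_0}_J(x_J) = \int_{\mathbb{R}^d} \prod_{j \in J} \varphi(x_j; \theta, \Sigma)\, \varphi(\theta; 0, T)\, d\theta .
$$

Since each observation can be written as $x_j = \theta + \varepsilon_j$ with $\theta \sim \mathcal{N}(0, T)$ and independent $\varepsilon_j \sim \mathcal{N}(0, \Sigma)$, we have $\operatorname{Cov}(x_i, x_j) = T + \delta_{ij} \Sigma$, so the stacked vector $x_J \in \mathbb{R}^{md}$ is jointly Gaussian:

$$
x_J \sim \mathcal{N}\!\left(0,\; I_m \otimes \Sigma + \mathbf{1}_m \mathbf{1}_m^{\top} \otimes T\right).
$$

This is only the standard Gaussian marginalisation under the stated assumptions, not the paper's own closed-form expression.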

Analysis of the maximal posterior partition in the Dirichlet Process Gaussian Mixture Model

Rajkowski¹

2016

Preprint
