2018
DOI: 10.1002/sta4.177

Flexible clustering of high‐dimensional data via mixtures of joint generalized hyperbolic distributions

Abstract: A mixture of joint generalized hyperbolic distributions (MJGHD) is introduced for asymmetric clustering of high-dimensional data. The MJGHD approach takes cluster-specific subspaces into account, thereby limiting the number of parameters to estimate while also facilitating visualization of results. Identifiability is discussed, and a multi-cycle expectation-conditional maximization algorithm is outlined for parameter estimation. The MJGHD approach is illustrated on two real data sets, where the Bayesian in…

citations
Cited by 12 publications
(7 citation statements)
references
References 30 publications
(45 reference statements)
“…However, if the objective is dealing with outliers, it will be better to consider the PDQ approach with the multivariate contaminated normal distribution [25] and this will be a topic of future work. Other approaches for handling cluster concentration will also be considered (e.g., [9]) as will methods that accommodate asymmetric, or skewed, clusters (e.g., [18,19,21,22,32,34]). Let f (x i ; k , k ) be the generic symmetric unimodal multivariate density function of the random variable with parameter k and location parameter k then satisfies all the three properties and it is a dissimilarity measure for k = 1, … , K.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Until recently, the component densities have typically been Gaussian distributed, and several parsimonious extensions of Gaussian mixtures for high-dimensional data have been proposed (e.g., Ghahramani and Hinton 1997; McLachlan, Peel, and Bean 2003; Bouveyron, Girard, and Schmid 2007; Murphy 2008, 2010; Baek, McLachlan, and Flack 2010; Montanari and Viroli 2011). Recently, the focus of the literature has been on mixtures of non-Gaussian distributions for high-dimensional datasets (e.g., Andrews and McNicholas 2011a,b; Steane, McNicholas, and Yada 2012; Lin, McNicholas, and Hsiu 2014; Murray, McNicholas, and Browne 2014b; Murray, Browne, and McNicholas 2014a; Lin, McLachlan, and Lee 2016; McNicholas, McNicholas, and Browne 2017; Tang, Browne, and McNicholas 2018; Kim and Browne 2019; Murray, Browne, and McNicholas 2020; Punzo, Blostein, and McNicholas 2020). Of particular interest is the generalized hyperbolic distribution (GHD), which can detect clusters with non-elliptical form because it contains skewness, concentration, and index parameters.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…A little beyond the turn of the century, work on t-mixtures burgeoned into a substantial subfield of mixture model-based classification (e.g., McLachlan et al., 2007; McNicholas, 2011a,b, 2012; Baek and McLachlan, 2011; Steane et al., 2012; Lin et al., 2014; Pesevski et al., 2018). Around the same time, work on mixtures of skewed distributions took off, including work on skew-normal mixtures (e.g., Lin, 2009), skew-t mixtures (e.g., Lin, 2010; McNicholas, 2012, 2014; Lee and McLachlan, 2013a,b; Murray et al., 2014), Laplace mixtures (e.g., Franczak et al., 2014), variance-gamma mixtures (McNicholas et al., 2017), generalized hyperbolic mixtures (Browne and McNicholas, 2015), and other non-elliptically contoured distributions (e.g., Karlis and Santourian, 2009; Murray et al., 2017; Tang et al., 2018). A thorough review of work on model-based clustering is given by McNicholas (2016b).…”
Section: Introduction
Citation type: mentioning
confidence: 99%