2008
DOI: 10.1214/08-ba304

How many clusters?

Cited by 68 publications (58 citation statements)
References 22 publications
“…See [1] for a relatively recent overview of this literature. Well-behaved mathematically tractable models of random partitions are of interest to probabilists as well as statisticians and scientists; see [10], [12], [13], and [15]. Ewens [10] first introduced the Ewens' sampling formula in the context of theoretical population biology.…”
Section: Introduction
confidence: 99%
“…We used 6 servers as slave machines for both of the proposed framework and Hadoop: 4 servers with 4-core 2.8 GHz CPU and 4 GB memory, and 2 servers with two of 4-core 2.53 GHz CPU and 2 GB memory. In Table 4 shows execution times of one iteration on three machine learning algorithms: K-Means [2], Dirichlet process clustering [12] and IPM perceptron [13,14]. The values are mean and standard deviation over 10 runs.…”
Section: Discussion
confidence: 99%
“…Table 4. Comparison of the parallel machine learning framework and Mahout on K-Means [2], Dirichlet process clustering [12] and IPM perceptron [13,14]. We also applied the framework in order to parallelize a learning algorithm of an acoustic model for speech recognition.…”
Section: Discussion
confidence: 99%
“…Somewhat unfortunately, this heuristic method tends to fail in actual applications when the number of dimensions increases. More formal inspiration for clusterability analysis is provided by considerations on the number of clusters (main references include [3][4][5][6]) and component overlap analysis for mixtures of normal distributions (see [7][8][9][10]). However, both approaches originally assume the underlying partition to be already determined.…”
Section: State-of-the-art
confidence: 99%