Progress in Artificial Intelligence
DOI: 10.1007/978-3-540-77002-2_33
Experiments for the Number of Clusters in K-Means

Cited by 28 publications (24 citation statements)
References 14 publications
“…If the result is 10 or greater, k + 1 clusters are preferable. Chiang and Mirkin (2007) reported experimental results supporting Hartigan's as the method that produces the most accurate number of clusters. An important issue was whether the same number of clusters was reached in each university sample.…”
Section: Discussion
confidence: 48%
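The rule quoted above — prefer k + 1 clusters when the index is 10 or greater — can be sketched in plain Python. This is a minimal illustration, not the paper's code: the index computed is Hartigan's H(k) = (n − k − 1)(W_k / W_{k+1} − 1), where W_k is the within-cluster sum of squares for k clusters, and the deterministic farthest-first seeding inside `kmeans` is an assumption of this sketch.

```python
def dist2(p, q):
    """Squared Euclidean distance between two point tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Plain K-Means. The deterministic farthest-first seeding is an
    assumption of this sketch, not the paper's procedure.
    Returns (labels, centers, W), W being the within-cluster sum of squares."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's old center
        centers = new_centers
    labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
    W = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, W

def hartigan_k(points, k_max=5, threshold=10.0):
    """Return the smallest k whose Hartigan index drops below the threshold
    of 10 quoted above; if none does, return k_max."""
    n = len(points)
    W = {k: kmeans(points, k)[2] for k in range(1, k_max + 2)}
    for k in range(1, k_max + 1):
        if W[k + 1] == 0:
            continue  # perfect fit with k + 1 clusters: index is infinite, keep going
        if (n - k - 1) * (W[k] / W[k + 1] - 1.0) < threshold:
            return k
    return k_max
```

As long as the index stays at 10 or above, the loop moves on to k + 1, mirroring the quoted rule.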
“…have a high level of dynamism, the need to specify a priori the number of clusters is an important drawback of this clustering method. For this reason, the proposed solution uses dynamic clustering [6], which classifies learners on the basis of similar learning needs and interests, without requiring an initial indication of the number of clusters. In particular, the proposed approach uses the Silhouette index [5] to estimate the optimal number of clusters in which to group the data set and the K-means algorithm [4] to cluster the data set into the optimal, previously defined partition.…”
Section: The Dynamic Clustering of Learners
confidence: 99%
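The selection scheme this excerpt describes — estimate the number of clusters with the Silhouette index, then run K-Means on that partition — can be sketched as follows. This is a hedged illustration of the standard silhouette definition, not the cited system's code; the farthest-first seeding inside `kmeans` is an assumption of the sketch.

```python
import math

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    # Deterministic farthest-first seeding -- an assumption of this sketch.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            )
        centers = new_centers
    labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
    W = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, W

def silhouette(points, labels):
    """Mean silhouette width s(i) = (b - a) / max(a, b) over all points."""
    present = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        same = [q for q, l in zip(points, labels) if l == labels[i]]
        if len(same) <= 1:
            scores.append(0.0)  # silhouette of a singleton cluster is defined as 0
            continue
        a = sum(math.dist(p, q) for q in same) / (len(same) - 1)
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / sum(1 for l in labels if l == c)
            for c in present if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def best_k_by_silhouette(points, candidates=range(2, 6)):
    """Pick the k whose K-Means partition maximises the mean silhouette."""
    return max(candidates, key=lambda k: silhouette(points, kmeans(points, k)[0]))
```

The chosen k is then used for the final K-Means run, matching the two-step pipeline the excerpt describes.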
“…The algorithm calculates a score for every candidate number of clusters in the provided range using the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), and outputs the number of clusters with the best score. There are in fact other K-Means variants that also attempt to find the right number of clusters by follow-up analysis, but it has been shown experimentally in [12] that those do not provide results as consistent as iKMeans. Evidently, the fact that iKMeans provides better results than other algorithms does not mean that it always provides good results.…”
Section: KMeans and iKMeans
confidence: 99%
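The score-and-select loop the excerpt describes can be sketched as below. The spherical-Gaussian BIC/AIC formulas used here — n·d·log of the mean squared residual plus a complexity penalty counting only the k·d centroid coordinates — are a deliberate simplification assumed for illustration; X-means and related methods use a fuller Gaussian likelihood.

```python
import math

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    # Deterministic farthest-first seeding -- an assumption of this sketch.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            )
        centers = new_centers
    labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
    W = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, W

def _fit_term(points, k):
    """n*d*log(W / (n*d)): the likelihood part of the score, up to constants."""
    _, _, W = kmeans(points, k)
    n, d = len(points), len(points[0])
    return n * d * math.log(max(W, 1e-12) / (n * d))

def bic_score(points, k):
    """Simplified BIC: fit term plus log(n) per free centroid coordinate."""
    n, d = len(points), len(points[0])
    return _fit_term(points, k) + k * d * math.log(n)

def aic_score(points, k):
    """Simplified AIC: fit term plus 2 per free centroid coordinate."""
    d = len(points[0])
    return _fit_term(points, k) + 2 * k * d

def best_k_by_bic(points, candidates=range(1, 6)):
    """Score every candidate k in the range, output the one with the best score."""
    return min(candidates, key=lambda k: bic_score(points, k))
```

Lower scores are better for both criteria; the selector simply mirrors the "score every k in the range, keep the best" procedure of the excerpt.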
“…Their clusters are determined through the distances between the entities and the only other seed, the center itself. At the end, small clusters are removed according to a threshold, and the final seeds are then used as initial seeds in the KMeans algorithm. iKMeans has also been successfully applied in a number of different comparative experiments, such as in [12]. The importance of finding good seeds has been the object of much research; algorithms such as kmeans++, introduced by [11], have already attacked this problem, but kmeans++, for instance, does not find the number of clusters, just better seeds for the given number of clusters. Xmeans, introduced by [8], is an example of an algorithm that does try to determine the right number of clusters, but it requires a range containing the true number of clusters to be provided.…”
Section: KMeans and iKMeans
confidence: 99%
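For concreteness, the kmeans++ seeding the excerpt contrasts with iKMeans can be sketched as follows — a standard rendering of the D²-weighted sampling of [11], not that paper's code: the first seed is drawn uniformly, and each later seed is drawn with probability proportional to its squared distance from the nearest seed already chosen.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two point tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_seeds(points, k, rng=None):
    """k-means++ seeding: D^2-weighted sampling of k initial centers.
    Note that, as the excerpt says, k itself must still be supplied."""
    rng = rng or random.Random(0)
    seeds = [points[rng.randrange(len(points))]]
    while len(seeds) < k:
        d2 = [min(dist2(p, s) for s in seeds) for p in points]
        r = rng.random() * sum(d2)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                seeds.append(p)
                break
        else:
            seeds.append(points[-1])  # guard against floating-point shortfall
    return seeds
```

The returned seeds are then passed to an ordinary K-Means run; this improves the starting configuration but, as the excerpt notes, does not determine the number of clusters the way iKMeans or Xmeans attempt to.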