1996
DOI: 10.1145/235968.233324

BIRCH

Abstract: Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and d…

Cited by 1,767 publications (218 citation statements)
References: 5 publications
“…In order to reveal the grouping of tourists according to their preferred benefits, a twostep cluster analysis has been applied using the log-likelihood measure (Chiu et al, 2001; Zhang et al, 1996). The number of clusters have been determined using Schwarz's Bayesian Criterion (BIC).…”
Section: Results (mentioning)
confidence: 99%
“…We present an algorithm to sample the simulation data, but Monte Carlo methods [Banfield & Raftery (1993)] could be used by adding a repetitive randomness process as a way of guaranteeing representation of a beam candidate region and improvement of accuracy. Future evaluations may consider more sophisticated methods such as Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) as in Zhang et al (1996) and hierarchical clustering based on granularity as in Liang & Li (2007), which are designed for very large data sets. Further investigation should also include subspace clustering as in Kriegel et al (2009) once the large simulation datasets contain target regions that can be determined using the techniques proposed in our framework.…”
Section: Discussion (mentioning)
confidence: 99%
“…We applied a hierarchical clustering technique and employed an average distance metric to determine distances between clusters that might be merged in each step of the clustering process (Kalkstein et al 1987; Zhang et al 1996). Average distance is calculated using the formula…”
Section: Methods (mentioning)
confidence: 99%
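
The last statement above is cut off before its formula, and the citing paper's exact definition is not reproduced here. As a rough illustration only, the sketch below assumes the common "average linkage" definition: the mean Euclidean distance over all pairs of points drawn from two clusters. The function name `average_distance` and the sample points are hypothetical and do not come from the cited work.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def average_distance(cluster_a, cluster_b):
    """Mean Euclidean distance over all cross-cluster point pairs.

    Assumption: this is the standard average-linkage definition; the
    truncated formula in the citing paper may differ.
    """
    if not cluster_a or not cluster_b:
        raise ValueError("both clusters must be non-empty")
    # Sum the distance for every (point in A, point in B) pair.
    total = sum(dist(p, q) for p, q in product(cluster_a, cluster_b))
    return total / (len(cluster_a) * len(cluster_b))


if __name__ == "__main__":
    # Hypothetical 2-D clusters, for illustration only.
    a = [(0.0, 0.0), (0.0, 2.0)]
    b = [(3.0, 0.0), (3.0, 2.0)]
    print(average_distance(a, b))
```

In an agglomerative procedure of the kind the statement describes, the pair of clusters with the smallest such value would typically be the one merged at each step.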