2013
DOI: 10.1145/2522968.2522981

Data stream clustering

Abstract: Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering imposes several challenges to be addressed, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires the development of algorithms capable of perfor…

citations
Cited by 444 publications
(287 citation statements)
references
References 87 publications
“…In order to train the Maximum Entropy model with a very limited training dataset, we need to convert attributes that have continuous numeric values into discrete ones. There has been a lot of research done on continuous feature discretization field [27][28][29][30][31][32]. Methods for discretization are broadly classified into Supervised vs. Unsupervised, Global vs. Local, and Static vs.…”
Section: K-means Clustering
mentioning
confidence: 99%
“…Most of the conventional learning techniques assume that there is a static dataset generated by an unknown yet stationary probability distribution, which can be stored and analyzed in multiple steps. Nevertheless, none of the latter assumptions are verifiable in several streaming scenarios and the development of new learners must account for several constraints [1,2,10,21,22,30,33]:…”
Section: Learning From Data Streams
mentioning
confidence: 99%
“…Nonetheless, none of the latter assumptions can be verified in the streaming scenario and the development of algorithms must account for several constraints [2,21,33]. Firstly, instances arrive continuously over time and there is no control over the order that they arrive nor how they should be processed.…”
Section: Concept Drift
mentioning
confidence: 99%
“…Required information for forming clusters is provided by core micro-clusters and outlier micro-clusters. The major drawback is the computational cost is more [8]. D-Stream is grid based clustering method. In the online phase each data record is mapped to a grid.…”
mentioning
confidence: 99%
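
The online step described in the last statement, mapping each arriving record to a grid cell, can be sketched roughly as follows. This is a minimal illustration, not the D-Stream algorithm itself: the uniform cell width, the plain per-cell counter, and the names `grid_cell` and `update_grid` are assumptions made for the example, and the sketch leaves out everything beyond the mapping step, such as how cells are later grouped into clusters.

```python
from collections import defaultdict
import math


def grid_cell(record, cell_width):
    """Return the integer grid coordinates of the cell containing `record`.

    Assumes a uniform grid with the same (hypothetical) cell width
    along every dimension.
    """
    return tuple(math.floor(x / cell_width) for x in record)


def update_grid(grid, record, cell_width=1.0):
    """Online step: map one record to its cell and increment that cell's count."""
    cell = grid_cell(record, cell_width)
    grid[cell] += 1
    return cell


# Process a small illustrative stream one record at a time (single pass).
grid = defaultdict(int)
stream = [(0.2, 0.7), (0.9, 0.1), (1.5, 2.3), (-0.4, 0.0)]
for record in stream:
    update_grid(grid, record, cell_width=1.0)
```

Only the per-cell summary is kept, so memory grows with the number of occupied cells rather than with the number of records seen, which is what makes a grid a natural summary structure for an unbounded stream.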