2015
DOI: 10.1609/aaai.v29i1.9588

Detecting and Tracking Concept Class Drift and Emergence in Non-Stationary Fast Data Streams

Abstract: As the proliferation of constant data feeds increases from social media, embedded sensors, and other sources, the capability to provide predictive concept labels to these data streams will become ever more important and lucrative. However, the dynamic, non-stationary nature and effectively infinite length of data streams pose additional challenges for stream data mining algorithms. The sparse quantity of training data also limits the use of algorithms that are heavily dependent on supervised training. To addr…

Cited by 19 publications (7 citation statements)
References 10 publications
“…This paper aims at building a classification pipeline from evolving data streams. Several different online learning, or stream learning, algorithms have been proposed in the literature to deal with evolving data streams in supervised [53,54], unsupervised [55] and semi-supervised settings [56]. The static data from AMIGOS and from Experiment I was streamed using a library for online learning: river [57].…”
Section: Online Learning and Progressive Validation
confidence: 99%
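For context, a minimal sketch of the prequential (test-then-train) loop that such an online-learning pipeline typically runs with river's public API; the Hoeffding tree model and the Phishing demo dataset below are illustrative choices, not the configuration of the cited study:

```python
# Minimal prequential (test-then-train) loop with the river library.
# Model and dataset are illustrative placeholders, not those of the cited work.
from river import datasets, metrics, tree

model = tree.HoeffdingTreeClassifier()   # incremental decision tree for streams
metric = metrics.Accuracy()              # running (progressive) accuracy

for x, y in datasets.Phishing():         # any (features, label) stream works here
    y_pred = model.predict_one(x)        # test first ...
    if y_pred is not None:               # no prediction before the first update
        metric.update(y, y_pred)
    model.learn_one(x, y)                # ... then train on the same example

print(metric)                            # prints the running accuracy
```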
“…Approaches that divide data streams into fixed-size chunks, e.g., (Parker and Khan 2015), cannot capture concept drift immediately if the chunk size is too large, or suffer from unnecessarily frequent training during stable periods if the chunk size is too small (Bifet and Gavaldà 2007). Gradual forgetting is also used in the literature, e.g., (Klinkenberg 2004), to address the infinite length problem of data streams.…”
Section: Related Work
confidence: 99%
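A sketch of the fixed-size chunk strategy this statement refers to; the chunk size and the train_model callback are placeholder assumptions, and the comment spells out the trade-off the citing authors describe:

```python
# Sketch of chunk-based stream processing: retrain whenever a fixed-size chunk fills.
# CHUNK_SIZE and train_model() are placeholders; a large chunk reacts slowly to drift,
# while a small chunk retrains needlessly during stable periods.
CHUNK_SIZE = 1000

def process_stream(stream, train_model):
    chunk = []
    model = None
    for x, y in stream:
        chunk.append((x, y))
        if len(chunk) == CHUNK_SIZE:
            model = train_model(chunk)   # rebuild the model on the latest chunk only
            chunk = []
    return model
```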
“…The infinite length problem of data streams is typically addressed by dividing the stream into fixed-size chunks, e.g., (Parker and Khan 2015), or by using gradual forgetting, e.g., (Klinkenberg 2004). Since data streams are evolving in nature, without prior knowledge of the time-scale of change, both strategies suffer from a trade-off between performance and sensitivity.…”
Section: Introduction
confidence: 99%
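To make the gradual-forgetting alternative concrete, a minimal sketch using exponentially decayed class statistics; the decay rate is an assumed illustrative parameter, not one prescribed by Klinkenberg (2004):

```python
# Gradual forgetting sketched as exponentially decayed class counts: each update
# shrinks old evidence by `decay`, so distant history fades smoothly instead of
# being dropped at chunk boundaries. The decay value is illustrative only.
from collections import defaultdict

class DecayedClassCounts:
    def __init__(self, decay=0.999):
        self.decay = decay
        self.counts = defaultdict(float)

    def update(self, label):
        for k in self.counts:
            self.counts[k] *= self.decay     # old observations lose weight
        self.counts[label] += 1.0            # the current observation gets full weight

    def majority(self):
        return max(self.counts, key=self.counts.get) if self.counts else None
```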
“…The problem of classification under Streaming Emerging New Class (SENC), which aims to maintain predictive accuracy when identifying both the novel class and the known classes in the stream, has recently been attracting increasing attention due to both its significant research challenges (Masud et al. 2011; Parker and Khan 2015; Haque, Khan, and Baron 2016) and its immense practical value (Abdallah et al. 2016). A common example in industry is topic categorization in a news stream, where a new topic may arise when a new event occurs.…”
Section: Introduction
confidence: 99%
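As a toy illustration of flagging a potential emerging class, one common heuristic is a distance-to-nearest-class-centroid test; the centroid representation and fixed threshold below are illustrative assumptions, not the specific SENC methods cited above:

```python
# Toy emerging-class check: flag an instance as a potential novel class when it lies
# far from every known class centroid. The centroid representation and the fixed
# threshold are illustrative assumptions, not the cited SENC algorithms.
import numpy as np

def is_potential_novel(x, class_centroids, threshold=3.0):
    """x: 1-D feature vector; class_centroids: dict mapping label -> centroid vector."""
    if not class_centroids:
        return True                              # nothing known yet, treat as novel
    distances = [np.linalg.norm(x - c) for c in class_centroids.values()]
    return min(distances) > threshold            # outside all known-class regions
```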