2017
DOI: 10.1016/j.jss.2016.07.005

A survey on feature drift adaptation: Definition, benchmark, challenges and future directions

Abstract: Data stream mining is a fast growing research topic due to the ubiquity of data in several real-world problems. Given their ephemeral nature, data stream sources are expected to undergo changes in data distribution, a phenomenon called concept drift. This paper focuses on one specific type of drift that has not yet been thoroughly studied, namely feature drift. Feature drift occurs whenever a subset of features becomes, or ceases to be, relevant to the learning task, thus, learners must detect and adapt to the…

Cited by 89 publications (50 citation statements)
References 48 publications
“…The Heterogeneous Ensemble with Feature drifT for Data Streams (HEFT-Stream) [23] is an online classifier that incorporates feature selection by applying the Fast Correlation Based Filter (FCBF) [26] algorithm that dynamically updates the relevant feature subsets for data streams. This is beneficial because non-stationary environments may present feature drift [23] [41]. In high-dimensional datasets, not all features are significant for training a classifier and the relevance of a feature may grow or shrink over time.…”
Section: Heterogeneous Ensembles (mentioning)
confidence: 99%
“…Recently, the works of [16] and [17] surveyed and evaluated different approaches to tackle this problem. According to these studies, incremental decision trees [18] and its variants [19] are the best performing approaches.…”
Section: Related Work (mentioning)
confidence: 99%
“…Regarding synthetic experiments, AGR represents the AGRAWAL [43] generator and AN is the Asset Negotiation generator [44]. BG1, BG2 and BG3 are synthetic generators based on binary features proposed in [45] that were recently used to synthetize feature drifts in [17]. Finally, the Random Tree Generator (RTG) was used to create more complex concepts (where the number of relevant features is bigger), while SEA [46] concepts depend on only 2 features.…”
Section: Experimental Protocol (mentioning)
confidence: 99%
“…There are different types of concept drift detection mechanisms for handling gradual or abrupt changes, blips or recurring drifts [24, 26, 27, 28, 52] that can be used to deal with changes in the market behavioral structure [53]. As opposed to stationary data distributions, where the error rate of the learning algorithm will decrease when the number of examples increases, the presence of changes affects the learning model continuously [54].…”
Section: Related Work (mentioning)
confidence: 99%