2010
DOI: 10.1007/s12530-010-9021-y
Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach

Abstract: Difficulties of learning from a nonstationary data stream are generally twofold. First, a dynamically structured learning framework is required to catch up with the evolution of unstable class concepts, i.e., concept drifts. Second, the imbalanced class distribution over the data stream demands a mechanism to intensify the underrepresented class concepts for improved overall performance. To alleviate the challenges brought by these issues, we propose the recursive ensemble approach (REA) in this paper. To battle against t…

Cited by 121 publications (61 citation statements)
References 32 publications
“…Chen and He propose the REA framework [12] for learning from nonstationary imbalanced data streams, which addresses a flaw of SERA by adopting a k-nearest-neighbors approach to estimate the similarity degree: each previously seen minority class example counts the number of minority class examples among its k nearest neighbors in the current training data chunk. REA retains all hypotheses built on the training data chunks over time and weighs them based on their classification performance on the current training data chunk.…”
Section: State of Art on Ensemble Based Classification on Nonstat…
Citation type: mentioning, confidence: 99%
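The selection-and-weighting mechanism this statement attributes to REA can be sketched as follows. This is a minimal Python illustration under stated assumptions, not REA's actual implementation: the function names (select_minority_examples, weight_hypotheses), the scikit-learn estimators, and the accuracy-based weights are all layered onto the quoted description.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_minority_examples(prev_minority_X, chunk_X, chunk_y,
                             minority_label=1, k=5, n_select=50):
    """Rank previously retained minority examples by how many
    minority examples of the current chunk fall among their k
    nearest neighbors, keeping the top-ranked ones (a sketch of
    the k-NN similarity step described above)."""
    nn = NearestNeighbors(n_neighbors=k).fit(chunk_X)
    _, idx = nn.kneighbors(prev_minority_X)   # (n_prev, k) neighbor indices
    similarity = (chunk_y[idx] == minority_label).sum(axis=1)
    order = np.argsort(similarity)[::-1]      # most similar first
    return prev_minority_X[order[:n_select]]

def weight_hypotheses(hypotheses, chunk_X, chunk_y):
    """Weigh every retained hypothesis by its accuracy on the
    current chunk; one plausible reading of 'classification
    performance', though REA's exact weighting may differ."""
    return np.array([h.score(chunk_X, chunk_y) for h in hypotheses])
```

In an ensemble of this kind, the resulting weights would typically feed a weighted vote over the retained hypotheses when predicting on the next chunk.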
“…The diversity term can be characterized as being 'good' or 'bad' [26], a result that has a corresponding observation in EC [187]. [Table: constructing a new ensemble member (diversity of base learner [65,102]; sample stream data using boosting versus bagging [147,150]); identifying an ensemble member for replacement (age-based heuristics [167]; performance-based heuristics [107,171]); class imbalance (effect of sampling biases [34,55,86,186]); drift management (incremental updating of current models [107,158]; adapting voting weights [2,90,152]); shift management (outright replacement of one or more ensemble members [68,81,167]); diversity management (impact on capacity for change [26,137,165]).] Under non-stationary data, it has been established that reducing the absolute value for the ensemble margin produces an equivalent increase in diversity [165]. A second open question is in regard to the method assumed for combining the outcome from multiple models under a non-stationary task.…”
Section: Ensemble ML Perspective
Citation type: mentioning, confidence: 99%
“…That is to say, the distribution of classes represented in the batch can be artificially balanced, with less frequently occurring classes relying on historical samples, whereas the most frequently occurring classes assume the most recent samples (e.g., [82]). Various schemes have been proposed for prioritizing retention of minority-class exemplars within the batch used to construct classifiers, with k-NN algorithms frequently appearing for this purpose [34,86]. Conversely, Ditzler and Polikar [55] propose to employ oversampling or data rebalancing with ensemble methods, but not without incurring computational overheads that might limit the applicability to streaming data.…”
Section: Class Imbalance
Citation type: mentioning, confidence: 99%
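A batch-rebalancing scheme of the kind summarized here can be sketched as below. It is a minimal Python illustration, not any cited paper's method: rebalance_batch, target_ratio, and the uniform draw from the retained minority pool are hypothetical, and the works cited above prioritize the pool via k-NN rather than at random.

```python
import numpy as np

def rebalance_batch(chunk_X, chunk_y, minority_pool,
                    minority_label=1, target_ratio=0.5, rng=None):
    """Top up the minority class of the current batch with retained
    historical minority samples until it makes up target_ratio of
    the batch (illustrative only; see the caveats above)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_min = int((chunk_y == minority_label).sum())
    n_maj = len(chunk_y) - n_min
    # Minority count needed so that minority / (minority + majority)
    # reaches target_ratio, minus what the batch already contains.
    n_needed = int(target_ratio * n_maj / (1.0 - target_ratio)) - n_min
    n_take = min(max(n_needed, 0), len(minority_pool))
    take = rng.choice(len(minority_pool), size=n_take, replace=False)
    X_bal = np.vstack([chunk_X, minority_pool[take]])
    y_bal = np.concatenate([chunk_y, np.full(n_take, minority_label)])
    return X_bal, y_bal
```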
“…Chapter 6 presents RLS-SSL (RLS Semi-Supervised Learning), which uses S4VM (Chen & He, 2011) as its semi-supervised learning method in conjunction with our partially labeled method to improve performance.…”
Section: Thesis Contributions
Citation type: mentioning, confidence: 99%
“…Hence, Chen and He (2011) proposed the Recursive Ensemble Approach (REA), which uses k-nearest neighbors as the similarity measure to address this issue.…”
Section: Supervised-Fully Labeled Data
Citation type: mentioning, confidence: 99%