Data Warehousing and Mining 2008
DOI: 10.4018/978-1-59904-951-9.ch203

Robust Classification Based on Correlations Between Attributes

Abstract: The existence of noise in the data significantly impacts the accuracy of classification. In this article, we are concerned with the development of novel classification algorithms that can handle noise efficiently. To this end, we identify an analogy between k-nearest-neighbors (kNN) classification and user-based collaborative filtering algorithms: both find a neighborhood of similar past data and process its contents to make a prediction about new data. The recent development of item-based collabor…
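To make the analogy concrete, here is a minimal sketch (not the authors' algorithm) of a kNN classifier written in the shape of user-based collaborative filtering: the new instance plays the role of the active user, the training instances play the role of past users, and the predicted label comes from a similarity-weighted vote over the neighborhood. All function and variable names are illustrative assumptions.

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Illustrative kNN-as-user-based-CF sketch (assumed names and data layout)."""
    # Cosine similarity between the new instance and every stored instance,
    # a common similarity choice in user-based collaborative filtering.
    norms = np.linalg.norm(X_train, axis=1) * np.linalg.norm(x_new)
    sims = (X_train @ x_new) / np.where(norms == 0, 1.0, norms)

    # Neighborhood: the k most similar past instances.
    neighbors = np.argsort(sims)[-k:]

    # Similarity-weighted vote over the neighbors' class labels.
    votes = {}
    for i in neighbors:
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + sims[i]
    return max(votes, key=votes.get)
```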

Cited by 5 publications (7 citation statements)
References 5 publications
“…With INFUSE [58], users can also apply interactive feature selection tasks, here to support prediction modeling. Finding features to characterize TSEQs can often be based on work in extracting temporal features from time-series data [78], on applying metrics [92], or both. However, our task to extract features through metrics was considerably impeded by the fact that most inspiring work for time series and classical event sequences takes the value information into account, which does not exist for TSEQs.…”
Section: Visual Analysis for Data Simplification
confidence: 99%
“…Experts also pointed out gaps, outliers, periodicity, subsequence length, and dense regions as important. The awareness of their relevance also helped us to prioritize metrics identified in the literature: Related works come from the domain of statistical metrics for time-series [78], where we identified a subset of metrics, applicable for TSEQs. Summary metrics such as the number of events or minimum, maximum, and mean length of sequences form a source for features at the TSEQs granularity.…”
Section: Metrics
confidence: 99%
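The summary metrics mentioned in the quote above (number of events; minimum, maximum, and mean sequence length) can be computed directly from a collection of event sequences. The sketch below is a plain illustration under an assumed data layout, not code from the cited work.

```python
def tseq_summary_metrics(sequences):
    """Summary metrics for time-stamped event sequences (TSEQs).
    `sequences` is assumed to be a list of per-entity event lists."""
    lengths = [len(seq) for seq in sequences]
    return {
        "num_sequences": len(sequences),
        "num_events": sum(lengths),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }
```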
“…The advantages include simplicity in merely computing the support and confidence values for estimating which target label one instance should be classified into, no persistent tree structure or trained model needs to be retained except small registers for statistics, and the samples (reference) required for noise detection can scale flexibly to any amount (≤ ). One example that is inspired by [27] about a weighted PWC is shown in Figure 4.…”
Section: Contradiction Analysis
confidence: 99%
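To illustrate the support/confidence idea in the quote, the following is a minimal sketch, assuming a lazy classifier that keeps only small count registers per (attribute value, class) pair and assigns the label with the highest support-weighted confidence over the instance's attribute values. It is not the cited weighted PWC implementation; the class and method names are hypothetical.

```python
from collections import defaultdict

class SupportConfidenceClassifier:
    """Counter-based sketch: no tree or persistent trained model,
    only small registers of (attribute value, class) statistics."""

    def __init__(self):
        self.pair_counts = defaultdict(int)   # (attr index, value, label) -> count
        self.value_counts = defaultdict(int)  # (attr index, value) -> count
        self.n = 0                            # number of instances seen

    def update(self, x, label):
        # Register the statistics contributed by one labeled instance.
        self.n += 1
        for i, v in enumerate(x):
            self.pair_counts[(i, v, label)] += 1
            self.value_counts[(i, v)] += 1

    def predict(self, x, labels):
        # Score each candidate label by support * confidence, summed
        # over the instance's attribute values.
        scores = {}
        for c in labels:
            score = 0.0
            for i, v in enumerate(x):
                pair = self.pair_counts[(i, v, c)]
                supp = pair / self.n if self.n else 0.0
                conf = pair / self.value_counts[(i, v)] if self.value_counts[(i, v)] else 0.0
                score += supp * conf
            scores[c] = score
        return max(scores, key=scores.get)
```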
“…The advantages include simplicity in merely computing the supports and confidence values for estimating which target label one instance should be classified into, no persistent tree structure or trained model needs to be retained except small registers for statistics, and the samples (reference) required for noise detection can scale flexibly to any amount (≤ W ). One example that is based on [17] about a weighted PWC is shown in Figure 3.…”
Section: Our Proposed Data Stream Mining Model
confidence: 99%