Mining with Noise Knowledge: Error Aware Data Mining

Pradhan, S.; Singh, Rajveer; Kachru, Komal; Narasimhamurthy, S. K.

doi:10.1109/cis.2007.7

Cited by 22 publications

References 27 publications

“…The issue of data quality or veracity has been considered by a number of researchers [39], including data complexity [9], missing values [19], noise [58], imbalance [13], and dataset shift [39]. The latter, dataset shift, is most profound in the case of big data as the unseen data may present a distribution that is not seen in the training data.…”

Section: Data Mining/science With Big Data (mentioning)

confidence: 99%

Big Data Opportunities and Challenges: Discussions from Data Analytics Perspectives [Discussion Forum]

Zhou

Chawla

Jin

et al. 2014

IEEE Comput. Intell. Mag.

237

100

Abstract: "Big Data" as a term has been among the biggest trends of the last three years, leading to an upsurge of research, as well as industry and government applications. Data is deemed a powerful raw material that can impact multidisciplinary research endeavors as well as government and business performance. The goal of this discussion paper is to share the data analytics opinions and perspectives of the authors relating to the new opportunities and challenges brought forth by the big data movement. The authors bring together diverse perspectives, coming from different geographical locations with different core research expertise and different affiliations and work experiences. The aim of this paper is to evoke discussion rather than to provide a comprehensive survey of big data research.

“…Generally, noisy data in the classification problems could be organized in three groups [10][11][12][13][14]. i) Data that their corresponding labels include noise (paradoxical labeling error for a data point or misclassifications errors .…”

Section: Introduction (mentioning)

confidence: 99%

A New Fuzzy Membership Assignment Approach For Fuzzy Svm Based On Adaptive Pso In Classification Problems

Almasi

Gooqeri

Asl

et al. 2015

J. Math. Computer Sci.

Noises will confuse Support Vector Machine (SVM) in the training phase. To overcome this problem, SVM was extended to Fuzzy SVM (FSVM) by incorporating an appropriate fuzzy membership to each data point. Thus, how to choose a proper fuzzy membership is of paramount importance in FSVM. In this paper, Adaptive Particle Swarm Optimization (APSO) method minimizes the generalization error by changing the attributes values of positive and negative class centers to make them free of attribute-noise. As the APSO converged, the fuzzy memberships are assigned for each training data points based on their distance to the corresponding purified class centers with the same class-label. To demonstrate the effectiveness of the proposed FSVM, its performance on artificial and real-world data sets is compared with three FSVM algorithms in the literature.

“…It is expected that the whole process starts with raw data and finishes with the extracted knowledge. Because of its data-driven nature, previous research efforts have concluded that data mining results crucially rely on the quality of the underlying data, and for most of the data mining applications, the process of data collection, data preparation, and data enhancement cost the majority of the project budget and also the developing time circle [18].…”

Section: Study Of the Certainty In The Training Samples (mentioning)

confidence: 99%