2020
DOI: 10.3390/math8050662

Improving the Accuracy of Convolutional Neural Networks by Identifying and Removing Outlier Images in Datasets Using t-SNE

Abstract: In the field of supervised machine learning, the quality of a classifier model is directly correlated with the quality of the data that is used to train the model. The presence of unwanted outliers in the data could significantly reduce the accuracy of a model or, even worse, result in a biased model leading to an inaccurate classification. Identifying the presence of outliers and eliminating them is, therefore, crucial for building good quality training datasets. Pre-processing procedures for dealing with mis…

Citations: cited by 57 publications (23 citation statements)
References: 43 publications
“…Before performing the Gaussian process regression, LASSO regression was performed to identify the key features from all extracted features of the waveform. Then principal component analysis (PCA) was performed after LASSO regression to exclude outliers in the analysed dataset, as the outliers could affect the accuracy of machine learning algorithms [23]. The linear model module from the scikit-learn package was used to perform the LASSO regression in Python.…”
Section: Methods (mentioning)
confidence: 99%
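
The pipeline described in this statement (LASSO for feature selection, then PCA to exclude outliers) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the citing authors' code: the statement does not say how outliers were identified in PCA space, so the z-score rule, the threshold, the `alpha` value, the synthetic data, and the function name `lasso_then_pca_filter` are all hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def lasso_then_pca_filter(X, y, alpha=0.1, n_components=2, z_thresh=3.0):
    """Select features with LASSO, then flag outliers in PCA space.

    Returns the indices of the selected features and a boolean mask
    that is True for the samples to keep.
    """
    X_std = StandardScaler().fit_transform(X)

    # LASSO drives the coefficients of uninformative features to zero.
    lasso = Lasso(alpha=alpha).fit(X_std, y)
    selected = np.flatnonzero(lasso.coef_)
    if selected.size == 0:
        raise ValueError("LASSO removed every feature; try a smaller alpha")

    # Project the selected features onto a few principal components.
    k = min(n_components, selected.size)
    scores = PCA(n_components=k).fit_transform(X_std[:, selected])

    # Assumed rule: drop samples lying more than z_thresh standard
    # deviations from the centre along any principal component.
    z = scores / scores.std(axis=0)
    keep = (np.abs(z) < z_thresh).all(axis=1)
    return selected, keep


# Synthetic data standing in for the extracted waveform features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[0, :2] = [25.0, -25.0]  # plant one extreme sample
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

selected, keep = lasso_then_pca_filter(X, y)
X_clean, y_clean = X[keep][:, selected], y[keep]
```

The cleaned arrays `X_clean` and `y_clean` would then be passed on to the downstream regressor. A distance-based rule (for example Mahalanobis distance on the component scores) would be an equally plausible reading of the statement.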
“…For images, it is possible to distinguish multiple descriptive features (from size and color intensity to contrast and noise). Outliers then could be found in an n-dimensional space (of n-features) [63].…”
Section: Data Anomaly Detection (mentioning)
confidence: 99%
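
As a minimal sketch of what finding outliers in an n-dimensional feature space can look like, the snippet below describes each image by a small feature vector and flags any image whose value is extreme in at least one feature. The feature choice (width, mean intensity, contrast), the sample values, the modified z-score rule based on the median absolute deviation, and the 3.5 threshold are all assumptions made for illustration; the statement itself does not prescribe a method.

```python
import statistics


def robust_outliers(features, threshold=3.5):
    """Return sorted indices of rows that are extreme in any feature.

    Uses the modified z-score 0.6745 * (x - median) / MAD, which is
    less distorted by the outliers themselves than mean and std.
    """
    flagged = set()
    for j in range(len(features[0])):
        column = [row[j] for row in features]
        med = statistics.median(column)
        mad = statistics.median([abs(v - med) for v in column])
        if mad == 0:
            continue  # feature carries no spread, nothing to flag
        for i, v in enumerate(column):
            if abs(0.6745 * (v - med) / mad) > threshold:
                flagged.add(i)
    return sorted(flagged)


# Hypothetical per-image features: (width in px, mean intensity, contrast).
images = [
    (224, 0.48, 0.21),
    (224, 0.52, 0.19),
    (224, 0.50, 0.22),
    (224, 0.47, 0.20),
    (224, 0.51, 0.23),
    (224, 0.98, 0.02),  # nearly blank image
]

outlier_indices = robust_outliers(images)
```

Checking each feature separately keeps the example short; a joint criterion over all n features at once would catch points that are unusual only in combination.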
“…Manifold learning can operate wholly apart from clustering. One such use involves the removal of outliers before the identification of images through deep learning [144]. The canonical application of manifold learning, however, typically complements clustering to summarize and visualize high-dimensional data.…”
Section: Unsupervised Machine Learning In Overview (mentioning)
confidence: 99%