Computer Science, Communication and Instrumentation Devices 2014
DOI: 10.3850/978-981-09-5247-1_017
Methods to Avoid Over-Fitting and Under-Fitting in Supervised Machine Learning (Comparative Study)

Cited by 151 publications (105 citation statements)
References 13 publications
“…ANNs face several issues that reduce their performance or distort results. Among them are overfitting (Allamy, 2014; Zhang et al., 2018) and underfitting (Allamy, 2014), data scarcity, the need for normalization, data imbalance and outlier influence (Khamis, Ismail, Khalid, & Tarmizi Mohammed, 2005). These issues were addressed using methods such as dropout (Park & Kwak, 2017), augmentation (jitter (pure Gaussian noise) and warp (Gaussian noise on Bézier curves)) (Le Guennec, Malinowski, & Tavenard, 2016; Um et al., 2017; Velasco, Garnica, Lanchares, Botella, & Ignacio Hidalgo, 2018; Xiao & Xu, 2012), synthetic minority oversampling technique (SMOTE) (Fernández, García, Herrera, & Chawla, 2018), interquartile range (IQR) scaling (Mizera et al., 2004) and median absolute deviation (MAD) (Gorard, 2013) based Gaussian noise data completion.…”
Section: ANNs
confidence: 99%
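The jitter augmentation cited above amounts to perturbing each training series with pure Gaussian noise (the warp variant instead draws the noise along smooth Bézier curves). Here is a minimal NumPy sketch of the jitter variant; the function name and the sigma default are illustrative assumptions, not values from the cited papers:

```python
import numpy as np

def jitter(x, sigma=0.03, rng=None):
    """Jitter augmentation: add pure Gaussian noise to each sample.

    sigma is a hypothetical default; the cited papers tune it per dataset.
    """
    rng = rng or np.random.default_rng()
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)

# Usage: double a toy training set with noisy copies.
X_train = np.random.rand(100, 64)               # 100 series of length 64
X_augmented = np.vstack([X_train, jitter(X_train)])
```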
“…Over- and underfitting performance was originally considered as a measure for selecting the two best-performing ANN types, since over- and underfitted ANNs are not capable of generalizing appropriately to new data. Such networks either emulate the training data in an overly exact, ragged fashion (overfitting) or fail to react to each type of new data (underfitting) (Allamy, 2014; Zhang et al., 2018). The selection process was planned to be carried out via analysis of R² performance.…”
Section: Network Metrics
confidence: 99%
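One way to operationalize the R² selection criterion described above, sketched on toy data rather than the study's actual networks, is to compare R² on training and held-out data: a large gap between the two flags overfitting, while low scores on both flag underfitting:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Toy data standing in for the study's inputs and targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

r2_train = r2_score(y_tr, model.predict(X_tr))
r2_test = r2_score(y_te, model.predict(X_te))
# Heuristic: overfitting if r2_train >> r2_test; underfitting if both are low.
print(f"train R^2 = {r2_train:.3f}, test R^2 = {r2_test:.3f}")
```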
“…If too few epochs are used, under-fitting can occur and the solutions can be of poor quality: the model fits neither the training data nor the test data well enough. On the other hand, using too many epochs can cause over-fitting: the model fits the training data too well and thus fails to fit the test data well enough (it lacks generalization capability), which prevents good performance on the test data [52, 53]. When over-fitting occurs, the error on the training set continues to decrease with further model learning, while the error on the test set starts increasing.…”
Section: Multi-objective Evolutionary Instance Selection for Regression
confidence: 99%
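The epoch trade-off in this passage is commonly handled with early stopping: keep training while validation error falls, and stop once it has not improved for a few epochs, i.e., exactly when the over-fitting signature described above appears. A framework-agnostic sketch; `train_one_epoch` and `validation_loss` are assumed placeholder callables supplied by the caller:

```python
def train_with_early_stopping(train_one_epoch, validation_loss,
                              max_epochs=500, patience=10):
    """Stop when validation loss has not improved for `patience` epochs.

    `train_one_epoch()` runs one pass over the training data;
    `validation_loss()` returns the current loss on held-out data.
    Both are hypothetical callables supplied by the caller.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch in range(max_epochs):
        train_one_epoch()
        loss = validation_loss()
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # Validation error has stopped improving: the over-fitting
            # signature described in the quoted passage.
            break
    return best_epoch, best_loss
```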
“…Figures 4(a) and 4(b) show some outlier values in two attributes, “CLABSI: observed cases” and “patients who reported that their doctors sometimes or never communicated well,” respectively. DM models developed with outlier values yield very poor accuracy [59]. Therefore, during data preparation we excluded hospitals that have outlier values from our experimental dataset; this was done using the visual inspection method [60].…”
Section: DM for a Clinical Surveillance Program
confidence: 99%
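The quoted study removed outliers by visual inspection [60]; a common programmatic alternative (shown here as a sketch, not their method) is Tukey's interquartile-range rule:

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Usage: drop records whose attribute value is an outlier (toy data).
observed_cases = np.array([2.0, 3.1, 2.7, 2.9, 15.0, 3.3])
clean = observed_cases[~iqr_outlier_mask(observed_cases)]
```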
“…A brief description of these DM algorithms is given in Table 1. While developing our models, we took special care to avoid overfitting [59]. It is important to note that the model-building process was iterative.…”
Section: DM for a Clinical Surveillance Program
confidence: 99%
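A standard safeguard of the kind alluded to here is k-fold cross-validation, which scores a model only on folds it was not trained on, so an overfitted model cannot hide behind its training accuracy. A sketch on synthetic data; the decision tree and depth values are illustrative assumptions, not the study's DM algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the hospital dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Compare an unconstrained (overfitting-prone) tree with a depth-limited one.
for depth in (None, 3):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"max_depth={depth}: mean CV accuracy = {scores.mean():.3f}")
```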