2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)
DOI: 10.1109/icmla.2015.216

Source-Aware Partitioning for Robust Cross-Validation

Abstract: One of the most critical components of engineering a machine learning algorithm for a live application is robust performance assessment prior to its implementation. Cross-validation is used to forecast a specific algorithm's classification or prediction accuracy on new input data given a finite dataset for training and testing the algorithm. The two most well-known cross-validation techniques, random subsampling (RSS) and K-fold, are used to generalize the assessment results of machine learning algorithms in a non-ex…
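The two schemes named in the abstract can be sketched side by side. The snippet below is a minimal illustration, assuming scikit-learn and a synthetic placeholder dataset and model; it is not code from the paper.

```python
# Minimal sketch (not the paper's code) contrasting the two cross-validation
# schemes named in the abstract: random subsampling (RSS) and K-fold.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, ShuffleSplit, cross_val_score

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# K-fold: every sample appears in exactly one validation fold.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_scores = cross_val_score(model, X, y, cv=kfold)

# Random subsampling: each iteration draws a fresh random test subset,
# so validation subsets may overlap across iterations.
rss = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
rss_scores = cross_val_score(model, X, y, cv=rss)

print("K-fold accuracy: %.3f +/- %.3f" % (kfold_scores.mean(), kfold_scores.std()))
print("RSS accuracy:    %.3f +/- %.3f" % (rss_scores.mean(), rss_scores.std()))
```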


Cited by 3 publications (3 citation statements). References 11 publications.
“…Hence the technique of cross validation was introduced which will help in determining the important parameters of an algorithm without using testing data. [13] Dashboards have been used from time to time in order to create interactive visualizations which help in analyzing data better. The process of decision making becomes data-driven when stakeholders can visualize data.…”
Section: Literature Review (mentioning)
confidence: 99%
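As a minimal sketch of the point quoted above, assuming scikit-learn and illustrative parameter values, cross-validation on the training portion can select an algorithm's parameters while the held-out test set stays untouched until the final evaluation:

```python
# Hypothetical example: parameter selection by cross-validation only,
# with the test split reserved for the final accuracy estimate.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 5-fold cross-validation on the training set alone chooses C and gamma.
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=5)
search.fit(X_train, y_train)

# The test set is used once, after parameter selection.
print("chosen parameters:", search.best_params_)
print("held-out accuracy: %.3f" % search.score(X_test, y_test))
```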
“…However, this strategy has a limitation in that some samples may never be selected for validation while others may be selected repeatedly, leading to overlapping validation subsets [163]. But with a significantly large number of iterations, RSS is likely to achieve results comparable to k-fold validation [164].…”
Section: Age Estimation Evaluation Protocols (mentioning)
confidence: 99%
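The overlap limitation described in the statement above can be made concrete with a small sketch, assuming scikit-learn's ShuffleSplit as a stand-in for random subsampling; the sample counts and iteration counts are illustrative only.

```python
# Illustrative sketch (not from the cited papers): across a small number of
# random subsampling iterations, some samples may never land in a validation
# subset while others are drawn repeatedly.
import numpy as np
from sklearn.model_selection import ShuffleSplit

n_samples, n_iterations = 100, 10
counts = np.zeros(n_samples, dtype=int)

rss = ShuffleSplit(n_splits=n_iterations, test_size=0.2, random_state=0)
for _, val_idx in rss.split(np.zeros((n_samples, 1))):
    counts[val_idx] += 1          # tally validation appearances per sample

print("samples never validated:", int((counts == 0).sum()))
print("max validations for one sample:", int(counts.max()))
```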
“…The testing subset is used to validate the trained model using data samples not initially in validation and training subsets. Kilinc and Uysal [164] proposed a technique of splitting the dataset with samples from specific subjects rotationally left out of training and validation sets. Budka and Gabrys [158] proposed a density-preserving sampling (DPS) technique that eliminates the need for repeating error estimation procedures by dividing the dataset into subsets that are guaranteed to be representative of the population the dataset is drawn from.…”
Section: Age Estimation Evaluation Protocols (mentioning)
confidence: 99%
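The subject-rotation idea attributed to the paper in the statement above can be approximated with grouped splitting. The sketch below uses scikit-learn's LeaveOneGroupOut on synthetic data and is only an analogy, not the authors' source-aware partitioning code.

```python
# Hypothetical analogy: samples from one subject (source) at a time are held
# out as a whole group, so no subject appears in both training and validation.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = rng.integers(0, 2, size=30)
subjects = np.repeat(np.arange(5), 6)   # 5 subjects, 6 samples each

logo = LeaveOneGroupOut()
for fold, (train_idx, val_idx) in enumerate(logo.split(X, y, groups=subjects)):
    held_out = np.unique(subjects[val_idx])
    print("fold %d holds out subject(s): %s" % (fold, held_out))
```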