2017
DOI: 10.1093/gigascience/gix020
Using and understanding cross-validation strategies. Perspectives on Saeb et al.

Abstract: This three-part review takes a detailed look at the complexities of cross-validation, fostered by the peer review of Saeb et al.’s paper entitled “The need to approximate the use-case in clinical machine learning.” It contains perspectives by reviewers and by the original authors that touch upon cross-validation: the suitability of different strategies and their interpretation.
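The perspectives collected here hinge on the difference between record-wise and subject-wise cross-validation. As a minimal sketch (hypothetical data and names, not the authors' code), a leave-one-subject-out split holds out all of one subject's records at a time, approximating the clinical use-case of predicting for previously unseen patients:

```python
# Sketch of subject-wise (leave-one-subject-out) cross-validation on
# toy data. Unlike a record-wise split, no subject's records ever
# appear in both train and test.

from collections import defaultdict

# Each record is (subject_id, feature_vector); labels omitted for brevity.
records = [("s1", [0.2]), ("s1", [0.3]), ("s2", [0.9]),
           ("s2", [0.8]), ("s3", [0.5]), ("s3", [0.4])]

def subject_wise_folds(records):
    """One fold per subject: that subject's records form the test
    set, and every other subject's records form the training set."""
    by_subject = defaultdict(list)
    for i, (sid, _) in enumerate(records):
        by_subject[sid].append(i)
    for sid, test_idx in by_subject.items():
        train_idx = [i for i in range(len(records)) if i not in test_idx]
        yield train_idx, test_idx

for train_idx, test_idx in subject_wise_folds(records):
    train_subjects = {records[i][0] for i in train_idx}
    test_subjects = {records[i][0] for i in test_idx}
    # No subject appears on both sides of the split.
    assert train_subjects.isdisjoint(test_subjects)
```

A record-wise split would instead shuffle indices freely, letting a model exploit within-subject similarity and inflate its estimated accuracy.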

Cited by 107 publications (91 citation statements)
References 16 publications
“…In both the train and test set the ridge regression was the most predictive model (post-hoc analysis will show how model prediction was affected by our sample size). The test set may be lower because cross-validation is not a panacea for overfitting as cross-validation still capitalizes on stochastic error 32 .…”
Section: Phenotypic Prediction
Confidence: 99%
“…The pdfs shown use different band‐widths, each computed from the associated delimiter placements through multiple iterations of leave‐subject‐out Monte Carlo cross‐validation (CV), utilizing a train‐test split of 90% to 10%. Leave‐subject‐out CV is an established blocked CV approach with theoretic optimality that accounts for dependencies within subject responses [XH12, SLJ∗17, RBC∗17, LVS∗17]. Peaks in the resulting pdfs highlight consistencies across participants' placed delimiters.…”
Section: Results
Confidence: 99%
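The leave-subject-out Monte Carlo CV with a 90%/10% train-test split described in this statement can be sketched as repeated random draws at the subject level, so that all of a held-out subject's responses stay together (subject names and iteration count below are illustrative, not from the cited work):

```python
# Sketch of leave-subject-out Monte Carlo cross-validation: on each
# iteration, 10% of subjects (not individual records) are drawn at
# random as the held-out test group.

import random

def monte_carlo_subject_split(subjects, test_frac=0.1, n_iter=5, seed=0):
    """Yield (train_subjects, test_subjects) pairs; each iteration
    holds out a fresh random test_frac fraction of subjects."""
    rng = random.Random(seed)
    n_test = max(1, round(test_frac * len(subjects)))
    for _ in range(n_iter):
        test = set(rng.sample(subjects, n_test))
        train = [s for s in subjects if s not in test]
        yield train, sorted(test)

subjects = [f"subj{i}" for i in range(20)]
for train, test in monte_carlo_subject_split(subjects):
    assert len(test) == 2            # 10% of 20 subjects held out
    assert not set(train) & set(test)
```

Records for the held-out subjects would then be scored by a model fit only on the training subjects, and the procedure repeated to average out the randomness of any single split.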
“…Unfortunately, in this case, we do not believe leave-one-out subject-wise cross-validation addresses identity confounding [23], although it is commonly used for this purpose with health care-related data. We believe identity confounding is best addressed by larger data sets representing more diversity, which will require significant improvement in how clinical data are currently collected.…”
Section: Discussion
Confidence: 99%
“…These recordings were chosen at random from each subject. We did not choose to use leave-one-subject-out cross validation in order to incorporate more of the data due to the concerns of within subject variation in data, which is well known in similar data sets [23]. Particularly with voice features, there tends to be large variation in features within subjects which violates the primary assumption of within subject consistency at the core of leave-one-subject-out cross validation.…”
Section: Methods
Confidence: 99%