2022
DOI: 10.1162/netn_a_00212

Feeding the machine: Challenges to reproducible predictive modeling in resting-state connectomics

Abstract: In this critical review, we examine the application of predictive models, e.g., classifiers, trained using machine learning (ML) to assist in the interpretation of functional neuroimaging data. Our primary goal is to summarize how ML is being applied and to critically assess common practices. Our review covers 250 studies published using ML and resting-state functional MRI (fMRI) to infer various dimensions of the human functional connectome. Results for hold-out (“lockbox”) performance were, on average, ~13% less accu…
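For readers unfamiliar with the “lockbox” terminology in the abstract, the sketch below contrasts internally cross-validated accuracy with performance on a held-out set that is evaluated exactly once after all modeling choices are fixed. The synthetic data, feature counts, and linear SVM are illustrative assumptions, not the review's protocol.

```python
# Minimal sketch (not the authors' pipeline): cross-validated accuracy vs.
# accuracy on a held-out "lockbox" set touched only once, at the end.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for vectorized connectome features and a binary phenotype label.
X, y = make_classification(n_samples=300, n_features=500, n_informative=20,
                           random_state=0)

# Split off the lockbox before any modeling decisions are made.
X_dev, X_lockbox, y_dev, y_lockbox = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))

# Internal cross-validation on the development set.
cv_acc = cross_val_score(model, X_dev, y_dev, cv=5).mean()

# Single evaluation on the untouched lockbox set.
model.fit(X_dev, y_dev)
lockbox_acc = model.score(X_lockbox, y_lockbox)

print(f"cross-validated accuracy: {cv_acc:.3f}")
print(f"lockbox accuracy:         {lockbox_acc:.3f}")
```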

Cited by 16 publications (19 citation statements)
References: 81 publications
“…Importantly, according to the criteria of Cohen (1988) all observed effect sizes ( r ~ .2) can be considered as small and only few out of all calculated measures reached statistical significance, when correcting for multiple comparisons. These results contribute to the current debate about the effect size to be expected in investigations on brain-behavior relations (Marek et al, 2022; Rosenberg and Finn, 2022) in demonstrating that the combination of cross-validation (Sui et al, 2020; Cwiek et al, 2022) and multimodal analyses approaches can identify robust brain-behavior relations despite sample sizes that lie clearly below thousand. We compared three forms of analyses (explanation of intelligence scores, internal cross-validation, and out-of-sample prediction: prediction of intelligence scores in a replication sample with the model constructed in the main sample) to show, in line with Cwiek et al, (2022), that cross-validation reduces the overall effect size markedly as compared to explanation ( r = .31 to r = .22 and r = .23).…”
Section: Discussion (mentioning)
confidence: 61%
“…These results contribute to the current debate about the effect size to be expected in investigations on brain-behavior relations (Marek et al, 2022; Rosenberg and Finn, 2022) in demonstrating that the combination of cross-validation (Sui et al, 2020; Cwiek et al, 2022) and multimodal analyses approaches can identify robust brain-behavior relations despite sample sizes that lie clearly below thousand. We compared three forms of analyses (explanation of intelligence scores, internal cross-validation, and out-of-sample prediction: prediction of intelligence scores in a replication sample with the model constructed in the main sample) to show, in line with Cwiek et al, (2022), that cross-validation reduces the overall effect size markedly as compared to explanation ( r = .31 to r = .22 and r = .23). That internally cross-validated effect size reflects a more realistic estimate of the ‘true’ effect size (Yarkoni and Westfall, 2017) is supported by our out-of-sample-prediction in the independent sample ( r = .23).…”
Section: Discussion (mentioning)
confidence: 61%
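The quoted comparison of explanation, internal cross-validation, and out-of-sample prediction can be made concrete with a short sketch. The ridge regression, synthetic data, and sample sizes below are assumptions for illustration only, not the cited study's analysis.

```python
# Minimal sketch (illustrative assumptions): in-sample "explanation",
# internal cross-validation, and out-of-sample prediction in an
# independent replication sample typically yield decreasing correlations.
import numpy as np
from scipy.stats import pearsonr
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict

# Stand-ins for connectivity features and an intelligence score,
# split into a main sample and an independent replication sample.
X, y = make_regression(n_samples=350, n_features=100, noise=30.0,
                       random_state=0)
X_main, X_rep = X[:200], X[200:]
y_main, y_rep = y[:200], y[200:]

model = Ridge(alpha=10.0)

# 1) Explanation: fit and evaluate on the same data (optimistic).
model.fit(X_main, y_main)
r_explained = pearsonr(y_main, model.predict(X_main))[0]

# 2) Internal cross-validation within the main sample.
y_cv = cross_val_predict(model, X_main, y_main,
                         cv=KFold(10, shuffle=True, random_state=0))
r_cv = pearsonr(y_main, y_cv)[0]

# 3) Out-of-sample prediction: main-sample model applied to the
#    replication sample.
r_oos = pearsonr(y_rep, model.predict(X_rep))[0]

print(f"explanation r = {r_explained:.2f}, "
      f"cross-validated r = {r_cv:.2f}, "
      f"out-of-sample r = {r_oos:.2f}")
```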
“…Adding flexibility, predictive algorithms built on top of these large datasets typically involve a great number of investigator decisions -the combined effects of which undermine reliability of findings [for an example in connectivity modeling see Hallquist and Hillary, 2018]. Results of machine learning models, for example, are sensitive to model specification and parameter tuning [Pineau et al, 2021Bouthillier et al, 2019Cwiek et al, 2021]. Computational approaches permit systematically combing through a great number of potential variables of interest and their statistical relationships (specifically, at scales which would be manually infeasible).…”
Section: Big Data and Computational Methods as Friend and Foe (mentioning)
confidence: 99%
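As a rough illustration of the sensitivity described in the quoted passage, the sketch below scores the same prediction problem under a handful of equally defensible analysis choices; the models and hyperparameter values are assumptions chosen for illustration, not a prescribed pipeline.

```python
# Minimal sketch (illustrative assumptions): the same data evaluated under
# several defensible model specifications can give noticeably different
# cross-validated scores, illustrating sensitivity to investigator choices.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=300, n_informative=15,
                           random_state=0)

specifications = {
    "logistic, C=1":    make_pipeline(StandardScaler(),
                                      LogisticRegression(C=1.0, max_iter=5000)),
    "logistic, C=0.01": make_pipeline(StandardScaler(),
                                      LogisticRegression(C=0.01, max_iter=5000)),
    "linear SVM, C=1":  make_pipeline(StandardScaler(),
                                      SVC(kernel="linear", C=1.0)),
    "RBF SVM, C=10":    make_pipeline(StandardScaler(),
                                      SVC(kernel="rbf", C=10.0)),
}

for name, model in specifications.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:18s} mean CV accuracy = {acc:.3f}")
```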
“…The strength of a hypothesis refers to how specific and how refutable it is ( Popper, 1963 ; see Table 1 for examples). We also argue for greater emphasis on testing and refuting strong hypotheses through a “team science” framework that allows us to address the heterogeneity in samples and/or methods that makes so many published findings tentative ( Cwiek et al, 2021 ; Bryan et al, 2021 ).…”
Section: Background and Motivation (mentioning)
confidence: 99%
“…Adding flexibility, predictive algorithms built on top of these large datasets typically involve a great number of investigator decisions – the combined effects of which undermine reliability of findings [for an example in connectivity modeling see Hallquist and Hillary, 2019 ]. Results of machine learning models, for example, are sensitive to model specification and parameter tuning ( Pineau, 2021 ; Bouthillier et al, 2019 ; Cwiek et al, 2021 ). Computational approaches permit systematically combing through a great number of potential variables of interest and their statistical relationships (specifically, at scales which would be manually infeasible).…”
Section: Background and Motivation (mentioning)
confidence: 99%