2018 International Workshop on Pattern Recognition in Neuroimaging (PRNI) 2018
DOI: 10.1109/prni.2018.8423961

Controlling a confound in predictive models with a test set minimizing its effect

Abstract: Predictive models applied on brain images can extract imaging biomarkers of pathologies or psychological traits. Yet, a successful prediction may be driven by a confounding effect that is correlated with the effect of interest. For instance, fluid intelligence is strongly impacted by age; age is well predicted from brain images; hence successful prediction of fluid intelligence from brain images might have captured nothing more than a biomarker of aging. Here we introduce a nonparametric approach to control for…

Cited by 14 publications (17 citation statements)
References 21 publications
“…Another possibility is to use various resampling or reweighting methods to create a dataset where the confounding variable is not related to the outcome (Pourhoseingholi et al 2012; Rao et al 2017; Chyzhyk et al 2018). Since only a subset of available subjects is used, this leads to data loss and highly variable estimates.…”
Section: Code Availability (mentioning)
confidence: 99%
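
The resampling idea in the statement above can be illustrated with a minimal sketch. It assumes a binary outcome `y` and a continuous confound `z`, and balances the two classes within quantile bins of the confound. The function name `subsample_balanced` and its parameters are hypothetical, and this is not the procedure of any of the cited papers.

```python
import numpy as np

def subsample_balanced(y, z, n_bins=5, seed=0):
    """Return sorted indices of a subset in which the binary outcome y
    is balanced (equal class counts) within each quantile bin of z.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    z = np.asarray(z, dtype=float)
    # Interior quantile edges split z into n_bins roughly equal-sized bins.
    edges = np.quantile(z, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(z, edges)
    keep = []
    for b in np.unique(bins):
        idx0 = np.flatnonzero((bins == b) & (y == 0))
        idx1 = np.flatnonzero((bins == b) & (y == 1))
        n = min(len(idx0), len(idx1))
        if n == 0:
            continue  # a bin with only one class contributes nothing
        keep.extend(rng.choice(idx0, n, replace=False))
        keep.extend(rng.choice(idx1, n, replace=False))
    return np.sort(np.array(keep, dtype=int))
```

Because subjects are discarded to reach balance, the subset is smaller than the full sample, which is the data loss the statement refers to.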
“…We performed deconfounding using all subjects in our dataset, prior to matching and running our analysis. Note that in alternative settings, to prevent data leakage, only the training data should be used to build the deconfounding model, without considering the test data (Chyzhyk et al, 2018). However, the main reason behind our choice was that deconfounding in data points matched based on the same covariates may have little impact on the results (Linn et al, 2016).…”
Section: Methods (mentioning)
confidence: 99%
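
The point about data leakage in the statement above can be sketched as follows, assuming a simple linear deconfounding model (features regressed on the confound by ordinary least squares). The helper names `fit_deconfounder` and `apply_deconfounder` are hypothetical, and the linear model is an illustrative choice rather than the method of the cited work. The regression coefficients are estimated on training data only and then applied unchanged to the test data.

```python
import numpy as np

def fit_deconfounder(X_train, z_train):
    """Fit a linear model X ~ intercept + z on the training data only.

    Returns the coefficient matrix of shape (2, n_features).
    """
    Z = np.column_stack([np.ones(len(z_train)), z_train])
    beta, *_ = np.linalg.lstsq(Z, X_train, rcond=None)
    return beta

def apply_deconfounder(X, z, beta):
    """Remove the confound contribution predicted by the fitted model."""
    Z = np.column_stack([np.ones(len(z)), z])
    return X - Z @ beta
```

Inside a cross-validation loop, `fit_deconfounder` would be called once per fold on that fold's training subjects, and `apply_deconfounder` on both the training and the held-out subjects.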
“…Several studies have investigated ways to control for confounds and/or test whether they are likely driving relationships between biomarkers and outcomes [26,111,128,133]. Some helpful procedures include: (1) regressing out the confound within the cross-validation loop; this is important because doing this outside the loop might create dependence and lead to pessimistic performances [128]; (2) testing whether a biomarker relates more strongly to the outcome of interest (eg, pain) than any potential co-occurring variables (eg, sleep loss or drug use); (3) testing the mediation between variables, eg, if a biomarker mediates the relationship between sleep loss and pain, it is related to pain even when controlling for sleep loss; (4) during training, identify biomarkers unrelated to co-occurring variables by stratifying samples and matching these on confounds; and (5) disaggregate some variables, such as sex, and test whether predictions are better within subgroups than across the whole population.…”
Section: Multivariate Pattern Analysis and Machine Learning Analysis (mentioning)
confidence: 99%