Quantifying performance of machine learning methods for neuroimaging data

Jollans, Lee; Boyle, Rory; Artiges, Éric; Banaschewski, Tobias; Desrivières, Sylvane; Grigis, Antoine; Martinot, Jean Luc; Paus, Tomáš; Smolka, Michael N.; Walter, Henrik; Schumann, Günter; Garavan, Hugh; Whelan, Robert

doi:10.1016/j.neuroimage.2019.05.082

Cited by 142 publications

(120 citation statements)

References 78 publications

Supporting

Mentioning

102

Contrasting


“…Elastic-net regression with nested 5-fold cross-validation was used to predict each of the nIDPs. This approach is widely-used and has been shown to achieve a robust and state-of-the-art performance in many neuroimaging studies 24,25 . Pearson correlation between each of the predicted and the true nIDPs in the outer test fold is used to quantify accuracy.…”

Section: Results (mentioning)

confidence: 99%

“…Elastic-net regression, from the glmnet package 48 , was used to predict the nIDPs using FLICA’s subject modes as model regressors (features). This approach is widely-used and has been shown to achieve a robust and state-of-the-art performance in many neuroimaging studies 24,25 . To evaluate the model performance, for each nIDP, we used 5-fold cross validation, and compute Pearson correlation between the predicted and true values of each nIDP across the 5 test sets.…”

Section: Methods (mentioning)

confidence: 99%
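The evaluation scheme described in the two statements above (elastic-net regression, cross-validated hyperparameters, Pearson correlation between predicted and true values across held-out folds) can be sketched roughly as follows. This is only an illustration under stated assumptions: the citing authors used the glmnet package, whereas this sketch swaps in scikit-learn's `ElasticNetCV`; the function name `nested_cv_pearson`, the `l1_ratio` grid, and the synthetic data are all hypothetical and not taken from either paper.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import KFold, cross_val_predict


def nested_cv_pearson(X, y, n_outer=5, n_inner=5, seed=0):
    """Pearson r between out-of-fold predictions and true values.

    Outer loop: n_outer folds used only for testing.
    Inner loop: ElasticNetCV tunes its penalty on the training part
    of each outer fold, so test data never informs the tuning.
    """
    outer = KFold(n_splits=n_outer, shuffle=True, random_state=seed)
    # Hypothetical l1_ratio grid; alpha is chosen automatically per fold.
    model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=n_inner, max_iter=10000)
    # Each sample is predicted by a model that never saw it.
    y_pred = cross_val_predict(model, X, y, cv=outer)
    return float(np.corrcoef(y_pred, y)[0, 1])


# Synthetic stand-in for imaging features and one non-imaging phenotype.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))
w = np.zeros(30)
w[:5] = 1.0  # only the first five features carry signal
y = X @ w + 0.5 * rng.standard_normal(200)

r = nested_cv_pearson(X, y)
```

Here the correlation is computed once over the pooled test-set predictions, which matches the Methods statement; computing one correlation per outer fold and averaging is an alternative reading of the Results statement.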


Phenotype Discovery from Population Brain Imaging

Gong, Beckmann, Smith. 2020. Preprint.


Neuroimaging allows for the non-invasive study of the brain in rich detail. Data-driven discovery of patterns of population variability in the brain has the potential to be extremely valuable for early disease diagnosis and understanding the brain. The resulting patterns can be used as imaging-derived phenotypes (IDPs), and may complement existing expert-curated IDPs. However, population datasets, comprising many different structural and functional imaging modalities from thousands of subjects, provide a computational challenge not previously addressed. Here, for the first time, a multimodal independent component analysis approach is presented that is scalable for data fusion of voxel-level neuroimaging data in the full UK Biobank (UKB) dataset, that will soon reach 100,000 imaged subjects. This new computational approach can estimate modes of population variability that enhance the ability to predict thousands of phenotypic and behavioural variables using data from UKB and the Human Connectome Project. A high-dimensional decomposition achieved improved predictive power compared with widely-used analysis strategies, single-modality decompositions and existing IDPs. In UKB data (14,503 subjects with 47 different data modalities), many interpretable associations with non-imaging phenotypes were identified, including multimodal spatial maps related to fluid intelligence, handedness and disease, in some cases where IDP-based approaches failed.

Introduction

Large-scale multimodal brain imaging has enormous potential for boosting epidemiological and neuroscientific studies, generating markers for early disease diagnosis and prediction of disease progression, and the understanding of human cognition, by means of linking to clinical or behavioural variables. Recent major studies have been acquiring brain magnetic resonance imaging (MRI), genetics and demographic/behavioural data from large cohorts. Examples are the UK Biobank (UKB) 1, the Human Connectome Project (HCP) 2 and the Adolescent Brain Cognitive Development (ABCD) study 3. These studies involve multimodal data, meaning that several distinct types of MRI data are acquired, mapping activity, functional networks, structural connectivity, white matter microstructure, and organisation and volumes of different brain tissues and sub-structures 1. However, the multimodal, high-dimensional and noisy nature of such big datasets makes many existing analytical approaches for extracting interpretable information impractical 4.

Traditionally, large-scale neuroimaging studies first summarize the imaging data into interpretable image-derived phenotypes (IDPs) 1, 5, which are scalar quantities derived from raw imaging data (e.g., regional volumes from structural MRI, mean task activations from task MRI, resting-state functional connectivities between brain parcels). This knowledge-based approach is simple and efficient, and effectively reduces the high-dimensional data into interpretable, compact, convenient features. However, ...


“…All analyses were carried out in MATLAB 2018b (scripts available at https://github.com/ljollans/multimodal_AUDIT_prediction). Given the relatively small sample size and large number of features for the BID and SACE task, standard multiple regression paired with bootstrap aggregation (bagging) was selected for all analyses (39), based on a previous empirical evaluation of the utility of various linear regression methods for prediction with neuroimaging data (31).…”

Section: Analyses (mentioning)

confidence: 99%
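As a rough illustration of what "standard multiple regression paired with bootstrap aggregation (bagging)" means, the sketch below fits ordinary least squares on bootstrap resamples and averages the predictions. It is not the citing authors' code (their analyses were run in MATLAB, with scripts at the repository linked above); the function name `bagged_ols_predict`, the number of resamples, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def bagged_ols_predict(X_train, y_train, X_test, n_boot=50, seed=0):
    """Average the test predictions of OLS models fit on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    # Prepend a column of ones so each model has an intercept.
    A_test = np.column_stack([np.ones(X_test.shape[0]), X_test])
    preds = np.zeros((n_boot, X_test.shape[0]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        A = np.column_stack([np.ones(n), X_train[idx]])
        # lstsq returns the minimum-norm solution if a resample is rank deficient.
        coef, *_ = np.linalg.lstsq(A, y_train[idx], rcond=None)
        preds[b] = A_test @ coef
    return preds.mean(axis=0)  # aggregate across resamples


# Synthetic example: 100 training and 50 test samples, 10 features.
rng = np.random.default_rng(1)
X = rng.standard_normal((150, 10))
y = X @ np.ones(10) + rng.standard_normal(150)

y_hat = bagged_ols_predict(X[:100], y[:100], X[100:])
r_bag = float(np.corrcoef(y_hat, y[100:])[0, 1])
```

Averaging over resamples mainly reduces the variance of the fitted coefficients, which is why the approach is attractive when the sample is small relative to the number of features.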

Predicting future drinking among young adults: using ensemble machine-learning to combine MRI with psychometrics and behaviour

Groefsema, Luijten, Engels, et al. 2020. Preprint. Self-citation.


Background: While most research into predictors of problematic alcohol use has focused on adolescence, young adults are also at elevated risk, and differ from adolescents and adults in terms of exposure to alcohol and neurodevelopment. Here we examined predictors of alcohol use among young adults at a 1-year follow-up using a broad predictive modelling approach. Methods: Data in four modalities were included from 128 men aged between 18 and 25 years; functional MRI regions-of-interest from 1) a beer-incentive delay task, and 2) a social alcohol cue-exposure task, 3) grey matter data, and 4) non-neuroimaging data (i.e. psychometric and behavioural). These modalities were combined into an ensemble model to predict follow-up Alcohol Use Disorder Identification (AUDIT) scores, and were tested separately for their contribution. To reveal specificity for the prediction of future AUDIT scores, the same analyses were carried out for current AUDIT score. Results: The ensemble resulted in a more accurate estimation of follow-up AUDIT score than any single modality. Only removal of the social alcohol cue-exposure task and of the non-neuroimaging data significantly worsened predictions. Reporting to need a drink in the morning to start the day was the strongest unique predictor of future drinking along with anterior cingulate cortex and cerebellar activity. Conclusions: Alcohol-related task fMRI activity is a valuable predictor for future drinking among young adults alongside non-neuroimaging variables. Multi-modal prediction models best predict future drinking among young adults and may play an important part in the move towards individualized treatment and prevention efforts.


“…A third way is to decrease the influence of the possible noise in rs-fMRI signal, for instance, global signal regression and motion artifact correction (Nielsen et al, 2019) have been reported to advance the RSFC-behavior prediction. The last way is to use the bagging strategy (Breiman, 1996) to improve the prediction with RSFC (Jollans et al, 2019).…”

Section: Bootstrapping Enhanced the RSFC-phenotype Associations (mentioning)

confidence: 99%

Bootstrapping promotes the RSFC-behavior associations: an application of individual cognitive traits prediction

Wei, Jing. 2019. Preprint.


Resting state functional connectivity records enormous functional interaction information between any pair of brain nodes, which enriches the prediction of individual phenotypes. To reduce the high dimensional features in prediction, correlation analysis is a common way for feature selection. However, rs-fMRI signal exhibits typically low signal-to-noise ratio and correlation analysis is sensitive to outliers and data distribution, which may bring unstable and uninformative features to subsequent prediction. To alleviate this problem, a bootstrapping-based feature selection framework was proposed and applied on three widely used regression models: connectome-based predictive model (CPM), support vector regression (SVR) and least absolute shrinkage and selection operator (LASSO). A large open-source dataset from Human Connectome Project (HCP) was adopted in the study and a series of cognitive traits were acted as the prediction targets. To systematically investigate the influences of different parameter settings on the bootstrapping-based framework, a total of 216 parameter combinations were evaluated through the R value between the predicted and real cognitive traits, and the best identified performance among them was chosen out as the final prediction accuracy for each cognitive trait. By using bootstrapping without replacement, the best performances of CPM with positive and negative feature sets, SVR and LASSO averagely increased by 28.0%, 33.2%, 11.6% and 24.3% in R values in contrast to the baseline method without bootstrapping. By using bootstrapping with replacement, these best performances increased by 22.1%, 22.9%, 9.4% and 19.6%. Furthermore, the bootstrapping-based feature selection methods could effectively refine the original feature sets obtained from correlation analysis, which thus retained the more stable and informative feature sets. 
The results demonstrate that bootstrapping-based feature selection is an easy-to-use and effective method to improve RSFC prediction of cognitive traits and is highly recommended in future RSFC prediction studies.
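The abstract does not spell out the selection procedure, so the following is only one possible reading of "bootstrapping-based feature selection" on top of correlation analysis: compute feature-target correlations on many bootstrap resamples and keep the features that pass a threshold in most of them. The function name `bootstrap_stable_features`, the threshold `r_thresh`, the retention fraction `keep_frac`, and the synthetic data are all assumptions for illustration, and only resampling with replacement is shown.

```python
import numpy as np


def bootstrap_stable_features(X, y, n_boot=100, r_thresh=0.2,
                              keep_frac=0.9, seed=0):
    """Indices of features whose |Pearson r| with y exceeds r_thresh
    in at least keep_frac of the bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        Xc = X[idx] - X[idx].mean(axis=0)
        yc = y[idx] - y[idx].mean()
        # Column-wise Pearson correlation between each feature and y.
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
        r = (Xc * yc[:, None]).sum(axis=0) / denom
        counts += np.abs(r) > r_thresh
    return np.flatnonzero(counts / n_boot >= keep_frac)


# Synthetic example: only features 0 and 1 are related to the target.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 20))
y = X[:, 0] + X[:, 1] + 0.5 * rng.standard_normal(200)

selected = bootstrap_stable_features(X, y)
```

The retained indices would then feed a downstream model such as CPM, SVR or LASSO; to avoid leakage, the selection would need to be run inside each training fold rather than on the full dataset.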
