2022
DOI: 10.1101/2022.07.26.499576
Preprint

Accurate sex prediction of cisgender and transgender individuals without brain size bias

Abstract: Brain size differs substantially between human males and females. This difference in total intracranial volume (TIV) can cause bias when employing machine-learning approaches for the investigation of sex differences in brain morphology. TIV-biased models will likely not capture actual qualitative sex differences in brain organization but rather learn to classify an individual's sex based on brain size differences, thus leading to spurious and misleading conclusions, for example when comparing brain morphology …

Cited by 5 publications (4 citation statements)
References 86 publications
“…Firstly, by only considering biological sex, we neglected possible effects of gender on functional organization and its morphometric correlates. Findings may indeed appear more nuanced if we moved beyond the unrealistic assumption of a clear-cut sexual dimorphism of brain structure and function [55], as the relevance of considering transgender individuals in the study of sex differences is being increasingly recognized [56]. Nevertheless, we intentionally focused on the biological and dichotomous variable of sex assigned at birth given that our study aimed to study biological mechanisms relating to cortical morphometry.…”
Section: Discussion (mentioning)
confidence: 99%
“…SVM is a supervised ML method that separates the data into distinct classes with the widest possible gap between these classes (Boser et al, 1992; Rafi & Shaikh, 2013; Vapnik, 1998; Zhang et al, 2021). Based on its operational principles regarding a supervised binary classification task and successful applications in previous sex classification studies (Flint et al, 2020; Weis et al, 2020; Wiersch et al, 2023), SVM is a suitable method for the present task. SVM models were built in Julearn (Hamdan et al, 2023; https://juaml.github.io/julearn/main/index.html) including a hyperparameter search nested within a 10–fold CV with five repetitions.…”
Section: Methods (mentioning)
confidence: 99%
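
The statement above describes a hyperparameter search nested within a 10-fold CV with five repetitions. The following is a minimal, library-free sketch of that split structure only. It is not the Julearn API and does not train an SVM: `kfold`, `nested_cv`, `fit_and_score`, and `toy_score` are hypothetical names introduced for illustration, and the five inner folds are an assumption, since the statement does not give the inner scheme.

```python
import random

def kfold(indices, n_splits, rng):
    """Shuffle the indices and yield (train, test) index lists, one per fold."""
    idx = list(indices)
    rng.shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

def nested_cv(n_samples, param_grid, fit_and_score,
              n_splits=10, n_repeats=5, inner_splits=5, seed=0):
    """Return one (best_param, outer_score) pair per outer fold."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        for outer_train, outer_test in kfold(range(n_samples), n_splits, rng):
            # The inner search only ever sees the outer training indices,
            # so the outer test fold stays untouched during tuning.
            def inner_score(param):
                scores = [fit_and_score(tr, va, param)
                          for tr, va in kfold(outer_train, inner_splits, rng)]
                return sum(scores) / len(scores)
            best = max(param_grid, key=inner_score)
            results.append((best, fit_and_score(outer_train, outer_test, best)))
    return results

def toy_score(train_idx, test_idx, param):
    """Hypothetical stand-in for 'fit a model on train, score it on test'."""
    assert not set(train_idx) & set(test_idx)  # folds must not overlap
    return -abs(param - 1.0)                   # pretends 1.0 is the best value

results = nested_cv(100, [0.1, 1.0, 10.0], toy_score)
print(len(results))  # 5 repetitions x 10 folds = 50 outer evaluations
```

In a real analysis, `fit_and_score` would fit the classifier on the training indices and return its accuracy on the held-out indices.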
“…In a study to build a neuroimaging-based diagnostic classifier, the non-pathological ageing signal is confounding [2]. Confounding is ubiquitous and further examples include batch effects in genomics [3,4,5], scanner effects in neuroimaging [6], patient and process information in radiographs [7], and group differences like naturally different brain sizes in investigation of brain-size-independent sex differences [8,9]. Ignoring confounding effects in an ML application can render predictions untrustworthy and insights questionable [10] as this information can be exploited by learning algorithms [11] leading to spurious feature-target relationships [12], e.g., classification based on depression instead of ADHD or age instead of neuronal pathology.…”
Section: Introduction (mentioning)
confidence: 99%
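
As an illustration of the confounding problem described above, the sketch below applies confound regression (residualization) to a single brain feature, with brain size (TIV) as the confound. This is one common way to handle a confound and is not claimed to be the method used by the preprint or by the citing papers. The function names and all numbers are hypothetical. The regression is fitted on training data only and then applied to held-out data, so no information from the test set leaks into the correction.

```python
def fit_confound_model(confound, feature):
    """Least-squares fit of feature = intercept + slope * confound."""
    n = len(confound)
    mean_c = sum(confound) / n
    mean_f = sum(feature) / n
    cov = sum((c - mean_c) * (f - mean_f) for c, f in zip(confound, feature))
    var = sum((c - mean_c) ** 2 for c in confound)
    slope = cov / var
    return mean_f - slope * mean_c, slope

def remove_confound(confound, feature, model):
    """Subtract the part of the feature that the confound predicts."""
    intercept, slope = model
    return [f - (intercept + slope * c) for c, f in zip(confound, feature)]

# Hypothetical training data: the feature depends linearly on TIV.
train_tiv = [1.0, 2.0, 3.0, 4.0]
train_feature = [3.0, 5.0, 7.0, 9.0]
model = fit_confound_model(train_tiv, train_feature)

# Apply the training-set model to a held-out sample.
test_residuals = remove_confound([5.0], [11.5], model)
print(test_residuals)
```

After residualization, a classifier trained on the corrected feature can no longer use the linear TIV signal carried by that feature. Nonlinear dependence on the confound is not removed by this sketch.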
“…Ignoring confounding effects in an ML application can render predictions untrustworthy and insights questionable [10] as this information can be exploited by learning algorithms [11] leading to spurious feature-target relationships [12], e.g., classification based on depression instead of ADHD or age instead of neuronal pathology. The benefits of big data in ML applications are obvious, especially when modeling weak relationships, but big data also leads to an increased risk of inducing confounded models [2,13,14,9]. Confounding, thus, is a crucial concern and if not properly treated can threaten real-world applicability of ML.…”
Section: Introduction (mentioning)
confidence: 99%