2023
DOI: 10.1371/journal.pbio.3002366

Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions

Greta Tuckute, Jenelle Feather, Dana Boebinger, et al.

Abstract: Models that predict brain responses to stimuli provide one measure of understanding of a sensory system and have many potential applications in science and engineering. Deep artificial neural networks have emerged as the leading such predictive models of the visual system but are less explored in audition. Prior work provided examples of audio-trained neural networks that produced good predictions of auditory cortical fMRI responses and exhibited correspondence between model stages and brain regions, but left …


Cited by 12 publications (7 citation statements). References 148 publications.
“…What all of these representation types have in common is that similarity relations between representations are characterized in terms of cosines. This applies to a large variety of machine learning models, not just those used for words and still images, but also dynamic stimuli like audio (Kell et al., 2018; Tuckute et al., 2023) and video (Lotter et al., 2017). C2L thus enables cognitive models to be applied to the increasingly complex and naturalistic items that machine learning models will be able to process.…”
Section: Other Sources of Similarity Information (citation type: mentioning; confidence: 99%)
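The cosine measure this statement refers to is straightforward to compute. Below is a minimal sketch in plain NumPy; the 512-dimensional random vectors are hypothetical stand-ins for activations drawn from any model stage, not data from the cited papers:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two representation vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical example: activations for two stimuli from some model stage.
rng = np.random.default_rng(0)
rep_1 = rng.standard_normal(512)  # e.g., a 512-d embedding of stimulus 1
rep_2 = rng.standard_normal(512)  # embedding of stimulus 2
print(cosine_similarity(rep_1, rep_2))
```

Because the measure is scale-invariant, it compares the direction of two representations rather than their magnitude, which is why it transfers across such different model families.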
“…In recent years, deep neural networks (DNNs) have emerged as a powerful tool for representing complex visual data, such as images (LeCun et al., 2015) or videos (Liu et al., 2020). In the auditory domain, DNNs have been shown to provide valuable representations (so-called feature or latent spaces) for modeling the cerebral processing of sound (brain encoding) (speech: Kell et al., 2018; Millet et al., 2022; Tuckute & Feather, 2023; semantic content: Caucheteux et al., 2023; Giordano et al., 2023; music: Güçlü et al., 2016), or for reconstructing the stimuli listened to by a participant (brain decoding) (Akbari et al., 2019). They have not yet been used to explain cerebral representations of identity-related information, due in part to the focus on speech information (von Kriegstein et al., 2003).…”
Section: Introduction (citation type: mentioning; confidence: 99%)
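Brain encoding of the kind described above is commonly implemented as a regularized linear regression from a model's feature space to voxel responses. The sketch below illustrates the idea only; the stimulus, feature, and voxel counts, the train/test split, and the use of scikit-learn's RidgeCV are illustrative assumptions, not the exact procedure of any cited study:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Hypothetical shapes: 165 sound stimuli, 1024 DNN features, 200 voxels.
rng = np.random.default_rng(0)
features = rng.standard_normal((165, 1024))   # DNN activations per stimulus
responses = rng.standard_normal((165, 200))   # fMRI responses per voxel

# Fit one regularized linear map from the feature space to all voxels;
# held-out prediction accuracy is the "brain encoding" score.
train, test = np.arange(120), np.arange(120, 165)
model = RidgeCV(alphas=np.logspace(-3, 3, 13))
model.fit(features[train], responses[train])
pred = model.predict(features[test])

# Score each voxel by the correlation between predicted and observed responses.
r = [np.corrcoef(pred[:, v], responses[test, v])[0, 1] for v in range(200)]
print(np.mean(r))
```

The regularization matters because the feature space is typically much wider than the number of stimuli; with random data as above, held-out correlations hover near zero, which is the expected null.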
“…We addressed this question by using representational similarity analysis (RSA; Kriegeskorte et al., 2008) to test which model better accounts for the representational geometry for voice identities in the auditory cortex. Using RSA as a model-comparison framework makes it possible to examine the brain-model relationship from complementary angles (Diedrichsen & Kriegeskorte, 2017; Giordano et al., 2023; Tuckute & Feather, 2023). We built speaker × speaker representational dissimilarity matrices (RDMs) capturing pairwise differences in cerebral activity or model predictions between all pairs of speakers; we then examined how well the LIN- and VLS-derived RDMs correlated with the cerebral RDMs from A1 and the TVAs.…”
Section: Introduction (citation type: mentioning; confidence: 99%)
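The RDM construction and comparison described in this statement can be sketched as follows. The speaker count, pattern dimensionalities, correlation-distance metric, and Spearman comparison are assumptions chosen for illustration, not the cited studies' exact choices:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical data: responses to 20 speakers in two spaces
# (voxel patterns for the brain, embeddings for a candidate model).
rng = np.random.default_rng(0)
brain_patterns = rng.standard_normal((20, 300))   # 20 speakers x 300 voxels
model_patterns = rng.standard_normal((20, 128))   # 20 speakers x 128 features

# Speaker-by-speaker RDMs: pairwise dissimilarity between all speaker pairs.
# pdist returns the condensed upper triangle directly.
brain_rdm = pdist(brain_patterns, metric="correlation")
model_rdm = pdist(model_patterns, metric="correlation")

# RSA compares representational geometries by correlating the two RDMs;
# Spearman is common because it assumes only a monotonic relation.
rho, _ = spearmanr(brain_rdm, model_rdm)
print(rho)
```

Comparing RDMs rather than raw activations sidesteps the need to map individual model units onto individual voxels, which is what makes RSA a natural complement to encoding-model regression.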
2 more citation statements not shown.