SUMMARY

Parallels have been reported between broad organization in the auditory system and optimized artificial neural networks [1–3]. It remains to be seen whether such promising analogies between the auditory system and deep learning models endure at other levels of description. Here, we examined whether artificial neural networks [4,5] could offer a mechanistic account of human behavior in an auditory task. The chosen task promoted the use of binaural cues (across the ears) to help detect a signal in noise [6,7]. In the optimal network, we observed the emergence of specialized computations with prominent similarities to in vivo animal data [8]. Artificial neurons developed a sensitivity to temporal delays that increased hierarchically, and their delay preferences were widely distributed (extending to delays beyond the range permitted by head width). The ensuing dynamics were consistent with a binaural cross-correlation mechanism [9]. Given that the neural mechanisms of binaural detection in humans are contested [9–13], these findings help to resolve this debate. Moreover, this is a first demonstration that deep learning can infer tangible mechanisms underlying auditory perception.
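As an aside, the binaural cross-correlation mechanism referred to above can be illustrated with a minimal sketch: a delayed copy of a signal reaching one ear can be localized by finding the interaural lag that maximizes the correlation between the two ear signals. The signal names, sample rate, and the specific delay below are illustrative assumptions, not details from this work.

```python
import numpy as np

# Illustrative sketch of cross-correlation-based interaural time
# difference (ITD) estimation. All parameters here are assumptions.
fs = 44100                    # sample rate in Hz (assumed)
delay_samples = 22            # ~0.5 ms ITD at 44.1 kHz (assumed)

rng = np.random.default_rng(0)
sig = rng.standard_normal(2048)

left = sig
right = np.roll(sig, delay_samples)  # right ear receives a delayed copy

# Evaluate the (circular) cross-correlation over a range of candidate
# lags and pick the lag with the maximum correlation.
lags = np.arange(-64, 65)
corr = [np.dot(left, np.roll(right, -k)) for k in lags]
estimated_itd = lags[int(np.argmax(corr))]
print(estimated_itd)  # -> 22, recovering the imposed delay
```

For broadband noise, the correlation peaks sharply at the true lag, which is why cross-correlation is a classic model of binaural delay sensitivity.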