The auditory system converts the physical properties of a sound waveform into neural activity and processes it for recognition. During this process, tuning to amplitude modulation (AM) is successively transformed by a cascade of brain regions. To test the functional significance of this AM tuning, we conducted single-unit recording in a deep neural network (DNN) trained for natural sound recognition. We calculated the AM representation in the DNN and quantitatively compared it with representations reported in previous neurophysiological studies. We found that auditory-system-like AM tuning emerges in the optimized DNN. Models that recognized sounds better showed greater similarity to the auditory system. We also isolated the factors that form the AM representation in the different brain regions. Because the model was not designed to reproduce any anatomical or physiological properties of the auditory system other than the cascading architecture, the observed similarity suggests that AM tuning in the auditory system might likewise be an emergent property of optimization for natural sound recognition during evolution and development. SIGNIFICANCE STATEMENT: This study suggests that neural tuning to amplitude modulation may be a consequence of the auditory system evolving for natural sound recognition. We modeled the function of the entire auditory system, that is, recognizing sounds from raw waveforms, with as few anatomical or physiological assumptions as possible. We analyzed the model using single-unit recording, which enabled a fair comparison with neurophysiological data with as few methodological biases as possible. Interestingly, our results imply that frequency decomposition in the inner ear might not be necessary for processing amplitude modulation. This implication could not have been obtained if we had used a model that assumes frequency decomposition.
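The single-unit analysis described above amounts to measuring a rate-based modulation transfer function (rMTF) for individual model units, analogous to neurophysiological procedures. The following is a minimal sketch of how such a measurement could be set up; the stimulus parameters, sampling rate, and the `unit_response_fn` interface to a trained network are our own illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def am_noise(mod_rate, mod_depth, fs, dur):
    """Sinusoidally amplitude-modulated broadband noise."""
    t = np.arange(int(fs * dur)) / fs
    envelope = 1.0 + mod_depth * np.sin(2.0 * np.pi * mod_rate * t)
    return envelope * np.random.randn(len(t))

def rate_mtf(unit_response_fn, mod_rates, fs=16000, dur=1.0, n_rep=10):
    """Rate-based modulation transfer function of one model unit:
    mean activation as a function of AM rate, averaged over noise carriers."""
    mtf = []
    for rate in mod_rates:
        responses = [unit_response_fn(am_noise(rate, 1.0, fs, dur))
                     for _ in range(n_rep)]
        mtf.append(np.mean(responses))
    return np.array(mtf)

# Usage (hypothetical interface to a trained network):
# mod_rates = 2.0 ** np.arange(0, 9)   # 1-256 Hz AM rates
# mtf = rate_mtf(lambda x: unit_activation(model, layer=5, unit=42, waveform=x),
#                mod_rates)
```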
Tonotopy is an essential functional organization in the mammalian auditory cortex, and its source in the primary auditory cortex (A1) is the incoming frequency-related topographical projections from the ventral division of the medial geniculate body (MGv). However, the circuits that relay this functional organization to higher-order regions such as the secondary auditory field (A2) have yet to be identified. Here, we discovered a new pathway that projects directly from the MGv to A2 in mice. Tonotopy was established in A2 even when the primary fields, including A1, were removed, indicating that tonotopy in A2 can arise from thalamic input alone. Moreover, the structural properties of the differing thalamocortical connections were consistent with the functional organization of their target regions in the auditory cortex. Retrograde tracing revealed that the region of MGv providing input to a local area in A2 was broader than the region providing input to A1. Consistent with this anatomy, two-photon calcium imaging revealed that neuronal responses in the thalamocortical recipient layer of A2 showed wider bandwidth and greater heterogeneity of the best-frequency distribution than those of A1. The current study demonstrates a new thalamocortical pathway that relays frequency information to A2 on the basis of MGv compartmentalization.
Our daily activities require vigilance. It is therefore useful to externally monitor and predict our vigilance level with a straightforward method. The vigilance level is known to be linked to pupillary fluctuations via the locus coeruleus-norepinephrine (LC-NE) system. However, previous methods of estimating long-term vigilance require monitoring pupillary fluctuations at rest over a long period. We developed a method of predicting the short-term vigilance level by monitoring pupillary fluctuations over a short period of several seconds. Because LC activity also fluctuates on a timescale of seconds, we hypothesized that the short-term vigilance level could be estimated from pupillary fluctuations in a short period, and we quantified their amplitude as the Micro-Pupillary Unrest Index (M-PUI). In a Psychomotor Vigilance Task (PVT), we found an intra-individual, trial-by-trial positive correlation between reaction time (RT), which reflects the short-term vigilance level, and the M-PUI in the period immediately before target onset. This relationship was most evident when the fluctuation was smoothed with a Hanning window of approximately 50 to 100 ms (including data down-sampled to 100 and 50 Hz) and the M-PUI was calculated over the one to two seconds before target onset. These results suggest that the M-PUI can monitor and predict fluctuating levels of vigilance. The M-PUI is also useful for examining short-period pupillary fluctuations to elucidate the psychophysiological mechanisms of short-term vigilance.
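As a rough illustration, a micro-scale pupillary unrest index of this kind could be computed from a short pre-target pupil trace as the cumulative absolute change of a Hanning-smoothed signal, normalized by the analysis duration. This is only a sketch under our own assumptions about the formula; the paper defines the actual M-PUI, and the sampling rate and window length below are illustrative.

```python
import numpy as np

def m_pui(pupil, fs, win_ms=100.0):
    """Illustrative micro-pupillary unrest index: cumulative absolute change
    of a Hanning-smoothed pupil-diameter trace, normalized by the duration
    of the analyzed segment (this formula is an assumption, not the paper's
    exact definition)."""
    win_len = max(3, int(round(fs * win_ms / 1000.0)))
    window = np.hanning(win_len)
    window /= window.sum()
    smoothed = np.convolve(pupil, window, mode="valid")
    duration_s = len(pupil) / fs
    return np.sum(np.abs(np.diff(smoothed))) / duration_s

# Usage (hypothetical): the 2 s of pupil data immediately before target onset,
# sampled at 1000 Hz, smoothed with a 100-ms Hanning window.
# index = m_pui(pupil_trace[-2000:], fs=1000, win_ms=100.0)
```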
Sustained attention plays an important role in adaptive behaviours in everyday activities. As previous studies have mostly focused on vision, and attentional resources have been thought to be specific to sensory modalities, it is still unclear to what extent the mechanisms of attentional fluctuation overlap between the visual and auditory modalities. To reduce the effects of sudden stimulus onsets, we developed a new gradual-onset continuous performance task (gradCPT) in the auditory domain and compared the dynamic fluctuation of sustained attention in vision and audition. In the auditory gradCPT, participants were instructed to listen to a stream of narrations and judge the gender of each narrator. In the visual gradCPT, they were asked to observe a stream of scenery images and indicate whether each scene was a city or a mountain. Our within-individual comparison revealed that auditory and visual attention are similar in terms of false alarm rate and dynamic properties, including fluctuation frequency. The absolute time scales of the fluctuations in the two modalities were comparable, notwithstanding the difference in stimulus onset asynchrony. The results suggest that fluctuations of visual and auditory attention are underpinned by common principles and support models with a more central, modality-general controller.
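One simple way to compare attentional fluctuation time scales across modalities, as described above, is to express the power spectrum of the trial-by-trial reaction-time series against absolute frequency using each task's stimulus onset asynchrony. The sketch below illustrates this idea only; it is not the authors' analysis code, and the SOA values in the usage comments are hypothetical.

```python
import numpy as np

def rt_fluctuation_spectrum(reaction_times, soa_s):
    """Power spectrum of a trial-by-trial reaction-time series, expressed
    against absolute frequency (Hz) via the stimulus onset asynchrony,
    so fluctuation time scales can be compared across tasks with
    different trial pacing."""
    rt = np.asarray(reaction_times, dtype=float)
    rt = rt - np.nanmean(rt)
    rt = np.nan_to_num(rt)                       # missed trials -> mean RT
    power = np.abs(np.fft.rfft(rt)) ** 2 / len(rt)
    freqs = np.fft.rfftfreq(len(rt), d=soa_s)    # trials are soa_s seconds apart
    return freqs, power

# Usage (hypothetical SOA values):
# f_vis, p_vis = rt_fluctuation_spectrum(rt_visual, soa_s=0.8)
# f_aud, p_aud = rt_fluctuation_spectrum(rt_auditory, soa_s=2.0)
```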
Sparse coding and related theories have successfully explained various response properties at early stages of sensory information processing, such as the primary visual cortex and the peripheral auditory system, suggesting that these properties emerge through adaptation of the nervous system to natural stimuli. The present study continues this line of research at a higher stage of auditory processing, focusing on harmonic structures that are often found in behaviourally important natural sounds such as animal vocalizations. Physiological studies have shown that the monkey primary auditory cortex (A1) contains neurons whose response properties capture such harmonic structures: their response and modulation peaks often occur at frequencies that are harmonically related to each other. We hypothesize that such relations emerge from sparse coding of harmonic natural sounds. Our simulations show that similar harmonic relations emerge in frequency-domain sparse codes of harmonic sounds, namely piano performances and human speech. Moreover, the modulatory behaviour can be explained by competitive interactions among model neurons that capture partially overlapping harmonic structures.
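A frequency-domain sparse code of the kind described above can be approximated, for illustration, by learning a sparse dictionary over magnitude-spectrum frames of harmonic sounds. The sketch below uses scikit-learn's DictionaryLearning; the preprocessing (Hann windowing, log compression, centering) and all parameter values are our own choices, not the simulation settings of the study.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def learn_spectral_dictionary(waveform, n_atoms=64, frame_len=1024, hop=512):
    """Learn a sparse dictionary over log-magnitude spectrum frames.
    With harmonic training sounds (e.g., speech or piano), atoms tend to
    show peaks at harmonically related frequencies."""
    frames = []
    for start in range(0, len(waveform) - frame_len, hop):
        segment = waveform[start:start + frame_len] * np.hanning(frame_len)
        frames.append(np.abs(np.fft.rfft(segment)))
    X = np.log1p(np.array(frames))          # compressive nonlinearity
    X -= X.mean(axis=0)                     # center each frequency bin
    learner = DictionaryLearning(n_components=n_atoms, alpha=1.0, max_iter=200)
    learner.fit(X)                          # rows of X are spectral frames
    return learner.components_              # (n_atoms, n_frequency_bins)

# Usage (hypothetical): waveform loaded from a speech or piano recording
# atoms = learn_spectral_dictionary(waveform)
```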
Natural sounds contain rich patterns of amplitude modulation (AM), one of the essential sound dimensions for auditory perception. The sensitivity of human hearing to AM, as measured psychophysically, takes diverse forms depending on the experimental conditions. Here, we address within a single framework the questions of why such patterns of AM sensitivity have emerged in the human auditory system and how they are realized by our neural mechanisms. Assuming that optimization for natural sound recognition has taken place during human evolution and development, we examined its effect on the formation of AM sensitivity by optimizing a computational model, specifically a multi-layer neural network, for recognition of natural sounds (namely, everyday sounds and speech sounds) and simulating psychophysical experiments in which the model's AM sensitivity was assessed. Relatively higher layers of the model optimized for sounds with natural AM statistics exhibited AM sensitivity similar to that of humans, even though the model was not designed to reproduce human-like AM sensitivity. Moreover, simulated neurophysiological experiments on the model revealed a correspondence between the model layers and the auditory brain regions. The layers in which human-like psychophysical AM sensitivity emerged exhibited substantial neurophysiological similarity with the auditory midbrain and higher regions. These results suggest that human behavioral AM sensitivity has emerged as a result of optimization for natural sound recognition in the course of our evolution and/or development, and that it is based on a stimulus representation encoded in neural firing rates in the auditory midbrain and higher regions. SIGNIFICANCE STATEMENT: This study provides a computational paradigm to bridge the gap between the behavioral properties of human sensory systems as measured in psychophysics and neural representations as measured in non-human neurophysiology. This was accomplished by combining knowledge and techniques from psychophysics, neurophysiology, and machine learning. As a specific target modality, we focused on auditory sensitivity to sound amplitude modulation (AM). We built an artificial neural network model that performs natural sound recognition and simulated psychophysical and neurophysiological experiments on the model. Quantitative comparison of a machine learning model with human and non-human data made it possible to integrate the knowledge of behavioral AM sensitivity and neural AM tunings from the perspective of optimization for natural sound recognition.
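The simulated psychophysical experiments mentioned above can be caricatured as estimating an AM detection threshold from a model readout with the method of constant stimuli: on each trial the readout is computed for a modulated and an unmodulated noise, and the smallest modulation depth yielding criterion performance is taken as the threshold. The following sketch illustrates this logic only; the `readout_fn` interface, stimulus parameters, and the 75%-correct criterion are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def am_noise(mod_rate, mod_depth, fs, dur):
    """Sinusoidally amplitude-modulated broadband noise (fresh carrier per call)."""
    t = np.arange(int(fs * dur)) / fs
    envelope = 1.0 + mod_depth * np.sin(2.0 * np.pi * mod_rate * t)
    return envelope * np.random.randn(len(t))

def am_detection_threshold(readout_fn, mod_rate, fs=16000, dur=0.5,
                           n_trials=100, criterion=0.75):
    """Constant-stimuli estimate of the smallest modulation depth at which a
    scalar model readout picks the modulated interval over the unmodulated
    one at or above the criterion proportion correct (2-interval forced choice)."""
    for depth in np.logspace(-2, 0, 9):          # modulation depths 0.01 to 1
        correct = 0
        for _ in range(n_trials):
            modulated = readout_fn(am_noise(mod_rate, depth, fs, dur))
            unmodulated = readout_fn(am_noise(mod_rate, 0.0, fs, dur))
            correct += modulated > unmodulated
        if correct / n_trials >= criterion:
            return depth
    return np.nan                                # no threshold in tested range

# Usage (hypothetical readout from a trained network):
# threshold = am_detection_threshold(lambda x: am_evidence(model, x), mod_rate=16.0)
```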
Attention levels fluctuate during the course of daily activities; however, the factors underlying sustained attention are still unknown. We investigated the mechanisms of sustained attention using psychological, neuroimaging, and neurochemical approaches. Participants were scanned with functional magnetic resonance imaging (fMRI) while performing gradual-onset continuous performance tasks (gradCPTs), in which narrations or visual scenes gradually changed from one to the next. Participants pressed a button for frequent Go trials as quickly as possible and withheld responses to infrequent No-go trials. Performance was better for the visual gradCPT than for the auditory gradCPT, but performance on the two was correlated. The dorsal attention network was activated during intermittent responses, regardless of sensory modality. Reaction-time variability on the gradCPTs was correlated with signal changes (SCs) in left fronto-parietal regions. We also used magnetic resonance spectroscopy (MRS) to measure levels of glutamate–glutamine (Glx) and γ-aminobutyric acid (GABA) in the left prefrontal cortex (PFC). Glx levels were associated with performance in undemanding situations, whereas GABA levels were related to performance in demanding situations. Combined fMRI–MRS results demonstrated that SCs in the left PFC were positively correlated with neurometabolite levels. These findings suggest that a neural balance between excitation and inhibition is involved in attentional fluctuations and brain dynamics.