Monaural spectral features are important for human sound-source localization in sagittal planes, including front-back discrimination and elevation perception. These directional features result from the acoustic filtering of incoming sounds by the listener's morphology and are described by listener-specific head-related transfer functions (HRTFs). This article proposes a probabilistic, functional model of sagittal-plane localization that is based on human listeners' HRTFs. The model approximates spectral auditory processing, accounts for acoustic and non-acoustic listener specificity, allows for predictions beyond the median plane, and directly predicts psychoacoustic measures of localization performance. The predictive power of the listener-specific modeling approach was verified under various experimental conditions: the model predicted the effects on localization performance of band limitation, spectral warping, non-individualized HRTFs, spectral resolution, spectral ripples, and high-frequency attenuation in speech. The roles of key model components were evaluated and discussed in detail; positive spectral gradient extraction, sensorimotor mapping, and binaural weighting of monaural spatial information were addressed in particular. Potential applications of the model include predictions of psychophysical effects, for instance, in the context of virtual acoustics or hearing assistive devices.
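One of the stages named above, positive spectral gradient extraction, can be illustrated with a minimal sketch: differentiate a dB magnitude profile across adjacent frequency bands and keep only the rising (positive) parts. The function name and the toy profile below are hypothetical illustrations, not taken from the article.

```python
import numpy as np

def positive_spectral_gradients(mag_db):
    """Differentiate a dB magnitude profile across adjacent frequency
    bands and half-wave rectify: keep rising slopes, zero the rest."""
    grad = np.diff(mag_db)          # band-to-band spectral gradient
    return np.maximum(grad, 0.0)    # positive parts only

profile = np.array([0.0, 3.0, 1.0, 4.0, 4.0])   # toy dB profile
print(positive_spectral_gradients(profile))      # [3. 0. 3. 0.]
```

Falling spectral edges are discarded by the rectification, so only upward spectral contrasts contribute to the subsequent direction mapping.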
Studies of auditory looming bias have shown that sources increasing in intensity are more salient than sources decreasing in intensity. Researchers have argued that listeners are more sensitive to approaching sounds compared with receding sounds, reflecting an evolutionary pressure. However, these studies only manipulated overall sound intensity; therefore, it is unclear whether looming bias is truly a perceptual bias for changes in source distance, or only in sound intensity. Here we demonstrate both behavioral and neural correlates of looming bias without manipulating overall sound intensity. In natural environments, the pinnae induce spectral cues that give rise to a sense of externalization; when spectral cues are unnatural, sounds are perceived as closer to the listener. We manipulated the contrast of individually tailored spectral cues to create sounds of similar intensity but different naturalness. We confirmed that sounds were perceived as approaching when spectral contrast decreased, and perceived as receding when spectral contrast increased. We measured behavior and electroencephalography while listeners judged motion direction. Behavioral responses showed a looming bias in that responses were more consistent for sounds perceived as approaching than for sounds perceived as receding. In a control experiment, looming bias disappeared when spectral contrast changes were discontinuous, suggesting that perceived motion in distance and not distance itself was driving the bias. Neurally, looming bias was reflected in an asymmetry of late event-related potentials associated with motion evaluation. Hence, both our behavioral and neural findings support a generalization of the auditory looming bias, representing a perceptual preference for approaching auditory objects.

Keywords: auditory looming bias | electroencephalography | distance motion perception | sound externalization | head-related transfer functions

Imagine yourself alone in the wilderness.
Suddenly, a threatening sound permeates the darkness. Is it approaching? This is a critical question when it comes to your survival because approaching objects usually pose a greater threat than receding objects (1). The phenomenon that approaching sounds are more salient than receding sounds is commonly termed "auditory looming bias." Looming bias is reflected in a broad variety of psychophysical tasks related to salience and alertness: bias in loudness-change estimates (2-4) and judgments of duration (5), improved discriminability of motion speed (6), underestimated distances for egocentrically moving (4) or bypassing sounds (7,8), and reduced reaction time for auditory (3, 9) and visual (3) targets preceded by looming sounds. In animals, looming biases result in faster learning speed during associative conditioning (10) and longer duration of attention (11). This list shows that looming bias triggers a variety of percepts across a wide range of psychoacoustic tasks. Despite its broad behavioral significance, the mechanisms underlying auditory looming bias are still poorly ...
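The spectral-contrast manipulation described in the abstract above can be sketched as scaling the deviations of a dB magnitude spectrum around its mean level: a factor below 1 flattens the spectrum (cues become less natural, the sound is perceived as closer), a factor above 1 exaggerates it. This is a minimal sketch with hypothetical names and a toy spectrum; it is not the published stimulus-generation code.

```python
import numpy as np

def scale_spectral_contrast(mag_db, c):
    """Scale spectral detail around the mean dB level: c < 1 reduces
    contrast (flatter spectrum), c > 1 increases it. The mean level,
    and hence overall intensity, stays approximately unchanged."""
    mean = mag_db.mean()
    return mean + c * (mag_db - mean)

spectrum = np.array([-6.0, 0.0, 6.0, 2.0, -2.0])  # toy dB spectrum
flattened = scale_spectral_contrast(spectrum, 0.5)
exaggerated = scale_spectral_contrast(spectrum, 1.5)
```

Because only the deviations around the mean are scaled, the overall level is preserved, which is the point of the experimental design: motion in perceived distance without a change in intensity.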
The ability of sound-source localization in sagittal planes (along the top-down and front-back dimension) varies considerably across listeners. The directional acoustic spectral features, described by head-related transfer functions (HRTFs), also vary considerably across listeners, a consequence of the listener-specific shape of the ears. It is not clear whether the differences in localization ability result from differences in the encoding of directional information provided by the HRTFs, i.e., an acoustic factor, or from differences in auditory processing of those cues (e.g., spectral-shape sensitivity), i.e., non-acoustic factors. We addressed this issue by analyzing the listener-specific localization ability in terms of localization performance. Directional responses to spatially distributed broadband stimuli from 18 listeners were used. A model of sagittal-plane localization was fit individually for each listener by considering the actual localization performance, the listener-specific HRTFs representing the acoustic factor, and an uncertainty parameter representing the non-acoustic factors. The model was configured to simulate the condition of complete calibration of the listener to the tested HRTFs. Listener-specifically calibrated model predictions yielded correlations of, on average, 0.93 with the actual localization performance. Then, the model parameters representing the acoustic and non-acoustic factors were systematically permuted across the listener group. While the permutation of HRTFs affected the localization performance, the permutation of listener-specific uncertainty had a substantially larger impact. Our findings suggest that across-listener variability in sagittal-plane localization ability is only marginally determined by the acoustic factor, i.e., the quality of directional cues found in typical human HRTFs. 
Rather, the non-acoustic factors, supposed to represent the listeners' efficiency in processing directional cues, appear to be important.
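The permutation analysis described above can be sketched in miniature: predict performance for every pairing of one listener's acoustic factor (HRTF cue quality) with another listener's non-acoustic factor (uncertainty), then compare how much the predictions spread when only one factor varies. The predictor below is a deliberately toy stand-in, not the published model; all names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 18  # number of listeners, as in the study

# Hypothetical scalar stand-ins per listener: acoustic factor
# (HRTF cue quality, little across-listener spread) and
# non-acoustic factor (uncertainty, larger spread).
hrtf_quality = rng.normal(1.0, 0.05, n)
uncertainty = rng.normal(2.0, 0.50, n)

def predicted_error(q, u):
    """Toy predictor (NOT the published model): localization error
    grows with uncertainty and shrinks with cue quality."""
    return u / q

# Cross all HRTFs with all uncertainty values, mirroring the
# study's permutation of factors across the listener group.
errors = np.array([[predicted_error(q, u) for u in uncertainty]
                   for q in hrtf_quality])

# Mean spread when only the HRTF varies vs. only uncertainty varies.
impact_hrtf = errors.std(axis=0).mean()
impact_uncertainty = errors.std(axis=1).mean()
```

With these assumed spreads, the uncertainty permutation dominates the prediction variance, which is the structure of the reported finding: the non-acoustic factor matters more than the acoustic one.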
Sound externalization, or the perception that a sound source is outside of the head, is an intriguing phenomenon that has long interested psychoacousticians. While previous reviews are available, the past few decades have produced a substantial amount of new data. In this review, we aim to synthesize those data and to summarize advances in our understanding of the phenomenon. We also discuss issues related to the definition and measurement of sound externalization and describe quantitative approaches that have been taken to predict the outcomes of externalization experiments. Last, sound externalization is of practical importance for many kinds of hearing technologies. Here, we touch on two examples, discussing the role of sound externalization in augmented/virtual reality systems and bringing attention to the somewhat overlooked issue of sound externalization in wearers of hearing aids.
The Auditory Modeling Toolbox (AMT) is a MATLAB/Octave toolbox for the development and application of computational auditory models with a particular focus on binaural hearing. The AMT aims for a consistent implementation of auditory models, well-structured in-code documentation, and inclusion of auditory data required to run the models. The motivation is to provide a toolbox able to reproduce the model predictions and allowing students and researchers to work with and to advance existing models. In the AMT, model implementations can be evaluated in two stages: by running so-called demonstrations, which are quick presentations of a model, and by starting so-called experiments aimed at reproducing results from the corresponding publications. Here, we describe the tools and mechanisms available within the framework of all AMT 1.x versions. The recently released AMT 1.1 includes over 60 models and is freely available as an open-source package from https://www.amtoolbox.org.
Listeners use monaural spectral cues to localize sound sources in sagittal planes (along the up-down and front-back directions). How sensorineural hearing loss affects the salience of monaural spectral cues is unclear. To simulate the effects of outer-hair-cell (OHC) dysfunction and the contribution of different auditory-nerve fiber types on localization performance, we incorporated a nonlinear model of the auditory periphery into a model of sagittal-plane sound localization for normal-hearing listeners. The localization model was first evaluated in its ability to predict the effects of spectral cue modifications for normal-hearing listeners. Then, we used it to simulate various degrees of OHC dysfunction applied to different types of auditory-nerve fibers. Predicted localization performance was hardly affected by mild OHC dysfunction but was strongly degraded in conditions involving severe and complete OHC dysfunction. These predictions resemble the usually observed degradation in localization performance induced by sensorineural hearing loss. Predicted localization performance was best when preserving fibers with medium spontaneous rates, which is particularly important in view of noise-induced hearing loss associated with degeneration of this fiber type. On average across listeners, predicted localization performance was strongly related to level discrimination sensitivity of auditory-nerve fibers, indicating an essential role of this coding property for localization accuracy in sagittal planes.
Vector-base amplitude panning (VBAP) aims at creating virtual sound sources at arbitrary directions within multichannel sound reproduction systems. However, VBAP does not consistently produce listener-specific monaural spectral cues that are essential for localization of sound sources in sagittal planes, including the front-back and up-down dimensions. In order to better understand the limitations of VBAP, a functional model approximating human processing of spectro-spatial information was applied to assess accuracy in sagittal-plane localization of virtual sources created by means of VBAP. First, we evaluated VBAP applied on two loudspeakers in the median plane, and then we investigated the directional dependence of the localization accuracy in several three-dimensional loudspeaker arrangements designed in layers of constant elevation. The model predicted a strong dependence on listeners' individual head-related transfer functions, on virtual source directions, and on loudspeaker arrangements. In general, the simulations showed a systematic degradation with increasing polar-angle span between neighboring loudspeakers. For the design of VBAP systems, predictions suggest that spans up to 40° polar angle yield a good trade-off between system complexity and localization accuracy. Special attention should be paid to the frontal region where listeners are most sensitive to deviating spectral cues.

INTRODUCTION

Vector-base amplitude panning (VBAP) is a method developed to create virtual sound sources at arbitrary directions by using a multichannel sound reproduction system [1]. VBAP determines loudspeaker gains by projecting the intended virtual source direction onto a basis formed by the directions of the most appropriate pair or triplet of neighboring loudspeakers. Within that pair or triplet, the loudspeaker signals are weighted in overall level.
A known problem of VBAP is localization error: virtual sources can be localized at directions deviating from the intended ones [2]. In this study we applied an auditory model in order to replicate [2] and investigate the limitations of VBAP with respect to sound localization. We use the interaural-polar coordinate system shown in Fig. 1 to distinguish different aspects of sound localization. In the lateral-angle dimension (left-right), VBAP introduces interaural differences in level (ILD) and time (ITD) and thus perceptually relevant localization cues [3]. In the polar-angle dimension, monaural spectral features at high ... Localization in the polar-angle dimension (i.e., in sagittal planes) is based on a monaural learning process in which spectral features that are characteristic of the listener's morphology are related to certain directions [5]. Due to the monaural processing, the use of spectral features can be disrupted by spectral irregularities superimposed by the source spectrum [6]. The use of spectral features is limited to high frequencies (above around 0.7 kHz) because the s...
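The gain computation described above (project the intended source direction onto the basis of the nearest loudspeaker pair or triplet) can be sketched for a two-loudspeaker pair in the horizontal plane. The constant-power normalization used here is one common choice; function names and the example geometry are illustrative, not taken from [1].

```python
import numpy as np

def vbap_gains(source, speakers):
    """Pairwise VBAP in the plane: express the unit source direction
    in the basis of two loudspeaker unit vectors, then normalize the
    gains to constant overall power."""
    basis = np.column_stack([s / np.linalg.norm(s) for s in speakers])
    g = np.linalg.solve(basis, source / np.linalg.norm(source))
    return g / np.linalg.norm(g)

def azi(deg):
    """Unit vector in the horizontal plane at the given azimuth."""
    a = np.deg2rad(deg)
    return np.array([np.cos(a), np.sin(a)])

# Source straight ahead, loudspeakers at +/-45 degrees azimuth:
g = vbap_gains(azi(0.0), [azi(45.0), azi(-45.0)])
# Symmetric setup -> equal gains on both loudspeakers.
```

For a 3D triplet the same solve applies with a 3x3 basis of loudspeaker directions; note that this amplitude weighting reproduces the intended interaural cues only approximately, which is exactly the spectral-cue limitation the modeled evaluation targets.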