Sound localization is the human ability to determine the direction of the sources of sounds that we hear. Emulating this capability in virtual environments enables societally relevant applications through more realistic virtual acoustics. We use artificial intelligence methods, in particular machine learning with an Artificial Neural Network (ANN) model, to emulate human sound localization abilities. This paper addresses the particular challenge that training and optimizing these models on audio signal datasets is very computationally intensive. It describes the successful porting of our novel ANN model code for sound localization from limiting serial CPU-based systems to powerful, cutting-edge High-Performance Computing (HPC) resources, yielding significant speed-ups of the training and optimization process. Selected details of the code refactoring and HPC porting are described, such as adapting hyperparameter optimization algorithms to use the available HPC resources efficiently and replacing third-party libraries responsible for audio signal analysis and linear algebra. This study demonstrates that using innovative HPC systems at the Jülich Supercomputing Centre, equipped with modern Graphics Processing Unit (GPU) resources and based on the Modular Supercomputing Architecture, enables significant speed-ups and reduces the time-to-solution for sound localization from three days to three hours per ANN model.
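The abstract above does not include source code; as an illustrative sketch only, the core of such an ANN can be pictured as a small fully connected network trained by gradient descent on audio-derived features (e.g. interaural cues) to predict a source direction. The layer sizes, the synthetic data, and the feature/target interpretation here are all assumptions for illustration; on the HPC systems described, the same dense-matrix operations would be dispatched to GPUs.

```python
import numpy as np

# Hypothetical stand-ins: rows of X play the role of audio-derived features,
# y plays the role of a source-direction target (e.g. azimuth).
rng = np.random.default_rng(0)
n_samples, n_features, n_hidden = 256, 16, 32
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w                                  # synthetic linear target

# One hidden layer with tanh activation, trained by batch gradient descent.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
lr = 1e-2

pred0 = (np.tanh(X @ W1) @ W2).ravel()
mse0 = float(np.mean((pred0 - y) ** 2))         # error before training

for _ in range(500):
    H = np.tanh(X @ W1)                         # forward pass
    err = (H @ W2).ravel() - y
    # Backpropagation: mean-squared-error gradients for both weight matrices.
    dW2 = H.T @ err[:, None] / n_samples
    dH = (err[:, None] @ W2.T) * (1 - H**2)     # tanh derivative
    dW1 = X.T @ dH / n_samples
    W2 -= lr * dW2
    W1 -= lr * dW1

pred = (np.tanh(X @ W1) @ W2).ravel()
mse = float(np.mean((pred - y) ** 2))           # error after training
```

Every matrix product in the loop is a batched dense operation, which is exactly the workload that benefits from the GPU-based linear-algebra libraries mentioned in the abstract.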
Generating a virtual, personal auditory space to obtain a high-quality sound experience over headphones is of great significance. This experience is normally improved using personalized head-related transfer functions (HRTFs), which depend to a large degree on personal anthropometric information about the pinnae. Most studies focus their personal auditory optimization analysis on amplitude versus frequency in HRTFs, mainly searching for significant elevation cues in frequency maps. Knowing the HRTFs of each individual is therefore of considerable help in improving sound quality. This work proposes a methodology to model HRTFs according to the individual structure of the pinnae using multilayer perceptron and linear regression techniques. Several models are generated that predict the HRTF amplitude at each frequency from the personal anthropometric data on the pinnae, the azimuth angle, and the elevation of the sound source. Experiments show that the prediction of new personal HRTFs yields low errors, so the model can be applied with high confidence to new heads with different pinna characteristics, improving on the results obtained with the standard KEMAR pinna, which is usually used when individual information is lacking.
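The linear-regression branch of the methodology above can be sketched as an ordinary least-squares fit of one frequency bin's HRTF magnitude against anthropometric and angular predictors. The feature names, ranges, and synthetic coefficients below are assumptions for illustration; real inputs would come from a measured HRTF dataset.

```python
import numpy as np

# Hypothetical predictors: pinna height, pinna width, concha depth (cm),
# source azimuth and elevation (deg). Ranges are illustrative only.
rng = np.random.default_rng(1)
n_subjects = 200
features = rng.uniform([5.0, 2.5, 0.5, -90.0, -45.0],
                       [7.5, 4.0, 1.5,  90.0,  45.0],
                       size=(n_subjects, 5))

# Synthetic "ground truth": a linear map from predictors to the HRTF
# magnitude (in dB) at one frequency bin, plus measurement noise.
coeffs = np.array([1.2, -0.8, 2.0, 0.01, 0.02])
magnitude_db = features @ coeffs + rng.normal(scale=0.5, size=n_subjects)

# Ordinary least squares with an intercept column.
A = np.hstack([features, np.ones((n_subjects, 1))])
beta, *_ = np.linalg.lstsq(A, magnitude_db, rcond=None)

pred = A @ beta
rmse = float(np.sqrt(np.mean((pred - magnitude_db) ** 2)))
```

One such model per frequency bin, as the abstract describes, yields a full predicted magnitude spectrum for a new listener from pinna measurements alone.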
The study of head-related transfer functions (HRTFs) is hampered by an inability to isolate the effects of individual anthropometric properties in an experimental setting. The authors propose to address this issue by studying the HRTFs of synthetic pinnæ which are designed to specification. The design, manufacture, and measurement of these synthetic pinnæ are detailed: A mold for each pinna is printed with a fused-deposition 3D printer and then cast in silicone with appropriate acoustic properties. These castings are then installed on a KEMAR mannequin whose HRTF is measured in an anechoic chamber. Additionally, comparative results are presented between commercially produced pinnæ inserts and copies produced via the proposed method.