The extraction of multiple Direction-of-Arrival (DoA) estimates from estimated spatial spectra can be challenging when such spectra are noisy or the sources are adjacent. Smoothing or clustering techniques are typically used to remove the effect of noise or irregular peaks in the spatial spectra. As we will explain and show in this paper, the smoothing-based techniques require prior knowledge of the minimum angular separation of the sources, and the clustering-based techniques fail on noisy spatial spectra. A broad class of localization techniques gives direction estimates in each Time-Frequency (TF) bin. Using this information as input, a novel technique for obtaining robust localization of multiple simultaneous sources is proposed using Estimation Consistency (EC) in the TF domain. The method is evaluated in the context of spherical microphone arrays. This technique does not require prior knowledge of the sources and, by removing the noise in the estimated spatial spectrum, makes clustering a reliable and robust technique for multiple DoA extraction from estimated spatial spectra. The results indicate that the proposed technique has the strongest robustness to separation, with up to 10° median error for 5° to 180° separation for 2 and 3 sources, compared to the baseline and the state-of-the-art techniques.
Pseudointensity vectors (PIVs) provide a means of Direction of Arrival (DOA) estimation for Spherical Microphone Arrays (SMAs) using only the zeroth- and first-order spherical harmonics. An Augmented Intensity Vector (AIV) is proposed which improves the accuracy of PIVs by exploiting higher-order spherical harmonics. We compared DOA estimation using our proposed AIVs against PIVs, Steered Response Power (SRP) and subspace methods where the number of sources, their angular separation, the reverberation time of the room and the sensor noise level are varied. The results show that the proposed approach outperforms the baseline methods and performs at least as accurately as the state-of-the-art method with strong robustness to reverberation, sensor noise and number of sources. In the single and multiple source scenarios tested, which include realistic levels of reverberation and noise, the proposed method had an average error of 1.5° and 2°, respectively.
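As a rough illustration of the pseudointensity idea mentioned above, the following minimal sketch estimates a single DOA from a zeroth-order (omnidirectional) signal and three first-order dipole signals in the time-frequency domain. It is not the authors' implementation. The function name `pseudointensity_doa`, the input layout, and the sign convention (each dipole responds with the cosine of the angle between its axis and the arrival direction, so the averaged vector points toward the source) are assumptions made for this example; other formulations steer the dipoles the opposite way and negate the result.

```python
import numpy as np


def pseudointensity_doa(p, dx, dy, dz):
    """Estimate one DOA unit vector from pseudointensity vectors.

    p          : complex array, omnidirectional (zeroth-order) signal per TF bin
    dx, dy, dz : complex arrays of the same shape, dipole (first-order) signals
                 aligned with the x, y and z axes (assumed convention, see text)
    """
    p = np.asarray(p)
    dipoles = np.stack([np.asarray(dx), np.asarray(dy), np.asarray(dz)], axis=-1)

    # Pseudointensity vector per TF bin: 0.5 * Re{ conj(p) * [dx, dy, dz] }
    piv = 0.5 * np.real(np.conj(p)[..., None] * dipoles)

    # Sum over all TF bins, then normalise to a unit direction vector.
    mean_vec = piv.reshape(-1, 3).sum(axis=0)
    norm = np.linalg.norm(mean_vec)
    if norm == 0.0:
        raise ValueError("pseudointensity vectors sum to zero; DOA is undefined")
    return mean_vec / norm


# Hypothetical usage: a synthetic, noise-free plane wave from direction u.
rng = np.random.default_rng(0)
s = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))
u = np.array([0.6, 0.0, 0.8])
doa = pseudointensity_doa(s, s * u[0], s * u[1], s * u[2])
```

Summing over bins before normalising weights each bin by its energy, which is one simple design choice; a histogram of per-bin directions would be an alternative when several sources are active.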
Multiple source localization is an important task in acoustic signal processing with applications including dereverberation, source separation, source tracking and environment mapping. When using spherical microphone arrays, it has been previously shown that Pseudo-intensity Vectors (PIV) and Augmented Intensity Vectors (AIV) are effective approaches for direction of arrival estimation of a sound source. In this paper, we evaluate AIV-based localization in acoustic scenarios involving multiple sound sources. Simulations are conducted where the number of sources, their angular separation and the reverberation time of the room are varied. The results indicate that AIV outperforms PIV and Steered Response Power (SRP) with an average accuracy between 5 and 10 degrees for sources with angular separation of 30 degrees or more. AIV also shows better robustness to reverberation time than PIV and SRP.
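Accuracy figures in degrees, such as those quoted above, are commonly computed as the angle between the estimated and the true direction vectors. The sketch below shows that calculation under the assumption that directions are given as 3-D Cartesian vectors; the abstract does not state the exact error metric used, and the helper name `angular_error_deg` is hypothetical.

```python
import numpy as np


def angular_error_deg(estimated, reference):
    """Angle in degrees between two 3-D direction vectors (need not be unit length)."""
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    cos_angle = np.dot(est, ref) / (np.linalg.norm(est) * np.linalg.norm(ref))
    # Clip guards against values marginally outside [-1, 1] due to rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


# Hypothetical usage: error between an estimate and a ground-truth direction.
error = angular_error_deg([1.0, 0.1, 0.0], [1.0, 0.0, 0.0])
```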
Spectral masking occurs when the threshold of audibility for one sound is raised by the simultaneous presence of another sound. In multitrack music production, this reduces the listener's ability to fully hear and distinguish the sound sources in the mix. We design a simplified measure of masking based on best practices in sound engineering. We implement both offline and real-time, low-latency autonomous multitrack equalization systems to reduce masking in multitrack audio. We perform objective measurement of the spectral masking in the resultant mixes and conduct a listening test for subjective comparison between the mix results of different implementations of our system, a raw mix, and manual mixes made by an amateur and a professional mix engineer. The results show that autonomous systems reduce both the perceived masking and objective spectral masking and improve the overall quality of the mix. We show that our offline semi-autonomous system is capable of improving the raw mix better than an amateur and close to a professional mix by simply controlling one user parameter. Our results also suggest that existing objective measures of masking are ill-suited for quantifying perceived masking in multitrack musical audio.