The paper discusses the application of convolutional neural networks (CNNs) to minimum variance distortionless response (MVDR) localization schemes. We investigate the direction of arrival (DOA) estimation problem in noisy and reverberant conditions using a uniform linear array (ULA). CNNs are used to process the multichannel data from the ULA and to improve the data fusion scheme performed in the steered response power (SRP) computation. CNNs improve the incoherent frequency fusion of the narrowband response power by weighting the components, reducing the deleterious effects of components affected by artifacts due to noise and reverberation. Using CNNs avoids the need to first encode the multichannel data into selected acoustic cues, and exploits their ability to recognize geometrical pattern similarity. Experiments with both simulated and real acoustic data demonstrate the superior localization performance of the proposed SRP beamformer with respect to other state-of-the-art techniques.
The steered response power (SRP) algorithms have been shown to be among the most effective and robust in noisy environments for direction of arrival (DOA) estimation. In broadband signal applications, SRP methods typically perform their computations in the frequency domain by applying a fast Fourier transform (FFT) to a signal portion, calculating the response power on each frequency bin, and subsequently fusing these estimates to obtain the final result. We introduce an incoherent frequency fusion method based on a normalized arithmetic mean (NAM). Experiments are presented that rely on the SRP algorithms for the localization of motor vehicles in a noisy outdoor environment, focusing our discussion on performance differences with respect to different signal-to-noise ratios (SNR), and on spatial resolution issues for closely spaced sources. We demonstrate that the proposed fusion method provides higher resolution for the delay-and-sum SRP, and improved performance for minimum variance distortionless response (MVDR) and multiple signal classification (MUSIC). Index Terms: Broadband steered response power, incoherent frequency fusion, normalized arithmetic mean, direction of arrival estimation, microphone array.
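The fusion step described above can be illustrated with a small sketch. The abstract does not state the normalization, so this example assumes that each frequency bin's response power is divided by its maximum over the candidate directions before the bins are averaged. The function names `nam_fusion` and `plain_mean` and the toy data are hypothetical.

```python
def nam_fusion(narrowband_power):
    """Fuse per-bin response power with a normalized arithmetic mean.

    narrowband_power: list of rows, one row per frequency bin; each row
    holds the response power for every candidate direction.
    Assumption: each bin is normalized by its own maximum over directions.
    """
    n_dirs = len(narrowband_power[0])
    fused = [0.0] * n_dirs
    used = 0
    for row in narrowband_power:
        peak = max(row)
        if peak <= 0.0:
            continue  # skip empty bins rather than divide by zero
        used += 1
        for i, p in enumerate(row):
            fused[i] += p / peak
    if used == 0:
        return fused
    return [v / used for v in fused]


def plain_mean(narrowband_power):
    """Unnormalized arithmetic mean over bins, for comparison."""
    n_bins = len(narrowband_power)
    return [sum(col) / n_bins for col in zip(*narrowband_power)]


# Toy data: 2 frequency bins, 3 candidate directions.
power = [
    [100.0, 20.0, 10.0],  # high-energy bin, peak at direction 0
    [0.1, 0.2, 1.0],      # low-energy bin, peak at direction 2
]
fused = nam_fusion(power)
unnormalized = plain_mean(power)
```

In this toy case the unnormalized mean is dominated by the high-energy bin, while the normalized mean gives both bins the same weight, so the peak of the low-energy bin is not lost.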
The steered response power phase transform (SRP-PHAT) is a beamforming method that is very attractive in acoustic localization applications due to its robustness in reverberant environments. This paper presents a spatial grid design procedure, called the geometrically sampled grid (GSG), which computes the spatial grid by taking into account the discrete sampling of time difference of arrival (TDOA) functions and the desired spatial resolution. An SRP-PHAT localization algorithm based on the GSG method is also introduced. The proposed method exploits the intersections of the discrete hyperboloids representing the TDOA information domain of the sensor array, and projects the whole TDOA information onto the spatial search grid. The GSG method thus allows one to design the sampled spatial grid that represents the best search grid for a given sensor array. It also allows one to perform a sensitivity analysis of the array and to characterize its spatial localization accuracy, and it may assist the system designer in reconfiguring the array. Experimental results using both simulated data and real recordings show that the localization accuracy is substantially improved for both high and low spatial resolution, and that it is closely related to the proposed power response sensitivity measure.
In acoustic array processing, beamforming is a class of algorithms commonly used to estimate the position of a radiating sound source. This paper presents a diagonal unloading (DU) transformation method for conventional response power beamforming that achieves robust localization with low computational complexity. The transformation is obtained by subtracting a suitable diagonal matrix from the covariance matrix of the array output vector. Specifically, the DU beamformer aims at subtracting the signal subspace from the noisy signal space. It is hence a data-dependent covariance matrix conditioning method. We show how to calculate the unloading parameters precisely, and we present a comparison of the proposed DU beamformer, the robust minimum variance distortionless response (MVDR) filter and the multiple signal classification (MUSIC) method in terms of their respective eigenanalyses. Theoretical analysis and experiments conducted on both simulated and real acoustic data demonstrate that the localization performance of the DU beamformer is comparable to that of robust MVDR and MUSIC. Since its computational cost is equivalent to that of a conventional beamformer, the proposed DU beamformer is attractive for both its effectiveness and its computational efficiency.
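The idea of subtracting a diagonal matrix from the covariance matrix can be sketched for a narrowband uniform linear array. The abstract does not give the unloading parameter, so this example assumes mu = tr(R), which is at least as large as the largest eigenvalue of R, and takes the reciprocal of a^H (mu I - R) a so that the output is positive. The function names, the half-wavelength spacing and the simulated data are all hypothetical; this is an illustration, not the paper's implementation.

```python
import cmath
import math
import random


def steering_vector(theta_deg, n_mics, spacing_in_wavelengths=0.5):
    """Far-field steering vector of a uniform linear array."""
    phase = 2.0 * math.pi * spacing_in_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_mics)]


def du_response(snapshots, theta_deg, eps=1e-12):
    """Pseudo-spectrum 1 / (a^H (mu I - R) a), assuming mu = tr(R)."""
    n_snap = len(snapshots)
    n_mics = len(snapshots[0])
    a = steering_vector(theta_deg, n_mics)
    # a^H R a equals the mean of |a^H x|^2: the conventional response power.
    conv = sum(
        abs(sum(ai.conjugate() * xi for ai, xi in zip(a, x))) ** 2
        for x in snapshots
    ) / n_snap
    # tr(R) equals the mean snapshot energy summed over sensors.
    trace_r = sum(abs(xi) ** 2 for x in snapshots for xi in x) / n_snap
    denom = n_mics * trace_r - conv  # a^H a = n_mics for this steering vector
    return 1.0 / max(denom, eps)


def simulate(theta_deg, n_mics=8, n_snap=200, noise_std=0.1, seed=0):
    """One narrowband source plus white sensor noise (toy data)."""
    rng = random.Random(seed)
    a0 = steering_vector(theta_deg, n_mics)
    snapshots = []
    for _ in range(n_snap):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        snapshots.append([
            s * a + noise_std * complex(rng.gauss(0, 1), rng.gauss(0, 1))
            for a in a0
        ])
    return snapshots


snapshots = simulate(20.0)
grid = list(range(-90, 91))  # candidate directions in degrees
scan = [du_response(snapshots, theta) for theta in grid]
estimate = grid[scan.index(max(scan))]
```

Under these assumptions the scan only needs the conventional response power and the trace of the covariance matrix, with no matrix inversion or eigendecomposition, which is consistent with the abstract's claim of a cost equivalent to a conventional beamformer.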