2D direction of arrival estimation of multiple moving sources using a spherical microphone array

Moore, Alastair H.; Evers, Christine; Naylor, Patrick A.

doi:10.1109/eusipco.2016.7760442

Cited by 5 publications

(8 citation statements)

References 20 publications

(41 reference statements)


“…Over the years, several approaches have been developed for the task of broadband DOA estimation. Some popular approaches are: i) subspace based approaches such as multiple signal classification (MUSIC) [1], [2], ii) time difference of arrival (TDOA) based approaches that use the family of generalized cross correlation (GCC) methods [3], [4], iii) generalizations of the cross-correlation methods such as steered response power with phase transform (SRP-PHAT) [5], and multichannel cross correlation coefficient (MCCC) [6], iv) adaptive multichannel time delay estimation using blind system identification based methods [7], v) probabilistic model based methods such as maximum likelihood method [8] and vi) methods based on histogram analysis of narrowband DOA estimates [9], [10]. These methods are generally formulated under the assumption of free-field propagation of sound waves, however in indoor acoustic environments this assumption is violated due to the presence of reverberation leading to severe degradation in their performance.…”

Section: Introduction (mentioning)

confidence: 99%
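As an illustration of the GCC family named in the statement above, the following is a minimal sketch of GCC-PHAT delay estimation between two microphone signals. The function name `gcc_phat_delay` and its parameters are hypothetical and are not taken from the cited papers. The sketch assumes free-field propagation, which is the assumption the statement says breaks down under reverberation.

```python
import numpy as np

def gcc_phat_delay(x, y, fs, max_delay=None):
    """Estimate the delay of y relative to x, in seconds, with GCC-PHAT.

    x, y      : 1-D real signals from two microphones (illustrative inputs)
    fs        : sampling rate in Hz
    max_delay : optional bound on the absolute delay in seconds
    """
    n = len(x) + len(y)                  # zero-pad so the correlation is linear
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    if max_delay is not None:
        max_shift = max(1, min(int(round(max_delay * fs)), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

A positive return value means that `y` lags `x`. Converting the delay to an angle needs the array geometry and the speed of sound, which this sketch leaves out.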

Multi-Speaker DOA Estimation Using Deep Convolutional Networks Trained With Noise Signals

Chakrabarty

Habets

2019

IEEE J. Sel. Top. Signal Process.


Supervised learning based methods for source localization, being data driven, can be adapted to different acoustic conditions via training and have been shown to be robust to adverse acoustic environments. In this paper, a convolutional neural network (CNN) based supervised learning method for estimating the direction-of-arrival (DOA) of multiple speakers is proposed. Multi-speaker DOA estimation is formulated as a multi-class multi-label classification problem, where the assignment of each DOA label to the input feature is treated as a separate binary classification problem. The phase component of the short-time Fourier transform (STFT) coefficients of the received microphone signals is directly fed into the CNN, and the features for DOA estimation are learnt during training. Utilizing the assumption of disjoint speaker activity in the STFT domain, a novel method is proposed to train the CNN with synthesized noise signals. Through experimental evaluation with both simulated and measured acoustic impulse responses, the ability of the proposed DOA estimation approach to adapt to unseen acoustic conditions and its robustness to unseen noise types is demonstrated. Through additional empirical investigation, it is also shown that with an array of M microphones our proposed framework yields the best localization performance with M-1 convolution layers. The ability of the proposed method to accurately localize speakers in a dynamic acoustic scenario with varying number of sources is also shown.
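The multi-label formulation in this abstract can be sketched as follows: the input is the phase of the STFT coefficients, and the target is a multi-hot vector with one binary label per candidate DOA. The helper names, the 5° grid with 37 classes, and the 0.5 decision threshold are assumptions chosen for illustration. The CNN itself is omitted.

```python
import numpy as np

def phase_map(stft_frame):
    """Phase of one STFT frame.

    stft_frame : complex array, shape (num_mics, num_bins)
    returns    : real array of phases in [-pi, pi], same shape
    """
    return np.angle(stft_frame)

def multi_hot_target(doas_deg, resolution_deg=5.0, num_classes=37):
    """One binary label per DOA class; several can be active at once.

    The grid (0, 5, ..., 180 degrees) is an assumed discretization.
    """
    target = np.zeros(num_classes, dtype=np.float32)
    indices = np.round(np.asarray(doas_deg, dtype=float) / resolution_deg).astype(int)
    target[indices] = 1.0
    return target

def active_classes(probabilities, threshold=0.5):
    """Treat each class as its own binary decision (assumed threshold)."""
    return np.flatnonzero(np.asarray(probabilities) >= threshold)
```

Because each label is an independent binary decision, the same output layer can in principle report one, two, or more simultaneous speakers without changing the network.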


“…We investigate the criteria under which smoothed histograms of PIVs and SSPIVs give accurate estimates of the DOAs of multiple sources in a noisy reverberant environment, including when sources are moving. Some of the first steps of an earlier version of the SSPIV method were presented in [13] and [29]. The current paper extends both the theoretical analysis and the evaluation of the PIV method compared to [8], especially in the context of multiple and moving speakers and in real-world applications.…”

Section: Introduction (mentioning)

confidence: 68%

“…The relative gain, 0 ≤ g ≤ 1, and phase, −π < γ ≤ π, of the second plane wave with respect to the first give α₂ = gα₁ and β₂ = β₁ + γ. Therefore, from (29),…”

Section: Coherent Sources (mentioning)

confidence: 99%

“…We assume that β₁ and β₂ are independent with identical uniform distribution U(0, 2π) such that ∆β = β₁ − β₂ is a triangular distribution over the interval ∆β ∈ [−2π, 2π] which, due to periodicity of the phase, reduces to ∆β ∈ [−π, π] with probability p(∆β) = 1/(2π). The expected value of Ĩ is obtained by integrating (29) with respect to ∆β, …”

Section: B Uncorrelated Sources (mentioning)

confidence: 99%
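The claim in the statement above, that the difference of two independent uniform phases becomes uniform on [−π, π] with density 1/(2π) once wrapped, can be checked numerically. The following is a small Monte Carlo sketch. The function name, the sample count, and the number of histogram bins are arbitrary choices made for illustration.

```python
import numpy as np

def wrapped_phase_difference(num_samples, seed=0):
    """Draw beta1, beta2 ~ U(0, 2*pi) and return beta1 - beta2 wrapped to [-pi, pi)."""
    rng = np.random.default_rng(seed)
    beta1 = rng.uniform(0.0, 2.0 * np.pi, num_samples)
    beta2 = rng.uniform(0.0, 2.0 * np.pi, num_samples)
    delta = beta1 - beta2                    # triangular on [-2*pi, 2*pi]
    return (delta + np.pi) % (2.0 * np.pi) - np.pi

wrapped = wrapped_phase_difference(200_000)
density, _ = np.histogram(wrapped, bins=20, range=(-np.pi, np.pi), density=True)

# Largest gap between the empirical density and the uniform value 1/(2*pi).
max_deviation = np.max(np.abs(density - 1.0 / (2.0 * np.pi)))
```

A small `max_deviation` is consistent with the uniform density used when integrating over ∆β.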


Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors

Moore

Evers

Naylor

2017

IEEE/ACM Trans. Audio Speech Lang. Process.

Self Cite


Direction of Arrival (DOA) estimation is a fundamental problem in acoustic signal processing. It is used in a diverse range of applications, including spatial filtering, speech dereverberation, source separation and diarization. Intensity vector-based DOA estimation is attractive, especially for spherical sensor arrays, because it is computationally efficient. Two such methods are presented which operate on a spherical harmonic decomposition of a sound field observed using a spherical microphone array. The first uses Pseudo-Intensity Vectors (PIVs) and works well in acoustic environments where only one sound source is active at any time. The second uses Subspace Pseudo-Intensity Vectors (SSPIVs) and is targeted at environments where multiple simultaneous sources and significant levels of reverberation make the problem more challenging. Analytical models are used to quantify the effects of an interfering source, diffuse noise and sensor noise on PIVs and SSPIVs. The accuracy of DOA estimation using PIVs and SSPIVs is compared against the state-of-the-art in simulations including realistic reverberation and noise for single and multiple, stationary and moving sources. Finally, robust performance of the proposed methods is demonstrated using speech recordings in real acoustic environments.


“…In DPD [7] using SMAs, the covariance matrix is approximated as the average covariance matrix over a local TF region [7, 12] R(τ, k)…”

Section: Msec (mentioning)

confidence: 99%
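The local averaging described in the statement above can be sketched as follows. The region shape (a rectangle starting at the given frame and bin), its default size, and the function names are assumptions. The exact definition of R(τ, k) is truncated in the quotation and is not reproduced here. The singular-value ratio is included only as the usual way a direct-path dominance test inspects such a matrix.

```python
import numpy as np

def local_covariance(stft, frame, freq_bin, num_frames=4, num_bins=3):
    """Average covariance over a local time-frequency region.

    stft : complex array, shape (channels, frames, bins); the channels may be
           microphone or spherical-harmonic-domain signals.
    The region must lie inside the array so that it is not empty.
    """
    block = stft[:, frame:frame + num_frames, freq_bin:freq_bin + num_bins]
    vectors = block.reshape(block.shape[0], -1)      # (channels, region size)
    return vectors @ vectors.conj().T / vectors.shape[1]

def singular_value_ratio(covariance, eps=1e-12):
    """Ratio of the two largest singular values; large when one source dominates."""
    s = np.linalg.svd(covariance, compute_uv=False)  # sorted in descending order
    return s[0] / max(s[1], eps)
```

For a region containing a single plane wave and no noise, the averaged matrix is rank one and the ratio is very large. Additional sources or reverberation raise the second singular value and lower the ratio.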

Spatial consistency for multiple source direction-of-arrival estimation and source counting

Hafezi

Moore

Naylor

2019

The Journal of the Acoustical Society of America

Self Cite


A conventional approach to wideband Multi-Source (MS) Direction-of-Arrival (DOA) estimation is to perform Single Source (SS) DOA estimation in Time-Frequency (TF) bins for which a SS assumption is valid. The typical SS-validity confidence metrics analyse the validity of the SS assumption over a fixed-size TF region local to the TF bin. The performance of such methods degrades as the number of simultaneously active sources increases due to the associated decrease in the size of the TF regions where the SS assumption is valid. A SS-validity confidence metric is proposed that exploits a dynamic MS assumption over relatively larger TF regions. The proposed metric first clusters the initial DOA estimates (one per TF bin) and then uses the members' spatial consistency as well as its cluster's spread to weight each TF bin. Distance-based and density-based clustering are employed as two alternative approaches for clustering DOAs. A noise-robust density-based clustering is also used in an evolutionary framework to propose a method for source counting and source direction estimation. The evaluation results based on simulations and also with real recordings show that the proposed weighting strategy significantly improves the accuracy of source counting and MS DOA estimation compared to the state-of-the-art.

