2013
DOI: 10.1016/j.csl.2012.09.003

Auditory inspired methods for localization of multiple concurrent speakers

Cited by 5 publications
(5 citation statements)
References 43 publications
“…The second row in contrast to the first one applies position estimation, beamforming and enhancement. In [10] we study different possibilities for these blocks in a similar database and we draw the conclusion that the best performance is obtained when we use the PoPi position [5], convex-optimized beamforming [11,12] and vector Taylor series enhancement (VTS) [13]. Here, the VTS uses 128 Gaussians trained on a clean version of the Dev1 (the organizers provided us the impulse responses).…”
Section: Analysis Of the Full System 41 Experimental Results (mentioning)
confidence: 99%
“…This can be solved in different ways such as using the WLAN signal emitted by a device [7] or with video cameras [14]. Some literature has tried to estimate the speaker position inside of a room using a microphone array [5] or a microphone network [1,6]. The innovation of this paper is to localize the room using a microphone network.…”
Section: Introduction (mentioning)
confidence: 99%
“…In another method [24], speech signal has been divided into several bands and the SRP values have been calculated on these bands; then, the maximum value of each band has been considered in the localisation process. Also, Habib and Romsdorfer [25] proposes a ‘position‐pitch’‐based algorithm for the localisation and tracking of concurrent speakers. This algorithm uses a multi‐band gammatone filterbank and a frequency‐selective criterion that groups frequency channels belonging to the same speaker.…”
Section: Subband Processing‐based Speaker Localisation (mentioning)
confidence: 99%
“…However, its performance decreases in noisy and reverberant conditions as well as in multi-speaker scenarios. Different extensions have been proposed in [23][24][25][26][27] to increase the robustness of the algorithm in various aspects. The above stated subgrouping of the spectra (cf.…”
Section: Methods To Increase the Robustness (mentioning)
confidence: 99%
“…In [22], a joint position and pitch (PoPi) estimation method has been proposed which is based on either cross-correlations or crosspower spectral densities (CPSDs). Several extensions have been proposed using cepstral weighting [23], gammatonelike weighting [24], time-domain GCC-PHAT replacement [25], particle filtering [26], and speaker-dependent subgrouping [27]. In [28], a different method based on a recurrent timing neural network is used for joint DOA and pitch estimation.…”
Section: Introduction (mentioning)
confidence: 99%