2008 Hands-Free Speech Communication and Microphone Arrays 2008
DOI: 10.1109/hscma.2008.4538694

Joint Position-Pitch Estimation for Multiple Speaker Scenarios

Cited by 15 publications
(19 citation statements)
References 7 publications
“…Applying such a bank of FIR filters on a block of the observed signal, we get (16) where denotes the Hadamard product, and are the temporal and spatial filter lengths, respectively, is the th coefficient of the th filter in the filterbank, , , is a column vector of ones, and…”
Section: A Optimal Filterbanks (mentioning)
confidence: 99%
“…It should also be noted that the DOA along with the pitch also are believed to be some of the governing factors that the human auditory system uses for separating sources. This line of reasoning has, quite recently, led to some joint DOA and fundamental frequency estimators, including maximum likelihood based [10], [11], subspace-based [12]-[14], correlation-based [15], [16], and filtering-based [17]-[19] methods. Notably, the problem of joint DOA and fundamental frequency estimation was formalized and thoroughly analyzed in [10], and a maximum likelihood estimator that achieves the highest possible accuracy (under certain conditions) was proposed.…”
Section: Introduction (mentioning)
confidence: 99%
“…We have taken the CPSD-based method proposed in [22] combined with cepstral weighting [23], gammatone-like weighting [24], and a subsequent particle filtering [26] as the core algorithm, and we propose several extensions to improve both accuracy and robustness in this paper. As a first extension, a frequency-domain comb filter is introduced to improve the performance for simultaneously active speakers.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [22], a joint position and pitch (PoPi) estimation method has been proposed which is based on either cross-correlations or cross-power spectral densities (CPSDs). Several extensions have been proposed using cepstral weighting [23], gammatone-like weighting [24], time-domain GCC-PHAT replacement [25], particle filtering [26], and speaker-dependent subgrouping [27]. In [28], a different method based on a recurrent timing neural network is used for joint DOA and pitch estimation.…”
Section: Introduction (mentioning)
confidence: 99%