2014
DOI: 10.1007/978-3-319-11397-5_7

Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space

Cited by 8 publications
(8 citation statements)
References 7 publications
“…In our previous work [24][25][26], we proposed an additive noise model in the i-vector space represented by the equation:…”
Section: The PLDA Model For I-vector Scoring (mentioning)
confidence: 99%
“…In this paper, we explore two axes. On one side, we aim at improving the system performance by using two different techniques: 1 -The I-MAP algorithm [24][25][26] which is an i-vector denoising procedure based on an additive noise model in the i-vector space. It uses a Gaussian modeling of both clean i-vectors and the noise distributions in the i-vector space and have been proven to yield up to 60% of relative EER improvement compared to a baseline system performance.…”
Section: Introduction (mentioning)
confidence: 99%
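
The statement above describes I-MAP only in outline: an additive noise model in the i-vector space, with Gaussian models for both the clean i-vectors and the noise. As a rough illustration of what a MAP estimate looks like under those two assumptions, here is a minimal sketch. It is not the authors' implementation. It assumes diagonal covariances (a simplification chosen here so the example needs no linear-algebra library), and the function name `map_denoise_diagonal` and all parameter names are hypothetical.

```python
from typing import List, Sequence


def map_denoise_diagonal(
    noisy: Sequence[float],
    clean_mean: Sequence[float],
    clean_var: Sequence[float],
    noise_mean: Sequence[float],
    noise_var: Sequence[float],
) -> List[float]:
    """MAP estimate of a clean vector x from y = x + n.

    Assumptions (illustrative, not taken from the cited papers):
      x ~ N(clean_mean, diag(clean_var)), n ~ N(noise_mean, diag(noise_var)),
      x and n independent, all variances strictly positive.
    With Gaussian prior and Gaussian noise the posterior is Gaussian, so the
    MAP estimate equals the posterior mean, computed per dimension below.
    """
    inputs = (noisy, clean_mean, clean_var, noise_mean, noise_var)
    if len({len(v) for v in inputs}) != 1:
        raise ValueError("all inputs must have the same length")

    estimate = []
    for y, mx, vx, mn, vn in zip(*inputs):
        if vx <= 0 or vn <= 0:
            raise ValueError("variances must be strictly positive")
        # Weight on the observation: close to 1 when noise variance is small,
        # close to 0 when noise dominates (the estimate falls back to the prior mean).
        gain = vx / (vx + vn)
        estimate.append(mx + gain * (y - mn - mx))
    return estimate


if __name__ == "__main__":
    # Toy two-dimensional example with made-up statistics.
    print(map_denoise_diagonal(
        noisy=[2.0, -1.0],
        clean_mean=[0.0, 0.0],
        clean_var=[1.0, 3.0],
        noise_mean=[1.0, 0.0],
        noise_var=[1.0, 1.0],
    ))  # prints [0.5, -0.75]
```

With full covariance matrices the same posterior mean would be written with matrix inverses in place of the per-dimension divisions; how the noise statistics are estimated in practice is not covered by the statements on this page.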
“…This paper is an extension of our work in [22] where we proposed an i-vector "denoising" technique, we called i-MAP, in order to deal with additive noise.…”
Section: Introduction (mentioning)
confidence: 99%
“…The noise and reverberation levels are frequently selected independently from each other without a specific application in mind, e.g., [20,21], therefore some of them might never happen in real life. Furthermore, they are often selected within a discrete set of values, e.g., [11,14,22,23,24] or a narrow range of values, e.g., [25], which does not match the actual distribution of levels observed in real life and artificially advantages learning-based methods which may overfit those levels. Even when the distortion levels are realistic, there may still exist some acoustic mismatch, due to recording speech in a different place than noise and reverberation, e.g., [26,27].…”
Section: Introduction (mentioning)
confidence: 99%