The Speaker and Language Recognition Workshop (Odyssey 2018) 2018
DOI: 10.21437/odyssey.2018-50

Supervector Compression Strategies to Speed up I-Vector System Development

Abstract: The front-end factor analysis (FEFA), an extension of probabilistic principal component analysis (PPCA) tailored to be used with Gaussian mixture models (GMMs), is currently the prevalent approach to extract compact utterance-level features (i-vectors) for automatic speaker verification (ASV) systems. Little research has been conducted comparing FEFA to the conventional PPCA applied to maximum a posteriori (MAP) adapted GMM supervectors. We study several alternative methods, including PPCA, factor analysis (FA), and two sup…

Cited by 4 publications (3 citation statements)
References 19 publications
“…The UBM is a 1024-component Gaussian mixture model (GMM) [10], which is used to compute sufficient statistics for i-vector extraction. We compute 800-dimensional i-vectors by compressing mean supervectors of maximum a posteriori (MAP) adapted GMMs using probabilistic principal component analysis (PPCA) as described in [5]. This is a (speed-wise) high-performing alternative to the standard i-vector extraction that is traditionally done via front-end factor analysis [11,12].…”
Section: Speaker Identification System Description (mentioning)
confidence: 99%
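
The statement above starts from mean supervectors of MAP-adapted GMMs. As a rough illustration of that first step, the sketch below applies the usual relevance-MAP update to the UBM means and stacks the adapted means into one vector. The function name, the diagonal-covariance UBM, and the relevance factor of 16 are assumptions made for illustration; none of them is stated in the quoted text.

```python
import numpy as np

def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adapt UBM means to one utterance and stack them.

    frames: (T, D) features; weights: (C,); means, variances: (C, D).
    The relevance factor is an assumed value, not taken from the source.
    """
    # Log-density of every frame under every diagonal-covariance component.
    diff = frames[:, None, :] - means[None, :, :]                  # (T, C, D)
    log_gauss = -0.5 * (np.sum(diff**2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                         # (T, C)
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    post = np.exp(log_post)                                        # responsibilities

    n = post.sum(axis=0)                                           # zeroth-order stats (C,)
    f = post.T @ frames                                            # first-order stats (C, D)
    alpha = n / (n + relevance)                                    # per-component adaptation weight
    ml_means = f / np.maximum(n, 1e-10)[:, None]                   # data-driven mean estimates
    adapted = alpha[:, None] * ml_means + (1.0 - alpha[:, None]) * means
    return adapted.reshape(-1)                                     # supervector of length C * D
```

Components that see little data keep means close to the UBM, because their adaptation weight `alpha` stays near zero. With C components and D-dimensional features the supervector has C × D entries.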
“…The i-vector extraction using PPCA is simply a matter of compressing a 61440-dimensional GMM supervector to an 800-dimensional space using a precomputed projection matrix. Note that the traditional approach for i-vector extraction would, in addition, require inverting an 800 × 800 posterior covariance matrix [14,5].…”
Section: Without Replay Channel (mentioning)
confidence: 99%
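
The contrast drawn in this statement can be sketched in a few lines. Under the standard PPCA model the posterior mean of the latent vector is a fixed linear map of the centered supervector, so the projection matrix can be computed once after training. In the usual front-end factor analysis formulation the posterior precision depends on each utterance's occupation counts, so a k × k system has to be solved per utterance. All function and variable names below are hypothetical, and the code is a minimal sketch of the textbook formulas, not the cited authors' implementation.

```python
import numpy as np

def ppca_projection_matrix(W, sigma2):
    """Precompute P = (W^T W + sigma2 * I)^{-1} W^T once, after training."""
    k = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(k)
    return np.linalg.solve(M, W.T)                     # shape (k, dim)

def ppca_extract(P, mean, supervector):
    """Per-utterance PPCA extraction: one matrix-vector product."""
    return P @ (supervector - mean)

def fefa_extract(T_mat, inv_var, n, f_centered):
    """Per-utterance FEFA-style extraction.

    T_mat: (C, D, k) per-component blocks of the total variability matrix;
    inv_var: (C, D) inverse diagonal covariances; n: (C,) occupation counts;
    f_centered: (C, D) first-order statistics centered on the UBM means.
    """
    k = T_mat.shape[2]
    scaled = T_mat * inv_var[:, :, None]               # Sigma^{-1} T, per component
    # The posterior precision changes with n, so it cannot be precomputed.
    precision = np.eye(k) + np.einsum('c,cdi,cdj->ij', n, scaled, T_mat)
    rhs = np.einsum('cdi,cd->i', scaled, f_centered)
    return np.linalg.solve(precision, rhs)             # k x k solve per utterance
```

In the quoted setup the latent dimension would be 800 and the supervector dimension 61440, so the PPCA path costs one 800 × 61440 matrix-vector product per utterance, while the FEFA path additionally builds and solves an 800 × 800 system each time.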