2005
DOI: 10.1109/tsa.2004.840940

Eigenvoice modeling with sparse training data

citations
Cited by 421 publications
(295 citation statements)
references
References 17 publications
“…Compared to traditional i-vector extraction [4], the S (c) filtering mechanism ensure that only those key components representing lexical information of utterance involve in adaptation. We use α = R/C to denote the ratio of the number of key components to the number of total components.…”
Section: Text-dependent I-vector Extraction (mentioning)
confidence: 99%
“…An improved cosine distance kernel is constructed in Sect. 4. Experiments are conducted and several results are presented and analysed in Sect.…”
Section: Introduction (mentioning)
confidence: 99%
“…This algorithm is similar to the one used to estimate the identity (between-class) subspace in JFA [32], with one major difference: while JFA jointly consider the samples coming from a given subject, TV treats them as if they have been produced by different identities, which is an advantage when large unlabelled training datasets are used. In addition, the extraction of i-vectors requires the estimation of a covariance matrix Σ T to model the residual variability that is not captured by the subspace T.…”
Section: Total Variability Modelling (mentioning)
confidence: 99%
“…There have been reported work on speech [9] and speaker recognition [10] where researchers leverage on existing speech corpora from non-target speakers as the prior knowledge to improve their systems' performance. Following the same idea, eigenvoice-based conversion [11], and tensor representation of speaker space [12] are examples of similar successful attempts in voice conversion.…”
Section: Introduction (mentioning)
confidence: 99%