ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
DOI: 10.1109/icassp43922.2022.9747345

Speaker Generation

Cited by 16 publications (10 citation statements)
References: 13 publications
“…As is often shown in the literature [3], gender is one of the dominant sources of variation in speech. The strength of each dimension's association with gender in the embedded space can be found using their correlation ratio, η, calculated by dividing the weighted variance of the mean of each category (male/female) by the variance of all samples.…”
Section: Exploring the Speaker Embeddings Space
Citation type: mentioning
confidence: 85%
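
The correlation ratio described in the statement above can be sketched for a single embedding dimension as follows. This is a minimal illustration, not the citing authors' code: the function name `correlation_ratio` and the toy data are hypothetical. The ratio of between-category variation to total variation is conventionally η², so the sketch takes the square root to return η; the excerpt does not say which of the two the citing paper reports.

```python
from collections import defaultdict


def correlation_ratio(values, labels):
    """Return eta for one embedding dimension against a categorical label.

    values: list of floats (one embedding dimension across samples).
    labels: list of category labels of the same length (e.g. "male"/"female").
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    grand_mean = sum(values) / len(values)

    # Between-category variation: squared distance of each category mean
    # from the grand mean, weighted by the number of samples in the category.
    between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups.values()
    )
    # Total variation of all samples around the grand mean.
    total = sum((value - grand_mean) ** 2 for value in values)

    if total == 0:
        return 0.0  # constant dimension carries no association
    return (between / total) ** 0.5


# Toy data (assumed): one dimension that separates the two categories well.
values = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
labels = ["female", "female", "female", "male", "male", "male"]
eta = correlation_ratio(values, labels)
```

Applied to every dimension of the embedding in turn, values of η near 1 would mark dimensions strongly associated with the label and values near 0 would mark dimensions with little association.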
“…The speaker generation task has been introduced very recently by Stanton et al [3]. In their work, they train a multi-speaker Tacotron model by using learnable speaker embeddings and create a speaker embedding prior to model the distribution over the speaker embedding space.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…We adapt the metric s2t-same from [36] which measures how similar synthesized audio from a synthesized speaker is to ground truth audio from the same speaker. While in the original context, this metric was used for speakers of the training dataset, here we use it to measure the speaker fidelity for unseen speakers.…”
Section: Objective Evaluation
Citation type: mentioning
confidence: 99%
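
A same-speaker similarity score of the kind described in the statement above might be sketched as below. The excerpt does not say how similarity is computed, so cosine similarity between per-speaker embedding vectors is an assumption here, and the names `cosine_similarity` and `mean_same_speaker_similarity` are hypothetical; this is not the definition of s2t-same from [36].

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_same_speaker_similarity(synth, truth):
    """Average similarity between synthesized and ground-truth embeddings.

    synth, truth: dicts mapping a speaker id to one embedding vector
    (assumed to come from some speaker encoder, not specified here).
    Only speakers present in both dicts are compared.
    """
    shared = sorted(set(synth) & set(truth))
    if not shared:
        raise ValueError("no speakers in common")
    scores = [cosine_similarity(synth[s], truth[s]) for s in shared]
    return sum(scores) / len(scores)


# Toy embeddings (assumed): speaker "a" matches, speaker "b" is orthogonal.
synth = {"a": [1.0, 0.0], "b": [0.0, 2.0]}
truth = {"a": [2.0, 0.0], "b": [1.0, 0.0]}
score = mean_same_speaker_similarity(synth, truth)
```

Under these assumptions, a higher score would indicate that synthesized audio stays closer to the ground-truth audio of the same speaker.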
“…Inspired by the recently introduced task of speaker generation [62], we introduce a methodology to use VC models for speaker anonymization without the need for specifying a target speaker. Although we discuss this approach in the context of LVC-VC, it can feasibly be used to extend the capabilities of any VC model that incorporates a speaker encoder.…”
Section: E Extension: Un-targeted Speaker Anonymization
Citation type: mentioning
confidence: 99%