12th ISCA Speech Synthesis Workshop (SSW2023) 2023
DOI: 10.21437/ssw.2023-21

An analysis on the effects of speaker embedding choice in non auto-regressive TTS

Adriana Stan,
Johannah O'Mahony

Abstract: In this paper we present a first attempt at understanding how a non-autoregressive factorised multi-speaker speech synthesis architecture exploits the information present in different speaker embedding sets. We analyse whether jointly learning the representations and initialising them from pretrained models yields any quality improvements for target speaker identities. In a separate analysis, we investigate how the different sets of embeddings impact the network's core speech abstraction (i.e. zero conditione…
