2021
DOI: 10.48550/arxiv.2104.10832
Preprint

Building Bilingual and Code-Switched Voice Conversion with Limited Training Data Using Embedding Consistency Loss

Cited by 1 publication (2 citation statements)
References 22 publications
“…According to the experiment results in [15], although the pre-trained content encoder can provide precise linguistic information, it still contains speaker information. Thus we apply the speaker information remover as shown in figure 1 to remove the speaker information, and ideally, we can obtain purified content information.…”
Section: Supervision on Intermediate Representation
confidence: 99%
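The citation statement above describes a "speaker information remover" that strips residual speaker information from features produced by a pre-trained content encoder. The citing papers use a learned remover (the exact architecture is not given here); as a purely hypothetical, minimal analogue, one can project a content feature onto the subspace orthogonal to a speaker embedding, so the result carries no component along the speaker direction:

```python
# Hypothetical sketch only: a linear stand-in for the learned "speaker
# information remover" described in the citation statement. We subtract
# from the content feature its projection onto the speaker embedding,
# leaving a vector with no component along the speaker direction.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def remove_speaker_component(content, speaker):
    """Return `content` minus its projection onto unit-normalised `speaker`."""
    norm = dot(speaker, speaker) ** 0.5
    unit = [s / norm for s in speaker]
    coeff = dot(content, unit)
    return [c - coeff * u for c, u in zip(content, unit)]

content = [2.0, 1.0, 0.5]   # toy "content" feature
speaker = [1.0, 0.0, 0.0]   # toy speaker embedding direction
purified = remove_speaker_component(content, speaker)
# the purified feature is orthogonal to the speaker direction
assert abs(dot(purified, speaker)) < 1e-9
```

In practice the remover is trained (for example adversarially) rather than being a fixed linear projection; the sketch only illustrates the goal of making the content representation carry no speaker-correlated component.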
“…Regarding the linguistic content, a pre-trained acoustic model is applied to extract the linguistic feature. Experiments in [15] show that there is speaker information persisting in the linguistic information extracted by the acoustic recognition model. Therefore, we manage to eliminate the residual speaker information and get purified content information.…”
Section: Introduction
confidence: 98%