Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua 2021
DOI: 10.18653/v1/2021.naacl-main.377

Learning Paralinguistic Features from Audiobooks through Style Voice Conversion

Abstract: Paralinguistics, the non-lexical components of speech, play a crucial role in human-human interaction. Models designed to recognize paralinguistic information, particularly speech emotion and style, are difficult to train because of the limited labeled datasets available. In this work, we present a new framework that enables a neural network to learn to extract paralinguistic attributes from speech using data that are not annotated for emotion. We assess the utility of the learned embeddings on the downstream …

Citations: cited by 1 publication (1 citation statement)
References: 27 publications
“…Aldeneh et al [11] proposed a framework to learn to extract paralinguistic embedding. The authors showed that converting synthetic-neutral speech to expressive speech based on that embedding improved the results from acoustic features and other evaluated embeddings.…”
Section: Related Work: SSL and SER
Citation type: mentioning
confidence: 99%