2022
DOI: 10.48550/arxiv.2203.06849
Preprint
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

Cited by 4 publications (5 citation statements) · References 0 publications
“…Self-supervised representations: In our experiments, we employ HuBERT [27], which shows promising results over the SUPERB benchmark [10]. To fully explore the potential of HuBERT, we select the HuBERT-large model pre-trained over 60k hours of LibriLight [46,47].…”
Section: Methods (mentioning)
confidence: 99%
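The setup quoted above (HuBERT-large pre-trained on 60k hours of Libri-Light) is available as a public checkpoint; the sketch below shows one way to extract its frame-level representations, assuming the Hugging Face Transformers library and the facebook/hubert-large-ll60k checkpoint. It is a minimal illustration, not the citing paper's exact pipeline.

```python
# Minimal sketch: extracting frame-level features from HuBERT-large
# pre-trained on 60k hours of LibriLight, assuming the Hugging Face
# "facebook/hubert-large-ll60k" checkpoint (not the cited paper's code).
import torch
from transformers import HubertModel, Wav2Vec2FeatureExtractor

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/hubert-large-ll60k")
model = HubertModel.from_pretrained("facebook/hubert-large-ll60k")
model.eval()

waveform = torch.zeros(16000)  # one second of dummy 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    # HuBERT-large produces 1024-dimensional features at ~50 frames/second.
    features = model(**inputs).last_hidden_state  # (batch, frames, 1024)
```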
“…The key idea is that unlabeled data contains valuable information and is far more abundant than labeled data in any domain. This paradigm leads to general-purpose speech representations suitable for speech processing tasks [10].…”
Section: Speech Representations (mentioning)
confidence: 99%
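The benchmark the quote points to evaluates such general-purpose representations by freezing the pre-trained upstream model and training only a lightweight task-specific head. Below is a minimal PyTorch sketch of that frozen-upstream pattern; the classification task, mean-pooling, linear head, and hyperparameters are illustrative placeholders, not SUPERB's actual downstream models.

```python
# Minimal sketch of the frozen-upstream evaluation pattern: only a small
# task head is trained on top of fixed self-supervised features.
import torch
import torch.nn as nn
from transformers import HubertModel

upstream = HubertModel.from_pretrained("facebook/hubert-large-ll60k")
for p in upstream.parameters():
    p.requires_grad = False  # the representation stays fixed across tasks

num_classes = 10  # hypothetical classification task (e.g., keyword-like)
head = nn.Linear(upstream.config.hidden_size, num_classes)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

waveform = torch.zeros(1, 16000)  # dummy 16 kHz input batch
labels = torch.tensor([3])        # dummy target label

with torch.no_grad():
    feats = upstream(waveform).last_hidden_state  # (1, frames, 1024)

logits = head(feats.mean(dim=1))  # mean-pool over frames, then classify
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                   # gradients reach only the head
optimizer.step()
```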
“…As mentioned earlier, there are many datasets available for English, such as Librispeech (Panayotov et al. 2015) and Common Voice (Ardila et al. 2020) for ASR, and VoxCeleb1 (Nagrani, Chung, and Zisserman 2017) and VoxCeleb2 (Chung, Nagrani, and Zisserman 2018) for speaker recognition. More recently, SUPERB (Yang et al. 2021b) and SUPERB-SG (Tsai et al. 2022) have been released and contain various speech language understanding and synthesis tasks. In contrast, there are very few datasets for Indian languages, as summarised in Table 1.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some recent research has explored large-scale pre-training for speech synthesis tasks. For example, the SUPERB-SG [96] benchmark was introduced to evaluate pre-trained models on various tasks including speech enhancement and voice conversion. Prior work on pre-training generative models of speech has focused on learning representations for downstream classification tasks, rather than synthesis [97].…”
Section: Large-scale Pre-training With Speech Data (mentioning)
confidence: 99%