Interspeech 2019
DOI: 10.21437/interspeech.2019-2904
The Zero Resource Speech Challenge 2019: TTS Without T

Cited by 107 publications (122 citation statements). References 0 publications.
“…languages [5,6] or pretraining using unsupervised objectives [7,8]. At the extreme of this continuum, zero resource ASR discovers its own units from raw speech [9,10,11]. Despite many interesting results, the field lacks a common benchmark (datasets, evaluations, or baselines) for comparing ideas and results across these settings.…”
Section: Introduction (mentioning, confidence: 99%)
“…Features should ideally disregard irrelevant information (such as speaker and gender), while capturing linguistically meaningful contrasts (such as phone or word categories). Several different unsupervised frame-level acoustic feature learning methods have been developed over the last few years [6]- [12], with neural networks being used in a number of studies [13]- [17].…”
Section: Introduction (mentioning, confidence: 99%)
“…Recent work has considered unsupervised learning for a variety of speech tasks. Some of this work is explicitly aimed at a "zero-speech" setting where no or almost no labeled data is available at all (e.g., [14,15,16,17]), where the focus is to learn phonetic or word-like units, or representations that can distinguish among such units. Other work considers a variety of downstream supervised tasks, and some focuses explicitly on learning representations that generalize across tasks or across very different domains [6,7,18,19].…”
Section: Related Work (mentioning, confidence: 99%)