2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
DOI: 10.1109/asru.2017.8268953
The zero resource speech challenge 2017

Abstract: We describe a new challenge aimed at discovering subword and word units from raw speech. This challenge is the follow-up to the Zero Resource Speech Challenge 2015. It aims at constructing systems that generalize across languages and adapt to new speakers. The design features and evaluation metrics of the challenge are presented and the results of seventeen models are discussed.

Index Terms: zero resource speech technology, subword modeling, acoustic unit discovery, unsupervised term discovery

Cited by 152 publications (221 citation statements)
References 30 publications (38 reference statements)
“…For g c , we use five convolutional layers with strides [5, 4, 2, 2, 2], filter-sizes [10,8,4,4,4] and 256 hidden units with ReLU activations. Besides, the features are normalized channelwise between each convolution.…”
Section: S212 Architecture Details
Confidence: 99%
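The excerpt above specifies a five-layer strided convolutional encoder g_c. A minimal sketch (not the authors' code; the 16 kHz sampling rate is an assumption typical of this literature) can compute what those strides and filter sizes imply for the encoder's temporal resolution:

```python
# Sketch: temporal resolution of a stack of strided 1-D convolutions with the
# quoted hyperparameters: strides [5, 4, 2, 2, 2], filter sizes [10, 8, 4, 4, 4].

def encoder_resolution(filter_sizes, strides):
    """Return (receptive_field, hop) in input samples for strided
    1-D convolutions applied in the given order."""
    receptive_field, jump = 1, 1  # jump = cumulative stride so far
    for k, s in zip(filter_sizes, strides):
        receptive_field += (k - 1) * jump  # each layer widens the input window
        jump *= s                          # and coarsens the output time grid
    return receptive_field, jump

rf, hop = encoder_resolution([10, 8, 4, 4, 4], [5, 4, 2, 2, 2])
print(rf, hop)  # 465 samples per window, one feature vector every 160 samples

# Assuming 16 kHz audio, that is a ~29 ms window emitted every 10 ms,
# comparable to standard MFCC framing.
print(rf / 16000 * 1000, hop / 16000 * 1000)
```

So the overall downsampling factor is the product of the strides (5·4·2·2·2 = 160), which is why each 256-dimensional output frame covers one 10 ms step of 16 kHz input.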
“…languages [5,6] or pretraining using unsupervised objectives [7,8]. At the extreme of this continuum, zero resource ASR discovers its own units from raw speech [9,10,11]. Despite many interesting results, the field lacks a common benchmark (datasets, evaluations, or baselines) for comparing ideas and results across these settings.…”
Section: Introduction
Confidence: 99%
“…For many low-resource languages, however, it is difficult or impossible to collect such annotated resources. Motivated by the observation that infants acquire language without hard supervision, studies into "zero-resource" speech technology have started to develop unsupervised systems that can learn directly from unlabelled speech audio [1][2][3].…”
Section: Introduction
Confidence: 99%
“…Even for resource-rich languages, preparing transcriptions for available training data is a time-consuming task that involves considerable human effort. For many languages in the world, very little or no transcribed speech is available [6], and conventional acoustic modeling techniques are simply not applicable. Unsupervised speech modeling is the task of building subword- or word-level AMs when only untranscribed speech is available for training [7]-[9].…”
Section: Introduction
Confidence: 99%