Interspeech 2021
DOI: 10.21437/interspeech.2021-1755
The Zero Resource Speech Challenge 2021: Spoken Language Modelling

Cited by 23 publications (15 citation statements); references 0 publications.
“…In machine learning, CPC has been shown to be powerful in a wide variety of modalities ranging from audio and images to natural language and reinforcement learning (25). In the ZeroSpeech 2021 international challenge on unsupervised representation learning, CPC was the best system at developing a perceptual space that accurately discriminates speech sounds (26). The key idea behind CPC is to predict the future states of a sequence given its past context.…”
Section: RAFT Attunement Specifically
confidence: 99%
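The predictive idea quoted above can be sketched as a toy InfoNCE-style objective, a hypothetical numpy illustration rather than the actual CPC implementation: a context vector scores the true future state against random negative samples.

```python
import numpy as np

def infonce_loss(context, future, negatives):
    """Toy CPC-style InfoNCE loss: the context vector should score the
    true future representation higher than random negative samples."""
    candidates = np.vstack([future] + list(negatives))  # (1+K, d), true future first
    scores = candidates @ context                       # dot-product similarities
    scores = scores - scores.max()                      # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())   # log-softmax over candidates
    return -log_probs[0]                                # negative log-prob of the true future

rng = np.random.default_rng(0)
d = 8
context = rng.normal(size=d)
future = context + 0.1 * rng.normal(size=d)      # true future resembles the context
negatives = [rng.normal(size=d) for _ in range(5)]
loss = infonce_loss(context, future, negatives)
```

In real CPC the context comes from an autoregressive network over past frames and the loss is minimized over learned encoders; the contrastive scoring shown here is the core of the objective.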
“…4) Results: The first round of submissions was documented in 2021 [14]; the best-performing systems were variants of our baseline system. A second round was opened as a NeurIPS 2021 challenge, including a visually-grounded training option.…”
Section: Task 4: Spoken LM
confidence: 99%
“…Spoken Language Modeling Following the huge success of language models on text data (Devlin et al., 2019; Radford et al., 2019; Brown et al., 2020), the Zero Resource Speech Challenge 2021 (Nguyen et al., 2020; Dunbar et al., 2021) opened up new possibilities for learning high-level language properties from raw audio without any text labels. It introduced four zero-shot evaluation metrics at different linguistic levels (phonetic, lexical, syntactic, semantic), along with composite baseline systems consisting of an acoustic discretization module (CPC + k-means) followed by a language model (BERT or LSTM) over the discretized units.…”
Section: Related Work
confidence: 99%
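The composite baseline pipeline quoted above (acoustic features → k-means quantization → language model over discrete units) can be sketched end to end. Everything here is an illustrative stand-in, assuming random vectors for CPC frame features, a fixed toy codebook in place of learned k-means centroids, and a bigram count model in place of BERT/LSTM:

```python
import numpy as np

def quantize(frames, codebook):
    """Assign each feature frame to its nearest codebook entry (the k-means
    assignment step), yielding a sequence of discrete 'pseudotext' units."""
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

def bigram_counts(units, k):
    """Train a trivial bigram 'language model' over the discrete unit sequence."""
    counts = np.zeros((k, k))
    for a, b in zip(units[:-1], units[1:]):
        counts[a, b] += 1
    return counts

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 4))    # stand-in for CPC frame features
codebook = rng.normal(size=(5, 4))   # stand-in for k-means centroids
units = quantize(frames, codebook)   # discretization: audio -> pseudotext
lm = bigram_counts(units, k=5)       # language model over the pseudotext
```

The separation between `quantize` and `bigram_counts` mirrors the discrete bottleneck the cited work discusses: the language model never sees the continuous features, only the unit indices.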
“…The approach in this work relies on transforming the audio into a sequence of discrete units (or pseudotext) and training a language model on the pseudotext. The trained models displayed better-than-chance performance on nearly all zero-shot evaluation metrics of the Zero Resource Challenge 2021 (Nguyen et al., 2020; Dunbar et al., 2021) across different linguistic levels. However, this paradigm creates a discrete bottleneck between the speech encoder and the language model, which is a potential source of error, and in addition requires multiple training phases (learning an acoustic representation, clustering it, and training a language model).…”
Section: Introduction
confidence: 99%