Interspeech 2018
DOI: 10.21437/interspeech.2018-1308

Unsupervised Word Segmentation from Speech with Attention

Abstract: We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL). Our methodology assumes a pairing between recordings in the UL with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones that is segmented using neural soft-alignments produced by a neural machine translation model. Evaluation uses…

Cited by 23 publications (32 citation statements)
References 21 publications (52 reference statements)
“…As in language documentation scenarios available corpora usually contain speech in the language to document aligned with translations in a well-resourced language, Godard et al [5] introduced a pipeline for performing Unsupervised Word Segmentation (UWS) from speech. The system outputs timestamps delimiting stretches of speech, associated with class labels, corresponding to real words in the language.…”
Section: Unsupervised Word Segmentation From Speech (mentioning)
confidence: 99%
“…For each S2S architecture, and each of the three corpora, we train five models (runs) with different initialization seeds. Before segmenting, we average the produced matrices from the five different runs as in [5]. Evaluation is done in a bilingual segmentation condition that corresponds to the real UWS task.…”
Section: Comparing S2S Architectures (mentioning)
confidence: 99%
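
The averaging step mentioned in this statement, followed by an attention-based segmentation, could be sketched roughly as follows. This is a minimal illustration and not the cited authors' code: it assumes each run yields a soft-alignment matrix of shape (target words × source units), and it assumes a boundary is placed wherever two neighbouring source units are most strongly aligned to different target words. The function names `average_attention` and `segment` and the toy matrices are hypothetical.

```python
import numpy as np


def average_attention(matrices):
    """Element-wise mean of the attention matrices from several runs.

    Assumption: every matrix has shape (num_target_words, num_source_units).
    """
    return np.mean(np.stack(matrices, axis=0), axis=0)


def segment(units, attention):
    """Split `units` wherever adjacent units align to different target words.

    Assumption: a unit is aligned to the target word with the highest
    attention weight in its column (hard alignment from soft alignment).
    """
    if not units:
        return []
    aligned = attention.argmax(axis=0)  # best target word index per source unit
    segments, current = [], [units[0]]
    for prev, cur, unit in zip(aligned[:-1], aligned[1:], units[1:]):
        if cur != prev:
            # Alignment changed: close the current segment, start a new one.
            segments.append(current)
            current = [unit]
        else:
            current.append(unit)
    segments.append(current)
    return segments


# Toy data: 5 pseudo-phone units, 2 target words, 2 runs (all values invented).
units = ["a", "b", "c", "d", "e"]
run1 = np.array([[0.9, 0.8, 0.4, 0.1, 0.2],
                 [0.1, 0.2, 0.6, 0.9, 0.8]])
run2 = np.array([[0.7, 0.6, 0.7, 0.3, 0.1],
                 [0.3, 0.4, 0.3, 0.7, 0.9]])
averaged = average_attention([run1, run2])
result = segment(units, averaged)
```

In this toy case the two runs disagree about the third unit, and averaging settles the boundary after it, which is one plausible motivation for combining runs before segmenting.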
“…Aside from directly improving performance on various tasks, attention (Luong et al, 2015) has proven to be extremely useful when used indirectly in a wide variety of other ways (for example, for segmentation (Tang and Yang, 2018) and unsupervised speech-to-text alignment (Boito et al, 2017; Godard et al, 2018)). In addition, using attention-based models for object segmentation in a weakly supervised setting has been well explored in the vision domain (Teh et al, 2016; Zhang et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…The same task has been attempted [14] using NMT with attention [15] to align speech or phone sequences to the word labels of the high-resourced language; modifications of the attention mechanism to ensure coverage and richer context. If the true phone sequence in the under-resourced language is unknown, pseudo-phone labels generated by an unsupervised non-parametric Bayesian model [6] can be used as input to the NMT [16].…”
Section: Introduction (mentioning)
confidence: 99%