Interspeech 2019
DOI: 10.21437/interspeech.2019-2224

Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery

Abstract: This work tackles the problem of learning a set of language-specific acoustic units from unlabeled speech recordings, given a set of labeled recordings from other languages. Our approach is a two-step procedure: first, the model learns the notion of acoustic units from the labeled data; then, it uses this knowledge to find new acoustic units in the target language. We implement this process with the Bayesian Subspace Hidden Markov Model (SHMM), a model akin to the Subspace …
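The core idea behind the SHMM — generating each acoustic unit's HMM parameters from a low-dimensional embedding through a shared subspace — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the randomly drawn "learned" subspace matrices `W_trans`/`W_mean`, and the function `unit_hmm` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): embedding size E,
# HMM states per unit S, feature dimension D.
E, S, D = 10, 3, 4

# Step 1 (sketch): the subspace parameters would be learned from
# labeled recordings in other languages; here they are random stand-ins.
W_trans = rng.normal(size=(S * S, E))  # embedding -> transition logits
b_trans = rng.normal(size=S * S)
W_mean = rng.normal(size=(S * D, E))   # embedding -> state means
b_mean = rng.normal(size=S * D)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def unit_hmm(eta):
    """Generate one unit's HMM parameters from its embedding eta."""
    trans = softmax((W_trans @ eta + b_trans).reshape(S, S), axis=1)
    means = (W_mean @ eta + b_mean).reshape(S, D)
    return trans, means

# Step 2 (sketch): discovering a unit in the target language amounts
# to inferring its embedding; here we simply draw one at random.
eta = rng.normal(size=E)
trans, means = unit_hmm(eta)
```

Each row of `trans` is a valid transition distribution by construction, so every point in the embedding space corresponds to a well-formed HMM — this is what lets knowledge from labeled languages constrain the units discovered on the target language.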

Cited by 18 publications (25 citation statements). References 11 publications.
“…The DNN AM in our system is trained and evaluated on the entire Mboshi dataset without training-test partition, as we are tackling an unsupervised learning problem. This is also consistent with the studies [11,19].…”
Section: Databases (supporting)
confidence: 94%
“…There are two mainstream research strands in unsupervised acoustic modeling. The first strand, acoustic unit discovery (AUD) [6,10], formulates the problem as discovering a finite set of phone-like acoustic units [6,7,11]. The second strand, unsupervised subword modeling (USM) [9,12], formulates the problem as learning a frame-level feature representation that can …” (Code: https://github.com/syfengcuhk/mboshi)
Section: Introduction (mentioning)
confidence: 99%
“…This study is inspired by recent work showing the benefit of using multilingual bottleneck features as frame-level representations for zero-resource languages [23][24][25][26]. In [27], multilingual data were used in a similar way for discovering acoustic units. As in those studies, our findings show the advantage of learning from labelled data in well-resourced languages when processing an unseen low-resource language, here at the word rather than subword level.…”
Section: Introduction (mentioning)
confidence: 99%
“…Vague priors are easy to define but they fail to provide a reasonable selection of "good" candidates. Recent works [17], [18] have proposed to remedy this shortcoming by introducing …”
Fig. 2: Illustration of the concept of phonetic subspace: each phone is represented as a vector η encoding the parameters of a probabilistic model (an HMM in this example).
Section: B. Phonetic Subspace (mentioning)
confidence: 99%