2019
DOI: 10.48550/arxiv.1904.03876
Preprint

Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery

Cited by 7 publications
(5 citation statements)
“…A model that can do this can also be used for unsupervised phone segmentation, predicting phone boundaries. Several acoustic unit discovery and unsupervised phone segmentation models have been proposed [22]-[27]. Recently, large gains have been achieved by combining self-supervised neural networks with a clustering component.…”
Section: DPDP for Acoustic Unit Discovery (mentioning)
confidence: 99%
“…Most importantly, our work is inspired by studies showing the benefit of using multilingual bottleneck features as frame-level representations for zero-resource languages [38]-[41]: a frame-level network is trained jointly on several well-resourced languages (normally to predict context-dependent triphone HMM states) and is then applied to an unseen language. In [42], multilingual data was also used for discovering acoustic units. As in these studies, our findings show the advantage of learning from labelled data in well-resourced languages when processing an unseen low-resource language, here at the word rather than subword level.…”
Section: Related Work (mentioning)
confidence: 99%
“…A zero-resource language is one for which no transcribed speech resources are available for developing speech systems [1,2]. Although conventional speech recognition is not possible for such languages, researchers have shown how speech search [3][4][5], discovery [6][7][8][9], and segmentation and clustering [10][11][12] applications can be developed without any labelled speech audio. In many of these applications, a metric is required for comparing speech segments of different durations.…”
Section: Introduction (mentioning)
confidence: 99%