2015
DOI: 10.1109/taslp.2014.2387382

Acoustic Segment Modeling with Spectral Clustering Methods

Abstract: This paper presents a study of spectral clustering-based approaches to acoustic segment modeling (ASM). ASM aims at finding the underlying phoneme-like speech units and building the corresponding acoustic models in the unsupervised setting, where no prior linguistic knowledge and manual transcriptions are available. A typical ASM process involves three stages, namely initial segmentation, segment labeling, and iterative modeling. This work focuses on the improvement of segment labeling. Specifically, we use po…
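
To make the segment labeling stage more concrete, here is a minimal Python sketch of generic normalized spectral clustering applied to segment-level feature vectors. It is not the paper's method: the abstract is truncated before it names the features or the affinity used, so the fixed-length feature vectors, the Gaussian affinity, the `sigma` parameter, and the function name `spectral_label_segments` are all assumptions made for illustration.

```python
import numpy as np


def spectral_label_segments(features, n_units, sigma=1.0, n_iter=50):
    """Assign one of n_units cluster labels to each segment.

    Hypothetical helper: `features` is an (n_segments, dim) array of
    fixed-length segment representations (an assumption, not from the paper).
    """
    X = np.asarray(features, dtype=float)

    # Gaussian affinity between every pair of segments (assumed similarity).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}. Its top
    # eigenvectors are the bottom eigenvectors of the normalized Laplacian.
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so take the last columns.
    _, eigvecs = np.linalg.eigh(M)
    U = eigvecs[:, -n_units:]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization for k-means.
    centers = [U[0]]
    for _ in range(1, n_units):
        nearest = np.min(
            [np.sum((U - c) ** 2, axis=1) for c in centers], axis=0
        )
        centers.append(U[np.argmax(nearest)])
    centers = np.array(centers)

    # Plain k-means on the rows of the spectral embedding.
    for _ in range(n_iter):
        dist = np.sum((U[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            U[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_units)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

In a full ASM pipeline, the resulting labels would serve as the initial unit transcription for the iterative modeling stage. The dense pairwise affinity here costs memory quadratic in the number of segments, so this sketch only suits small illustrative inputs.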

Citations: cited by 53 publications (43 citation statements)
References: 54 publications (88 reference statements)
“…To overcome these limitations, model-based approaches have been investigated [24], [33], [34]. These approaches primarily rely on acoustic units discovered in an unsupervised manner.…”
Section: Prior Work
Citation type: mentioning
confidence: 99%
“…Open-source tools [13] are used to train FHVAEs. 1 "speakers-R/-L" denotes speakers with rich/limited speech data. In our preliminary experiments, the ABX performance of z1 was found to be sensitive to the input segment length l. This could be explained as: a too large l would reduce the capability of z1 in modeling linguistic content at subword level; a too small l would restrict the FHVAE from capturing sufficient temporal dependencies which are essential in modeling speech.…”
Section: FHVAE Setup and Parameter Tuning
Citation type: mentioning
confidence: 99%
“…UAM is a challenging problem with significant practical impact in speech as well as linguistics and cognitive science communities. It has been studied in applications such as ASR for low-resource languages [1], language identification [2] and query-by-example spoken term detection [3]. This problem is also relevant to endangered language protection [4] and understanding infants' language acquisition mechanism [5].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Unsupervised spoken term detection techniques, which aim at automatically discovering acoustic patterns (e.g., for training acoustic models) for languages for which manual transcriptions and linguistic knowledge are scarce, have been also investigated [34,35]. These techniques can also be employed for building language-independent QbE STD systems, since prior knowledge of the language is not necessary.…”
Section: Introduction
Citation type: mentioning
confidence: 99%