2013 IEEE International Conference on Acoustics, Speech and Signal Processing (2013)
DOI: 10.1109/icassp.2013.6639239

Unsupervised discovery of linguistic structure including two-level acoustic patterns using three cascaded stages of iterative optimization

Abstract: Techniques for unsupervised discovery of acoustic patterns are getting increasingly attractive, because huge quantities of speech data are becoming available but manual annotations remain hard to acquire. In this paper, we propose an approach for unsupervised discovery of linguistic structure for the target spoken language given raw speech data. This linguistic structure includes two-level (subword-like and word-like) acoustic patterns, the lexicon of word-like patterns in terms of subword-like patterns and the…

Cited by: 24 publications (56 citation statements)
References: 32 publications
“…Take inner product as the similarity measure, (24) Similar to (15), the computation of Laplacian matrices in (18) becomes, (25) where, (26) with . Then problem (20) is rewritten as,…”
Section: Multiview Segment Clustering (citation type: mentioning)
confidence: 99%
“…With this approach, the number of speech units can be estimated automatically. In [26], a three-stage approach involving word-level pattern construction and word-level decoding was proposed. In [6] [27], the problem of unsupervised acoustic modeling was tackled by first discovering large-size units (e.g., words), and performing Gaussian component clustering with top-down constraints.…”
Section: A. Unsupervised Acoustic Modeling Techniques (citation type: mentioning)
confidence: 99%
“…Alternately, the lexicon development process is weakly-supervised similar to acoustic model development in an ASR system. More recently, in the context of "zero-resourced" ASR system development, there are efforts towards developing methods that are fully unsupervised (Chung et al., 2013; Lee et al., 2015). Such methods are at very early stages and are out of the scope of this paper.…”
Section: Literature Survey On Aswu Derivation and Pronunciation Gener… (citation type: mentioning)
confidence: 99%
“…which is actually parallel to (11). The superscript $n$ indicates the $n$-th training utterance and $d$ indicates the number of dimensions of $x_t^n$.…”
Section: Baseline: Recurrent Predictor Model (citation type: mentioning)
confidence: 99%