2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
DOI: 10.23919/apsipa.2018.8659619
Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features

Abstract: The present study tackles the problem of automatically discovering spoken keywords from untranscribed audio archives without requiring word-by-word speech transcription by automatic speech recognition (ASR) technology. The problem is of practical significance in many applications of speech analytics, including those concerning low-resource languages and large amounts of multilingual and multi-genre data. We propose a two-stage approach, which comprises unsupervised acoustic modeling and decoding, followed by pat…

Cited by 1 publication (1 citation statement)
References 39 publications
“…To better retain temporal dependency in speech, frame clustering can be embodied in segment level. Initial segmentation of speech utterances could be obtained by hierarchical agglomerative clustering [33], or using language-mismatched phone recognizers [34], [35]. Subsequently a fixed-length feature vector is derived to represent each speech segment.…”
Section: B. Unsupervised Subword Modeling Without Using DNN
confidence: 99%
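The excerpt above describes segment-level clustering: an initial segmentation of each utterance (e.g. by hierarchical agglomerative clustering), followed by a fixed-length feature vector per segment. The toy sketch below illustrates that idea under simplified assumptions: adjacent frames are merged bottom-up by Euclidean distance between segment means (a contiguity-constrained agglomerative clustering), and each resulting segment is represented by mean pooling. The function names and the stopping criterion (a target segment count) are illustrative choices, not the cited papers' actual method.

```python
import numpy as np

def segment_frames(frames, n_segments):
    """Toy agglomerative segmentation: repeatedly merge the pair of
    ADJACENT segments whose mean vectors are closest in Euclidean
    distance, preserving temporal order, until n_segments remain."""
    segments = [[i] for i in range(len(frames))]  # start: one frame per segment
    while len(segments) > n_segments:
        means = [frames[s].mean(axis=0) for s in segments]
        dists = [np.linalg.norm(means[i] - means[i + 1])
                 for i in range(len(means) - 1)]
        j = int(np.argmin(dists))          # closest adjacent pair
        segments[j] = segments[j] + segments[j + 1]
        del segments[j + 1]
    return segments

def fixed_length_embedding(frames, segment):
    """Represent one variable-length segment by mean pooling its frames."""
    return frames[segment].mean(axis=0)

# Example: 10 frames with an obvious change point at frame 5.
frames = np.vstack([np.zeros((5, 3)), np.ones((5, 3))])
segs = segment_frames(frames, n_segments=2)   # → [[0..4], [5..9]]
emb = fixed_length_embedding(frames, segs[0])
```

Mean pooling is only the simplest fixed-length representation; the cited work may instead use learned or posterior-based segment features.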