2014
DOI: 10.1109/taslp.2014.2311322

Query-by-Example Spoken Term Detection using Frequency Domain Linear Prediction and Non-Segmental Dynamic Time Warping

Cited by 53 publications (22 citation statements)
References 25 publications
“…In addition, performance using the proposed approach is better than the Gaussian posteriorgram. This finding matches a previous study reported in [6]. This might be because of the increasing number of clusters better represents the speech signal at the frame-level.…”
Section: Number Of Gaussian | Type: supporting
confidence: 92%
“…The number of Gaussian components in Gaussian posteriorgram plays an important role in QbE-STD tasks [4], [6]. In this Section, we investigate the effect of the number of mixture components used in VTLN warping factor estimation on QbE-STD tasks.…”
Section: Number Of Gaussian | Type: mentioning
confidence: 99%
“…DTW is employed [12] for this query and reference file at syllable level. From the figure 1, it can be observed that there is match from syllable number 14 to 16 of reference speech file to that of query file.…”
Section: Spoken Term Detection | Type: mentioning
confidence: 99%
“…Using speech queries offers a big advantage for devices with limited text-based capabilities, which can be effectively used under the QbE STD paradigm. Other advantage is that QbE STD can be employed for building language-independent STD systems [7][8][9][10], since prior knowledge of the language involved in the speech data is not necessary.…”
Section: Introduction | Type: mentioning
confidence: 99%