2017
DOI: 10.1109/jstsp.2017.2762080

Joint Learning of Distance Metric and Query Model for Posteriorgram-Based Keyword Search

Citations: cited by 14 publications (9 citation statements)
References: 29 publications
“…Figure 3 shows the impact of duration statistics on query construction for the baseline system (note that this figure, unlike Figure 2, includes all terms, and not just OOV terms). The method described in [14] requires that the query sequence be generated by repeating each phoneme representation before concatenating them. This ensures that the constructed queries are approximately as long as they are in the document.…”
Section: Baseline System
Citation type: mentioning
confidence: 99%
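
The query construction described in this statement (repeat each phoneme representation, then concatenate) can be sketched as follows. This is a minimal illustration, not the implementation from [14] or from the cited paper: the function name `build_query`, the per-phoneme `durations` table (expected length in frames), and the toy vectors are all assumptions made for the example.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def build_query(
    phonemes: Sequence[str],
    representations: Dict[str, Vector],
    durations: Dict[str, int],
) -> List[Vector]:
    """Repeat each phoneme's vector for its duration, then concatenate.

    `representations` maps a phoneme to one vector (hypothetical values);
    `durations` maps a phoneme to an assumed number of frames.
    """
    query: List[Vector] = []
    for phoneme in phonemes:
        # Fall back to a single frame when no duration statistic is known.
        frames = max(1, durations.get(phoneme, 1))
        query.extend(list(representations[phoneme]) for _ in range(frames))
    return query


# Toy usage: two phonemes with assumed durations of 2 and 3 frames.
toy_representations = {"k": [1.0, 0.0], "ae": [0.0, 1.0]}
toy_durations = {"k": 2, "ae": 3}
toy_query = build_query(["k", "ae"], toy_representations, toy_durations)
```

With these toy values the resulting sequence has five frames, which is how the repetition step brings the query length close to the length of the term in the document.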
“…posteriorgrams from a DNN), learning a mapping from phonemes to vectors in that space, and then aligning such representations so that, in effect, the search becomes QBE. While this method outperforms proxy keywords [14] and, as we'll show, subword OOV retrieval, its practicality is impeded by the computational cost of the DTW algorithm, which is linear (memory and CPU) in the lengths of both the query and document sequences.…”
Section: Introduction
Citation type: mentioning
confidence: 96%
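
To make the cost concern in this statement concrete, here is a minimal sketch of textbook dynamic time warping (DTW) between a query sequence and a document sequence. It is not the cited paper's system: the Euclidean frame distance stands in for the learned distance metric, and the names `dtw_cost` and `euclidean` are assumptions for the example. The accumulated-cost table has one cell per (query frame, document frame) pair, so time and memory both grow with the lengths of the two sequences.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Frame-level distance (an assumed stand-in for a learned metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(
    query: Sequence[Vector],
    document: Sequence[Vector],
    dist: Callable[[Vector, Vector], float] = euclidean,
) -> float:
    """Return the total cost of the best monotonic alignment."""
    n, m = len(query), len(document)
    # (n + 1) x (m + 1) table: every query/document frame pair gets a cell.
    acc: List[List[float]] = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(query[i - 1], document[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For example, `dtw_cost([[0.0], [1.0]], [[0.0], [0.0], [1.0]])` aligns the repeated first document frame to the first query frame at no cost.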
“…In this paper, we propose a Siamese neural network‐based SML methodology to facilitate joint learning of a similarity metric that exploits multiple feature representations of the input images. The key contributions of the paper are as follows: • The sigma distance that was recently proposed for speech features in a keyword search task [29], is utilised for face images and compared with the other DML/SML techniques in the literature. • Two COSiM methodologies are proposed and discussed both mathematically and empirically. For experiments, two image representations: (i) scale‐invariant feature transform (SIFT) and (ii) local binary pattern (LBP) are combined and the SML networks are trained jointly to investigate as a means of early fusion.…”
Section: Introduction
Citation type: mentioning
confidence: 99%