2012 IEEE Spoken Language Technology Workshop (SLT) 2012
DOI: 10.1109/slt.2012.6424223

Incorporating syllable duration into line-detection-based spoken term detection

Cited by 3 publications (3 citation statements)
References 3 publications
“…Duration related cues have been shown useful, such as the duration of the signal segments hypothesized to be the query divided by the number of syllables or phonemes in the query (or the speaking rate), and the average duration of the same syllables or phonemes in the target spoken archive [149], [150], [159]-[162]. This is because extremely high or low speaking rate or abnormal phoneme and syllable durations may imply that the hypothesized signal segment is a false alarm.…”
Section: B Incorporating Prosodic Cues
Citation type: mentioning
confidence: 99%
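
The duration cue described in this statement can be illustrated with a minimal sketch. It is not the cited paper's method: the function names, the archive average passed in as a plain number, and the 0.5 to 2.0 plausibility band are all assumptions chosen for illustration only.

```python
def duration_features(start_s, end_s, n_syllables, archive_mean_syllable_s):
    """Return (seconds per syllable, ratio to the archive average) for one hypothesis.

    All arguments are hypothetical inputs: the hypothesized segment boundaries
    in seconds, the syllable count of the query, and an average syllable
    duration measured on the target archive.
    """
    if n_syllables <= 0:
        raise ValueError("n_syllables must be positive")
    if end_s <= start_s:
        raise ValueError("segment must have positive duration")
    if archive_mean_syllable_s <= 0:
        raise ValueError("archive_mean_syllable_s must be positive")
    per_syllable = (end_s - start_s) / n_syllables
    ratio = per_syllable / archive_mean_syllable_s
    return per_syllable, ratio


def is_duration_plausible(ratio, low=0.5, high=2.0):
    """Flag hypotheses spoken implausibly fast or slow (thresholds are assumed)."""
    return low <= ratio <= high


if __name__ == "__main__":
    # A 3-syllable query hypothesized over a 0.15 s segment is suspiciously fast.
    _, r = duration_features(0.0, 0.15, 3, 0.2)
    print(is_duration_plausible(r))
```

In practice the ratio would more likely be fed to a rescoring model as a feature than used as a hard filter.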
“…Segmenting speech into syllables helps in many speech applications such as speech recognition, synthesis and spoken term detection [1]. Detecting the syllable nuclei and the syllable boundary has been a challenging task.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…However, the subword-based approach has the unique advantage that it can detect terms that consist of words that are not in the recognizer's vocabulary - out-of-vocabulary (OOV) terms - whereas the word-based approach can only detect in-vocabulary (INV) terms. Several subword unit types have been employed in the subword-based approach, including word fragments [46], particles [47,48], acoustic words [49], graphones [6,7], multigrams [9,50], syllables [51-53], and graphemes [54], although phonemes are the most commonly used due to their simplicity and natural relationship with spoken languages [41,55-59]. In order to exploit the relative advantages of the word and phoneme-based approaches, it has been proposed to combine these two approaches by using the word-based approach to detect INV terms and the subword-based approach to detect OOV terms, e.g., [41,56,60-64].…”
Section: Introduction To Spoken Term Detection Technology
Citation type: mentioning
confidence: 99%
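
The combination described in this statement, word-based detection for INV terms and subword-based detection for OOV terms, can be sketched as a simple routing rule. This is only an illustration under stated assumptions: `choose_detector`, the whitespace tokenization, and the toy vocabulary are hypothetical and do not come from any of the cited systems.

```python
def choose_detector(term, vocabulary):
    """Return "word" if every word of the term is in-vocabulary, else "subword".

    `vocabulary` is assumed to be a set of lowercase words known to the
    recognizer; a term with any unknown word is treated as OOV.
    """
    words = term.lower().split()
    if not words:
        raise ValueError("term must contain at least one word")
    return "word" if all(w in vocabulary for w in words) else "subword"


if __name__ == "__main__":
    vocab = {"spoken", "term", "detection"}
    for query in ("spoken term", "spoken Kaldi"):
        print(query, "->", choose_detector(query, vocab))
```

A term containing even one unknown word goes to the subword detector here, since a word-based index cannot match that word at all.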