1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) 1999
DOI: 10.1109/icassp.1999.758120

Signal modeling for isolated word recognition

Abstract: This paper presents speech signal modeling techniques which are well suited to high performance and robust isolated word recognition. Speech is encoded by a discrete cosine transform of its spectra, after several preprocessing steps. Temporal information is then also explicitly encoded into the feature set. We present a new technique for incorporating this temporal information as a function of temporal position within each word. We tested features computed with this method using an alphabet recognition task ba…

Cited by 10 publications (9 citation statements)
References 4 publications
“…The block features were recomputed every 10 ms. No manual segmentation or phonetic labeling was required or used. The primary modification, relative to [8] and [9], is that the block length was varied at both ends of each analyzed utterance, rather than only for the beginning section. See Fig.1 for an illustration of this variable block length method.…”
Section: Variable Block Length Methods (mentioning)
confidence: 99%
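
The variable block length idea in the statement above can be illustrated with a small sketch. The excerpt does not give the exact rule used to vary the block length, so the scheme below is an assumption: one block per 10 ms frame, centered on that frame, shrinking symmetrically so that it never extends past either end of the utterance. The function name `block_bounds` and the parameter `max_block_len` are hypothetical.

```python
def block_bounds(num_frames, max_block_len):
    """Return one (start, end) frame range per frame, end exclusive.

    Assumed scheme (not taken from the cited papers): each block is
    centered on its frame and is shortened near BOTH ends of the
    utterance so it stays inside the available frames.
    """
    half_max = max_block_len // 2
    bounds = []
    for center in range(num_frames):
        # Frames available on the shorter side of this center frame.
        room = min(center, num_frames - 1 - center)
        # Use the full half-length only when there is room for it.
        half = min(half_max, room)
        bounds.append((center - half, center + half + 1))
    return bounds


# Example: 10 frames (100 ms at a 10 ms step), blocks of at most 5 frames.
bounds = block_bounds(num_frames=10, max_block_len=5)
```

With these inputs, blocks in the middle of the utterance span five frames, while the first and last blocks contain a single frame.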
“…For both training and testing data, the modified Discrete Cosine Transformation Coefficients (DCTC) and Discrete Cosine Series Coefficients (DCSC) (Zahorian et al 1991;Zahorian et al, 1997;Zahorian et al, 2002;Karnjanadecha & Zahorian, 1999) were extracted as original features. The modified DCTC is used for representing speech spectra, and the modified DCSC is used to represent spectral trajectories.…”
Section: DCTC/DCSC Speech Features (mentioning)
confidence: 99%
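
As a rough illustration of the two feature types named above, the sketch below computes cosine-expansion coefficients over frequency (in the spirit of DCTC) and then over time within a block of frames (in the spirit of DCSC). It is a simplified sketch, not the cited authors' method: the "modified" variants mentioned in the statement are not specified in this excerpt, so plain unwarped cosine basis vectors are assumed, and the names `cosine_basis`, `dctc`, and `dcsc` are hypothetical.

```python
import numpy as np


def cosine_basis(num_coeffs, length):
    """Rows are cosine basis vectors (DCT-II style) sampled at `length` points."""
    positions = (np.arange(length) + 0.5) / length   # midpoints in (0, 1)
    orders = np.arange(num_coeffs)[:, None]          # 0, 1, 2, ...
    return np.cos(np.pi * orders * positions[None, :])


def dctc(log_spectra, num_dctc):
    """Cosine expansion over frequency.

    log_spectra: array of shape (frames, bins).
    Returns an array of shape (frames, num_dctc).
    """
    bins = log_spectra.shape[1]
    return log_spectra @ cosine_basis(num_dctc, bins).T / bins


def dcsc(dctc_block, num_dcsc):
    """Cosine expansion over time for each spectral coefficient's trajectory.

    dctc_block: array of shape (frames_in_block, num_dctc).
    Returns an array of shape (num_dctc, num_dcsc).
    """
    frames = dctc_block.shape[0]
    return dctc_block.T @ cosine_basis(num_dcsc, frames).T / frames


# Example: a flat, unchanging log spectrum (20 frames, 16 frequency bins).
spec = np.full((20, 16), 2.0)
spectral = dctc(spec, num_dctc=4)           # per-frame spectral shape
trajectory = dcsc(spectral[:10], num_dcsc=3)  # one 10-frame block
```

For a flat, constant input only the zeroth-order terms are nonzero, which is a quick sanity check that the first expansion captures spectral shape and the second captures change over time.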
“…In conclusion, we can say that some of the disadvantage of phoneme based recognizers as in [1] when compared to word based recognizer is complexity of the system and the word transcription must be known [2]. Also, from our review, many of these testing and experiments were done using HMMs and modified techniques.…”
Section: Some Existing Work (mentioning)
confidence: 99%
“…Signal modeling for high performance and robust isolated word recognition were proposed in [2,8]. The authors proposed a new technique for incorporating temporal and spectral feature within each word.…”
Section: Some Existing Work (mentioning)
confidence: 99%