2006 IEEE International SOC Conference
DOI: 10.1109/socc.2006.283836

Architecture for Low Power Large Vocabulary Speech Recognition

Abstract: This paper proposes an architecture for real-time large vocabulary speech recognition on a mobile embedded device. The speech recognition system is based on the Hidden Markov Model (HMM), which involves complex mathematical operations such as probability estimation and Viterbi decoding. This computational load makes the system power hungry, and real-time recognition is not achieved by porting software solutions to an embedded device. Our system architecture has a low power embedded processor and dedicated ASIC units for comp…
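
The abstract names Viterbi decoding as one of the costly HMM operations. As a rough illustration only, the sketch below shows a textbook log-domain Viterbi pass over a tiny HMM. It is not the paper's hardware design; the function name `viterbi` and the two-state toy model with its probabilities are made up for this example.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Return (best_path, best_log_prob) for an HMM, working in the log domain.

    log_init[s]     : log P(state s at time 0)
    log_trans[p][s] : log P(state s | previous state p)
    log_emit[t][s]  : log P(observation at time t | state s)
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for t in range(1, len(log_emit)):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Pick the predecessor that maximizes the path score into s.
            best_p = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best_p)
            score.append(prev[best_p] + log_trans[best_p][s] + log_emit[t][s])
        back.append(ptr)
    # Trace the best final state back to time 0.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]

# Hypothetical toy model: 2 states, 3 observations (all numbers invented).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]]  # emit[t][s] for the observed sequence

log_init = [math.log(p) for p in init]
log_trans = [[math.log(p) for p in row] for row in trans]
log_emit = [[math.log(p) for p in row] for row in emit]
path, log_p = viterbi(log_init, log_trans, log_emit)
```

The inner loop does work proportional to the square of the number of states for every frame, which gives some sense of why a large vocabulary decoder is demanding on an embedded processor.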

Cited by 7 publications (3 citation statements)
References 7 publications

“…We chose MFCC in this paper according to the demand of recognition performance and limit of hardware resource. Storage space and recognition time are increased along with the increase of the characteristic dimension, therefore needing to choose the proper characteristic parameters for the sake of good real-time performance in the embedded recognition system, rather than using 39 dimensions feature composed by MFCC [14], [15], first-order difference, second-order difference, normalized transient energy and differential power. We identified the contribution of feature components to recognition performance, and chose the 27 dimensions feature components with biggest contribution, effectively reducing the SRAM area resources consumption by storage characteristic parameters.…”
Section: Implementation Of SoC
Citation type: mentioning
confidence: 99%
“…Systems for natural language speech recognition typically utilize three main processing stages (Fig 1) [1]. After the incoming utterance is sampled and digitized in the DSP stage (Phase 1), the generated feature vector enters the Acoustic Modeling stage (Phase 2), where it is compared to a list of senones in the library.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…A hardware co-processor is proposed in [60] to boost the performance of the GMM computation in Sphinx3. Reducing the size of the mantissa from 23-bits to 15-bits and 12-bits is proposed in [22] to reduce the acoustic model size, providing a compression ratio of 1.39x and 1.6x respectively. The technique is evaluated using a small vocabulary size of 5000 words, whereas we propose a novel clustering technique to achieve 8x reduction in acoustic model size with a 130k words vocabulary size.…”
Section: Hardware Solutions
Citation type: mentioning
confidence: 99%
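
The last statement mentions shrinking the mantissa from 23 bits to 15 or 12 bits. The sketch below is one way to picture the precision side of that idea. It assumes IEEE 754 single-precision values and simply zeroes the low mantissa bits. The helper name `truncate_mantissa` is hypothetical, and the sketch does not model how the cited work packs or stores the reduced values, so it says nothing about the reported compression ratios.

```python
import struct

def truncate_mantissa(x, keep_bits):
    """Zero the low (23 - keep_bits) mantissa bits of x viewed as a float32.

    Assumption: values are IEEE 754 single precision (1 sign bit,
    8 exponent bits, 23 mantissa bits). Truncates toward zero, no rounding.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Mask that keeps sign, exponent, and the top keep_bits mantissa bits.
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]

# Example: compare a value at full, 15-bit, and 12-bit mantissa precision.
value = 0.1
full = truncate_mantissa(value, 23)
m15 = truncate_mantissa(value, 15)
m12 = truncate_mantissa(value, 12)
```

Because the low bits are dropped rather than rounded, the relative error of a normal value stays below 2^-15 with a 15-bit mantissa and below 2^-12 with a 12-bit mantissa.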