2018
DOI: 10.4236/jsip.2018.94015

An Overview of Basics Speech Recognition and Autonomous Approach for Smart Home IOT Low Power Devices

Abstract: Automatic speech recognition, often incorrectly called voice recognition, is a computer-based software technique that analyzes audio signals captured by a microphone and translates them into machine-interpreted text. Speech processing relies on techniques that need either a local CPU or cloud computing with an Internet link. An activation word ("OK Google", "Alexa", …) starts the uplink, and voice analysis is not usually suitable for autonomous, low-energy systems with limited CPU (16-bit microcontrollers). To achieve th…

Cited by 5 publications (3 citation statements)
References: 13 publications

“…A less CPU-intensive alternative to full speech recognition is keyword detection, where only a pre-defined vocabulary of spoken words is recognized. Such systems can even run on devices with much lower computational power than smartphones, such as 16-bit microcontrollers [25]. It has been argued that it would still be too taxing for mobile devices to listen out for the "millions or perhaps billions" of targetable keywords that could potentially be dropped in private conversations [51].…”
Section: Technical and Economic Feasibility (mentioning)
confidence: 99%
“…Hence, many applications have been created in which speech to text technology plays an essential role [2][3]. These applications provide services, such as voice search, speech translation, personal assistant, and gaming [4][5]. The ASR systems comprise of four conceptually distinct stages: signal processing, feature extraction, acoustic model, and N-gram language model [6][7].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, the environment of the audio signals is the main cause of noise and contrast in the speech signal [16]. The noise types may result from hundreds of sources, such as microphone quality, speaker characteristics, background sounds, and dialect differences [4]. Furthermore, various types of noise give different levels of errors, making it difficult to implement a filter technique for each type of noise or training the ASR on them [14].…”
Section: Introduction (mentioning)
confidence: 99%