2020
DOI: 10.11591/eecsi.v7.2041

Speech Recognition Implementation Using MFCC and DTW Algorithm for Home Automation

Abstract: The use of speech recognition as part of home automation, especially for smart homes, is an exciting thing that is still being developed. That is because of human needs for comfort, convenience, quality of life, and better safety. Speech recognition built in this study is used as a device to control smart home devices by identifying the commands spoken by users, especially in a state of clean speech. The command used is a predetermined consecutive word. For the extraction of voice commands, the MFCC algorithm …

Cited by 15 publications (7 citation statements)
References 15 publications

“…Feature extraction in SER is very important as it helps to improve recognition accuracy and performance of speech signals [9]. The features we use in this research are MFCC, chromagram, Mel-spectrogram, spectral contrast, and tonnetz because they are the best features from previous research [4] and MFCC and Mel-spectrogram are widely used in SER [10], [11]. Spectral contrast can be defined as the decibel difference between peaks and valleys in a spectrum [12].…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…simultaneously, such as speech characteristics, language content, facial expressions, and body movements, SER is essentially a complicated multimodal task [4].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…The MFCC has become one of the effective features in gear fault detection, for instance, Benkedjouh et al extracted the MFCC feature and fed it to the SVM and claimed that the first three MFCC components contain the most defect information of gears [163]. However, based on the research of Abdul et al, 1-13 MFCC are more effective to be taken to train LSTM [164] and Jin et al evaluated some sets of MFCCs (16,21,26,31,36…”
Section: Gear Health Monitoring | Citation type: mentioning
confidence: 99%
“…The DTW is able to calculate the distance between two-time series and is thus a common method to measure similarity [42,43,47]. This method intends to find the optimal alignment of two temporal sequences with different lengths and speeds [48], which results in better performance and more meaningful discrepancy distances than other approaches [42,49]. The DTW result represents the distance value in the scalar quantity [50], which is employed to measure how similar two diffusion trends are in time sequences.…”
Section: Comparing the Similarity Of Trend Comparison | Citation type: mentioning
confidence: 99%
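
The DTW description in the last citation statement (an optimal alignment of two sequences of different lengths, summarised as a single scalar distance) can be illustrated with a minimal sketch. This is the textbook dynamic-programming formulation, not necessarily the variant used in the cited paper or in the home-automation study. The function name `dtw_distance`, the absolute-difference local cost, and the one-dimensional inputs are assumptions made for illustration; a speech system would more plausibly compare sequences of MFCC vectors with a vector distance.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Illustrative sketch: assumes 1-D sequences and an absolute-difference
    local cost (both are assumptions, not taken from the source).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again (b repeats)
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again (a repeats)
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# The same shape spoken at a different speed aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path may repeat elements of either sequence, the two inputs do not need to have the same length, which is the property the statement above highlights. The result is a single number, so a command recogniser could, under these assumptions, pick the stored template with the smallest distance to the spoken input.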