2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) 2021
DOI: 10.1109/hora52670.2021.9461395

Effect of Dataset Size on Deep Learning in Voice Recognition

Cited by 10 publications (5 citation statements)
References 5 publications
“…The significance of the proposed voice augmentation technique was compared with the ordinary voice recognition augmentation and regularization techniques. Although the related works [30]- [31], [33] - [34] showed that the CNN model is an exemplary model for vocabulary-size speech recognition, we have proven that the fusion CNN-LSTM model is superior to the pure CNN and pure LSTM for two separate datasets. The LSTM model improved the inconsistent performance of the CNN model when CNN and LSTM were hybridized together.…”
Section: Introduction (mentioning)
confidence: 59%
“…Similarly, Wubet and Lian [32] showed that CNN is better than the SVM model for keyword recognition, and surprisingly, a hybrid of CNN-SVM outperformed pure CNN and pure SVM. Cayir and Navruz [33] investigated the influence of a limited size dataset for voice command recognition using 12 different voice commands ("down", "forward", "follow", "go", "left", "on", "off", "right", "stop", "up", and "yes"). Their experimental results showed that when the test dataset included native Turkish speakers, the test accuracy was 94.64% for a large dataset and 64.81% for a small dataset.…”
Section: Related Work (mentioning)
confidence: 99%
“…Given these circumstances, an ASR model should be designed to generalize effectively and recognize a wide range of voices even with limited data. In this context, employing a dataset rich in accent diversity will enhance the generalization capabilities of the designed model (Cayir & Navruz, 2021). Therefore, our study aimed to utilize extensive datasets encompassing participants from various accent groups.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, existing deep learning methods still have limitations in side-channel analysis. Deep learning has been well-established for tasks such as image processing [13,14] and speech recognition [15,16], but its application in cryptographic algorithms is relatively limited. Classical handwritten digit classification involves 10 classes, while the analysis of cryptographic keys requires exponentially more classification categories, making existing classification models less suitable.…”
Section: Introduction (mentioning)
confidence: 99%