2023
DOI: 10.32890/jict2023.22.1.3

Time-Distributed Attention-Layered Convolution Neural Network with Ensemble Learning using Random Forest Classifier for Speech Emotion Recognition

Abstract: Speech Emotion Recognition (SER) is the field of identifying human emotions from speech utterances. Human speech utterances are a combination of linguistic and non-linguistic information. Non-linguistic SER provides a generalized solution in human–computer interaction applications as it overcomes the language barrier. Machine learning and deep learning techniques were previously proposed for classifying emotions using handpicked features. To achieve effective and generalized SER, feature extraction can be perfo…

citations
Cited by 4 publications
(3 citation statements)
references
References 40 publications
“…Zheng et al (2018a) combined CNN with random forest for recognising emotion in speech, CNN was employed to extract the representative feature of emotion from speech data, while Random Forest was used to classify the extracted feature into basic emotion. A CNN and Random Forest hybrid was also used for speech emotion classification in (Yalamanchili et al, 2023). It was reported that CNN-RF performed better than the CNN model.…”
Section: Hybrid Models and Their Potential In Advancing Deep Learning (mentioning)
confidence: 99%
“…Therefore, a specific type of neural network, the recurrent neural network (RNN), uses previous outputs as inputs and is able to retain the previous time stamp information (Choi et al, 2017). Bhanusree et al (2023) An alternate gating mechanism with a mechanised GRU to resolve this issue was proposed (Chung et al, 2014), incorporating two gate operating mechanisms, the Update and Reset gates. The update gate eliminates the risk of vanishing gradient problems, whereas the reset gate allows for the continuous discarding of stored redundant information.…”
Section: Related Work (mentioning)
confidence: 99%
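
The gating mechanism described in the statement above can be sketched as a single GRU step. This is a minimal illustration only, assuming the formulation of Chung et al. (2014), in which the new state is (1 - z) * h_prev + z * h_cand; other references swap the roles of z and 1 - z. The function names, the parameter dictionary `p`, and the bias terms are hypothetical choices made for this sketch and are not taken from any of the cited papers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gate(W, U, b, x, h, activation):
    # activation(W x + U h + b), computed elementwise.
    return [activation(a + c + d) for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h_prev, p):
    """One GRU time step; p is a hypothetical dict of weights and biases."""
    # Update gate: how much of the state is replaced by the candidate.
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h_prev, sigmoid)
    # Reset gate: how much of the previous state feeds the candidate.
    r = gate(p["Wr"], p["Ur"], p["br"], x, h_prev, sigmoid)
    reset_h = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = gate(p["Wh"], p["Uh"], p["bh"], x, reset_h, math.tanh)
    # Interpolate between the previous state and the candidate state.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def zero_params(n_in, n_hidden):
    # All-zero parameters, useful only for checking the arithmetic by hand.
    p = {}
    for name in ("z", "r", "h"):
        p["W" + name] = [[0.0] * n_in for _ in range(n_hidden)]
        p["U" + name] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b" + name] = [0.0] * n_hidden
    return p


if __name__ == "__main__":
    params = zero_params(n_in=3, n_hidden=2)
    print(gru_step([0.1, 0.2, 0.3], [1.0, -2.0], params))
```

With all-zero parameters both gates equal 0.5 and the candidate state is 0, so the step simply halves the previous state. Pushing the update-gate bias strongly negative drives z toward 0 and the previous state passes through almost unchanged, which is the path that helps gradients survive across many time steps.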
“…The proposed ensemble model performed better as compared to Random Forest and CNN-LSTM. Bhanusree et al [23] proposed a model that used a time-distributed attention-layered CNN for feature extraction and a Random Forest for classification. The proposed model achieved classification accuracies of 92.2% and 90.3% on the RAVDESS and IEMOCAP datasets, respectively.…”
Section: Introduction (mentioning)
confidence: 99%