2018
DOI: 10.1109/lsp.2018.2860246
3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition

Cited by 401 publications (252 citation statements)
References 9 publications
“…Similarly, Zhao et al. [22] implemented an attention layer right after the RNNs to extract the most salient acoustic parts of the continuum. Apart from RNNs and DNNs, the attention layer was also integrated with CNNs [23,24]. All these works, nevertheless, were conducted using traditional hand-crafted features, and did not explicitly investigate the differences of attention in an MTL framework.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…They trained different models with different features to increase the accuracy up to 71%, but they used the same architecture that is used for computer-vision tasks. Chen et al. [42] developed an SER system using a 3D CNN architecture and trained the model to improve SER accuracy, but they also relied on a pooling scheme to build the network. Due to this limitation, we explored a plain CNN architecture to propose a new SER model that outperforms state-of-the-art results.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
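
The 3-D convolutional front end referenced in this excerpt can be illustrated with a minimal PyTorch sketch. It assumes a 3-D input that stacks static, delta, and delta-delta log-Mel features along a depth axis; the layer sizes and pooling configuration here are illustrative assumptions, not the cited authors' exact architecture.

import torch
import torch.nn as nn

class Conv3dBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=32):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm3d(out_ch)
        # Pool only over time and frequency; keep the 3-step feature depth.
        self.pool = nn.MaxPool3d(kernel_size=(1, 2, 2))

    def forward(self, x):
        # x: (batch, 1, depth=3, time, mel); depth holds the assumed
        # static/delta/delta-delta stack.
        return self.pool(torch.relu(self.bn(self.conv(x))))

x = torch.randn(4, 1, 3, 300, 40)   # 4 utterances, 300 frames, 40 Mel bands
print(Conv3dBlock()(x).shape)       # torch.Size([4, 32, 3, 150, 20])

The max-pooling here is the kind of scheme the excerpt criticizes; a "plain CNN" variant would simply drop the pooling layer and rely on strided convolutions or none at all.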
“…As can be seen in Figure 3, an attention layer has been added after the LSTM layer to score the importance of the sequence of high-level features to the final decision [36].…”
Section: Attention Layer (citation type: mentioning)
confidence: 99%
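
A minimal sketch of the attention pooling this excerpt describes, assuming a simple learned scoring vector over the per-frame LSTM outputs. This is one common parameterization; the exact form used in the citing paper's Figure 3 may differ.

import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        # One learned score per frame; softmax turns scores into weights.
        self.score = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, h):
        # h: (batch, time, hidden) -- per-frame LSTM outputs
        alpha = torch.softmax(self.score(h), dim=1)  # (batch, time, 1)
        return (alpha * h).sum(dim=1)                # weighted sum over time

lstm = nn.LSTM(input_size=128, hidden_size=64, batch_first=True)
attn = AttentionPooling(64)
h, _ = lstm(torch.randn(4, 300, 128))  # 4 utterances, 300 frames each
print(attn(h).shape)                   # torch.Size([4, 64])

The pooled vector then feeds the final emotion classifier, so frames with higher attention weights contribute more to the decision.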