Emotional information in speech can help robots understand a speaker's intentions in natural human-robot interaction. The human auditory system readily tracks the temporal dynamics of emotion by perceiving the intensity and fundamental frequency of speech, and focuses on salient emotional regions. Combining auditory and attention mechanisms may therefore be an effective approach to speech emotion recognition. Some previous studies used auditory-based static features to identify emotion but ignored emotion dynamics; others used attention models to capture salient emotional regions but ignored cognitive continuity. To fully exploit both mechanisms, we first investigate temporal modulation cues extracted from auditory front-ends and then propose a joint deep learning model that combines 3D convolutions with attention-based sliding recurrent neural networks (ASRNNs) for emotion recognition. Experiments on the IEMOCAP and MSP-IMPROV datasets indicate that the proposed method can effectively recognize emotions in speech from temporal modulation cues. A subjective evaluation shows that the attention patterns of the model are largely consistent with human behavior in recognizing emotions.
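To make the architecture concrete, the sketch below shows one plausible reading of the 3D-convolution front end followed by an attention-based recurrent network, written in PyTorch. All layer sizes, the input shape (time x frequency x modulation channels), the choice of a GRU, and the additive attention are illustrative assumptions rather than the paper's actual configuration, and the sliding-segment mechanism of the ASRNN is simplified here to a single utterance-level attention pooling.

```python
import torch
import torch.nn as nn

class ConvASRNNSketch(nn.Module):
    """Hypothetical sketch: 3D convolutions over a temporal-modulation
    representation, then an attention-weighted recurrent network.
    Layer sizes and input shape are assumptions, not the paper's values."""

    def __init__(self, n_classes=4, hidden=128):
        super().__init__()
        # 3D conv front end over (batch, 1, time, freq, modulation)
        self.conv = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),  # pool freq/modulation, keep time
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
        )
        # Recurrent layer over the time axis of the conv features
        self.rnn = nn.GRU(input_size=32 * 8 * 4, hidden_size=hidden,
                          batch_first=True, bidirectional=True)
        # Additive attention: one weight per time step, to focus on
        # salient emotional regions (simplified vs. sliding segments)
        self.attn = nn.Linear(2 * hidden, 1)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, 1, time, freq=32, mod=16) -- assumed input shape
        h = self.conv(x)                               # (B, 32, T, 8, 4)
        b, c, t, f, m = h.shape
        h = h.permute(0, 2, 1, 3, 4).reshape(b, t, c * f * m)
        seq, _ = self.rnn(h)                           # (B, T, 2*hidden)
        w = torch.softmax(self.attn(seq).squeeze(-1), dim=1)  # (B, T)
        ctx = (w.unsqueeze(-1) * seq).sum(dim=1)       # attention pooling
        return self.out(ctx), w                        # logits, weights

model = ConvASRNNSketch()
dummy = torch.randn(2, 1, 100, 32, 16)  # 2 utterances, 100 frames
logits, attention = model(dummy)
print(logits.shape, attention.shape)  # (2, 4) and (2, 100)
```

The returned attention weights are what a subjective evaluation like the one described above could compare against human judgments of which regions of an utterance carry the emotion.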