2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP)
DOI: 10.1109/iscslp49672.2021.9362119
Speech Emotion Recognition Based on Acoustic Segment Model

Cited by 6 publications (4 citation statements)
References 23 publications
“…A deep learning-based language model outperforms standard methods when applied to the Bag-of-Words model. To address this problem, Siyuan Zheng et al. (2021) [17] present an acoustic segment model (ASM)-based technique for speech emotion recognition (SER), proposing a new SER paradigm built on ASM. Topic models such as LSA, from the field of information retrieval, can capture the relationship between a document and a word.…”
Section: General Architecture Of Speech Emotion Detection
confidence: 99%
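The statement above likens ASM-based SER to topic modeling: utterances play the role of documents and decoded acoustic segment units play the role of words. A minimal sketch of the LSA step, using truncated SVD on a toy term-document count matrix (the matrix values and dimensions here are hypothetical, not from the paper):

```python
import numpy as np

# Hypothetical toy term-document count matrix: rows = terms, cols = documents.
# In the ASM analogy, "terms" are acoustic segment units and "documents" are utterances.
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 1, 2],
], dtype=float)

# LSA: truncated SVD keeps the k largest singular values, projecting
# terms and documents into a shared low-dimensional latent space.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
doc_embeddings = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dim vector per document

print(doc_embeddings.shape)  # (4, 2): 4 documents, 2 latent dimensions
```

The resulting low-dimensional document vectors can then be fed to an emotion classifier, which is the general pattern the quoted work describes.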
“…In addition, several studies used CNN in combination with LSTM [17][18][19][20]. CNN, DCNN, and multi-channel CNN models were used in [21][22][23][24]. A combination of CNN and RNN models, yielding a CRNN model, was used in [25].…”
Section: Related Work
confidence: 99%
“…Regarding the feature parameters used for emotion recognition, some studies combine features of speech and textual data; these are the studies in [13,15,21,24,27,28]. A large number of studies used a spectrogram, a Mel-spectrogram, or a combination of a spectrogram and MFCCs as feature parameters [10,14,17,18,22,23,25,[29][30][31][32][33][34].…”
Section: Related Work
confidence: 99%
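The spectrogram-style features the statement above refers to all start from a framewise short-time spectrum. A minimal numpy-only sketch of a log-magnitude spectrogram (frame length, hop size, and the synthetic test tone are illustrative choices, not values from any cited study; a full MFCC pipeline would add a Mel filterbank and a DCT on top of this):

```python
import numpy as np

def stft_mag(signal, frame_len=256, hop=128):
    """Framewise magnitude spectrum via a Hann-windowed DFT (minimal STFT sketch)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len//2 + 1)

# Toy 1-second signal at 8 kHz: a 440 Hz tone standing in for real speech.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)

spec = stft_mag(signal)
log_spec = np.log(spec + 1e-10)  # log compression, as in log-spectrogram features
print(log_spec.shape)  # (61, 129): 61 frames, 129 frequency bins
```

Such a time-frequency matrix is what the CNN-based models in the surveyed studies typically consume as an image-like input.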
“…With the technologies available today, novel ways of interpreting emotions have been introduced. Speech emotion recognition systems are useful in psychiatric diagnosis, lie detection, call-center conversations, customer voice reviews, and voice messages [1].…”
Section: Introduction
confidence: 99%