2018
DOI: 10.11591/ijeecs.v10.i2.pp554-561

Speech Emotion Recognition Using Deep Feedforward Neural Network

Abstract: Speech emotion recognition (SER) is currently a research hotspot due to its challenging nature and promising future prospects. The objective of this research is to use deep neural networks (DNNs) to recognize emotion in human speech. First, the chosen speech features, Mel-frequency cepstral coefficients (MFCCs), were extracted from the raw audio data. Second, the extracted features were fed into the DNN to train the network. The trained network was then tested on a set of labelled emotional speech audio and the…
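As a rough illustration of the pipeline the abstract describes, the following is a minimal sketch assuming librosa for MFCC extraction and PyTorch for the feedforward network; the file names, label encoding, network sizes, and hyperparameters are hypothetical stand-ins, not the authors' actual configuration.

```python
# Minimal sketch of the MFCC + deep feedforward DNN pipeline described in the
# abstract. Assumes librosa and PyTorch; paths, labels, and hyperparameters
# are illustrative only.
import librosa
import numpy as np
import torch
import torch.nn as nn

def extract_mfcc(path, n_mfcc=13):
    """Load an audio file and return a fixed-size MFCC feature vector
    (mean over time of each coefficient)."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # shape: (n_mfcc,)

class EmotionDNN(nn.Module):
    """Simple deep feedforward classifier over MFCC features."""
    def __init__(self, n_features=13, n_emotions=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_emotions),
        )

    def forward(self, x):
        return self.net(x)

# Hypothetical training data: (wav path, integer emotion label) pairs.
dataset = [("happy_01.wav", 0), ("angry_01.wav", 1), ("sad_01.wav", 2)]
X = torch.tensor(np.stack([extract_mfcc(p) for p, _ in dataset]),
                 dtype=torch.float32)
y = torch.tensor([label for _, label in dataset])

model = EmotionDNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(100):  # illustrative epoch count
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```

A held-out set of labelled emotional recordings would then be passed through the same `extract_mfcc` step and scored with `model(X_test).argmax(dim=1)` to evaluate recognition accuracy, mirroring the train/test procedure the abstract outlines.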

Cited by 32 publications (22 citation statements); References 12 publications
“…The scope for future improvements in this field is very appealing. Different multimodal deep learning techniques can be used, along with different architectures, to improve the performance parameters [20][21][22][23][24][25][26][27]. Beyond recognizing the emotions alone, an intensity scale could also be added.…”
Section: Results
confidence: 99%
“…They examined various feature sets for distinguishing depression from unconstrained speech and found loudness and intensity features to be the most discriminative. Using the technique of [36], which introduced Voice Activity Detection (VAD) into preprocessing, the Speech Emotion Recognition (SER) rate achieved over five emotions (happy, angry, sad, fear, and neutral) was 91.7%. For creating a model based on speech indicators, it is vital to use a formalized dataset.…”
Section: Speech Indicators
confidence: 99%
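The VAD preprocessing step mentioned in the statement above could, for illustration, be realized as a simple short-time-energy gate. This is a minimal sketch assuming NumPy; the frame length and threshold factor are hypothetical choices, not the cited work's actual VAD method.

```python
# Minimal energy-based voice activity detection (VAD) sketch, assuming NumPy.
# Frame size and threshold factor are illustrative; the cited paper's exact
# VAD algorithm is not specified here.
import numpy as np

def energy_vad(signal, frame_len=400, threshold_factor=0.1):
    """Return a boolean mask per frame: True where speech is likely present.

    A frame counts as speech when its short-time energy exceeds a fraction
    of the utterance's peak frame energy."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames.astype(np.float64) ** 2).sum(axis=1)
    return energy > threshold_factor * energy.max()

def drop_silence(signal, frame_len=400, threshold_factor=0.1):
    """Concatenate only the frames flagged as speech, discarding silence."""
    mask = energy_vad(signal, frame_len, threshold_factor)
    frames = signal[: len(mask) * frame_len].reshape(len(mask), frame_len)
    return frames[mask].reshape(-1)
```

Feeding only the speech-active frames into the feature extractor is the intuition behind the reported accuracy gain: silent stretches contribute no emotional information but dilute the MFCC statistics.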
“…This has sparked interest in using automation to identify and classify animals from their vocalizations. Work in this area of bio-acoustic signal analysis has mostly concentrated on techniques similar to those used for processing speech signals [2], [3]. Signal segmentation of bio-acoustic sounds is routinely performed in order to isolate syllables [4], [5].…”
Section: Introduction
confidence: 99%
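As a rough illustration of the syllable-segmentation step that statement refers to, one common approach splits a recording at silent gaps. The sketch below assumes librosa; the file name and the `top_db` threshold are arbitrary illustrative values, not taken from the cited works.

```python
# Illustrative syllable isolation by splitting on silence, assuming librosa.
# The recording name and top_db threshold are hypothetical examples.
import librosa

y, sr = librosa.load("birdsong.wav", sr=None)       # hypothetical recording
intervals = librosa.effects.split(y, top_db=30)     # non-silent [start, end] pairs
syllables = [y[start:end] for start, end in intervals]
print(f"Isolated {len(syllables)} candidate syllables")
```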