2015 34th Chinese Control Conference (CCC) 2015
DOI: 10.1109/chicc.2015.7260187

Speech-oriented negative emotion recognition

Abstract: A standard back-propagation (BP) network is easily trapped in a local optimum. Two main approaches are commonly used to improve its performance. One is to employ numerical optimization methods; this approach is simple and fast, but it places heavy demands on computational storage and cannot guarantee convergence. The other is to employ gradient descent methods; this approach can reach a global minimum with high probability, but it is more likely to cause oscillations, and its parameters are hard to determine. Mo…
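The local-optimum problem the abstract opens with can be illustrated with a toy sketch. This is not the paper's network or training method: the one-dimensional loss `x**4 - 3*x**2 + x`, the learning rate, and the function names are all assumptions chosen only to show that plain gradient descent settles in whichever basin its starting point belongs to.

```python
# Toy illustration (assumed example, not from the paper): plain gradient
# descent on a non-convex 1-D loss with two basins of different depth.

def loss(x):
    # Hypothetical non-convex loss with a local and a global minimum.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the loss above.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    # Fixed-step gradient descent starting from x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


x_right = gradient_descent(2.0)    # starts in the shallower basin
x_left = gradient_descent(-2.0)    # starts in the deeper basin

print(x_right, loss(x_right))
print(x_left, loss(x_left))
```

Under these assumptions, the run started at 2.0 stops at the shallower minimum near x ≈ 1.13, while the run started at -2.0 reaches the deeper one near x ≈ -1.30. The same update rule gives a worse final loss purely because of initialization.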

Cited by 2 publications (1 citation statement)
References 16 publications (13 reference statements)
“…Gender data was also leveraged in [11], and data augmentation obtained through an adversarial network has been reported as a successful strategy [12]. In [13], a much smaller feature set comprising only four statistical values for the estimated pitch, the first two formants, the energy, and the zero-crossing rate (ZCR) were used together with feed-forward MLP neural networks, but trained with a modified backpropagation algorithm based on genetic algorithm (GA) principles, with a focus on negative emotions. A different approach was taken in [14], operating on the raw time-domain audio signal to extract linear prediction descriptors processed through a Gammatone filterbank before being applied to a spiking neural network (SNN) and liquid state machine (LSM) hybrid model.…”
Section: Introduction (mentioning)
confidence: 99%
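The citation statement above describes a compact feature set built from per-utterance statistics of pitch, formants, energy, and zero-crossing rate. A minimal sketch of two of those descriptors (short-time energy and ZCR) follows. The statement does not say which four statistics were used, so the choice of mean, standard deviation, minimum, and maximum is an assumption, as are the function names and the framing parameters. Pitch and formant estimation are omitted.

```python
import statistics


def split_frames(signal, frame_len, hop):
    # Split a list of samples into overlapping fixed-length frames.
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    # Mean squared amplitude of one frame.
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def summarize(values):
    # ASSUMPTION: the source does not name the four statistics.
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


def energy_zcr_features(signal, frame_len, hop):
    # Frame-level descriptors reduced to utterance-level statistics.
    frames = split_frames(signal, frame_len, hop)
    return {
        "energy": summarize([short_time_energy(f) for f in frames]),
        "zcr": summarize([zero_crossing_rate(f) for f in frames]),
    }


# Usage on a synthetic signal that changes sign at every sample.
features = energy_zcr_features([1.0, -1.0] * 8, frame_len=4, hop=2)
print(features)
```

The resulting dictionary holds eight numbers for these two descriptors. A classifier such as the feed-forward MLP mentioned in the statement would take a flattened vector of such statistics as its input.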