2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.2004.1326055

Automatic emotional speech classification

Abstract: Our purpose is to design a useful tool which can be used in psychology to automatically classify utterances into five emotional states: anger, happiness, neutral, sadness, and surprise. The major contribution of the paper is to rate the discriminating capability of a set of features for emotional speech recognition. A total of 87 features have been calculated over 500 utterances from the Danish Emotional Speech database. The Sequential Forward Selection method (SFS) has been used in order to discover a s…
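The feature-ranking step described in the abstract can be sketched as a greedy wrapper search. The snippet below is a minimal illustration only, assuming a scikit-learn classifier and a synthetic stand-in feature matrix (500 utterances × 87 features); it is not the authors' implementation.

```python
# Minimal greedy Sequential Forward Selection (SFS) sketch.
# The data below are synthetic stand-ins: 500 utterances x 87 features,
# with five hypothetical emotion labels; nothing here comes from the paper.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def sequential_forward_selection(X, y, n_select=5):
    """Greedily add the feature that most improves cross-validated accuracy."""
    selected, remaining = [], list(range(X.shape[1]))
    clf = KNeighborsClassifier(n_neighbors=5)  # any classifier could stand in here
    while remaining and len(selected) < n_select:
        scored = []
        for f in remaining:
            cols = selected + [f]
            acc = cross_val_score(clf, X[:, cols], y, cv=5).mean()
            scored.append((acc, f))
        _, best_f = max(scored)  # feature giving the highest accuracy this round
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 87))       # stand-in acoustic feature matrix
y = rng.integers(0, 5, size=500)     # anger, happiness, neutral, sadness, surprise
print(sequential_forward_selection(X, y, n_select=5))
```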

Cited by 158 publications (92 citation statements); references 2 publications.
“…Currently there are two main approaches to affective computing: audio-based techniques that determine emotion from the spoken word, described for example in [4,5,6], and video-based techniques that examine and classify facial expressions, described in [7,8,9]. More advanced systems are multi-modal and use a variety of microphones, video cameras, and other sensors to provide the machine with richer signals from the human [10,11,12].…”
Section: Introduction
confidence: 99%
“…Student's t-test for unequal variances has also found that the differences in the average assignment ratio per emotion are statistically significant in a 15-fold cross-validation experiment. Figure 1 depicts a partition of the 2D feature domain obtained after selecting the five best emotional features with the Sequential Forward Selection algorithm and applying Principal Component Analysis to reduce the dimensionality from five dimensions (5D) to two dimensions (2D) [16]. Only the samples that belong to the interquartile range of the probability density function of each class are shown.…”
Section: Results
confidence: 99%
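The dimensionality-reduction and plotting convention mentioned in this excerpt can be sketched as follows; the feature matrix, the labels, and the use of scikit-learn's PCA are illustrative assumptions, not the paper's code.

```python
# Sketch of the 5D -> 2D projection and interquartile-range plotting convention
# described above. X_selected and y are synthetic placeholders, not the paper's data.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X_selected = rng.normal(size=(1160, 5))   # stand-in for the five SFS-selected features
y = rng.integers(0, 5, size=1160)         # stand-in emotion labels

X_2d = PCA(n_components=2).fit_transform(X_selected)   # reduce 5D to 2D

def iqr_mask(values):
    """True for samples lying within the interquartile range of `values`."""
    q1, q3 = np.percentile(values, [25, 75])
    return (values >= q1) & (values <= q3)

# Keep, per class, only the samples inside the IQR box along both 2D axes.
for c in np.unique(y):
    idx = np.where(y == c)[0]
    keep = iqr_mask(X_2d[idx, 0]) & iqr_mask(X_2d[idx, 1])
    print(f"class {c}: {keep.sum()} of {idx.size} samples shown")
```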
“…A total of 1160 emotional speech patterns are extracted. Each pattern consists of a 90-dimensional feature vector [16]. Each emotional pattern is classified into one of five primitive emotional states: hot anger, happiness, neutral, sadness, and surprise.…”
Section: Data
confidence: 99%
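As a rough sketch of how such a dataset could be organized and classified, the snippet below mirrors the stated shapes (1160 patterns, 90 features, five classes); the synthetic data and the Gaussian naive-Bayes classifier are assumptions for illustration only, not the paper's method.

```python
# Illustrative organization of the dataset described above: 1160 patterns,
# 90-dimensional feature vectors, five emotion classes. Data are synthetic and
# the Gaussian naive-Bayes classifier is an assumption, not the paper's method.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
X = rng.normal(size=(1160, 90))                 # 90-dimensional feature vectors
classes = np.array(["hot anger", "happiness", "neutral", "sadness", "surprise"])
y = classes[rng.integers(0, 5, size=1160)]      # one emotion label per pattern

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```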
“…To obtain the energy statistics, we use a short-term function to extract the energy value in each speech frame. We can then obtain the statistics of energy over the whole speech sample, such as the mean value, maximum value, variance, variation range, and contour of the energy [6].…”
Section: Energy
confidence: 99%
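A minimal sketch of the short-term energy computation described here, assuming a 16 kHz signal with 25 ms frames and a 10 ms hop (values not given in the excerpt):

```python
# Short-term energy statistics sketch: frame the signal, compute per-frame energy,
# then summarize the energy contour. Signal, frame length and hop are assumptions.
import numpy as np

def short_term_energy(signal, frame_len=400, hop=160):
    """Per-frame energy: sum of squared samples in each frame."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.sum(frames ** 2, axis=1)

rng = np.random.default_rng(3)
x = rng.normal(size=16000)              # stand-in for a 1 s utterance at 16 kHz
energy = short_term_energy(x)           # the energy "contour" over the utterance
stats = {
    "mean": energy.mean(),
    "max": energy.max(),
    "variance": energy.var(),
    "range": energy.max() - energy.min(),
}
print(stats)
```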