We study the cross-database speech emotion recognition based on online learning. How to apply a classifier trained on acted data to naturalistic data, such as elicited data, remains a major challenge in today’s speech emotion recognition system. We introduce three types of different data sources: first, a basic speech emotion dataset which is collected from acted speech by professional actors and actresses; second, a speaker-independent data set which contains a large number of speakers; third, an elicited speech data set collected from a cognitive task. Acoustic features are extracted from emotional utterances and evaluated by using maximal information coefficient (MIC). A baseline valence and arousal classifier is designed based on Gaussian mixture models. Online training module is implemented by using AdaBoost. While the offline recognizer is trained on the acted data, the online testing data includes the speaker-independent data and the elicited data. Experimental results show that by introducing the online learning module our speech emotion recognition system can be better adapted to new data, which is an important character in real world applications.
SUMMARYTo improve the recognition rate of the speech emotion, new spectral features based on local Hu moments of Gabor spectrograms are proposed, denoted by GSLHu-PCA. Firstly, the logarithmic energy spectrum of the emotional speech is computed. Secondly, the Gabor spectrograms are obtained by convoluting logarithmic energy spectrum with Gabor wavelet. Thirdly, Gabor local Hu moments(GLHu) spectrograms are obtained through block Hu strategy, then discrete cosine transform (DCT) is used to eliminate correlation among components of GLHu spectrograms. Fourthly, statistical features are extracted from cepstral coefficients of GLHu spectrograms, then all the statistical features form a feature vector. Finally, principal component analysis (PCA) is used to reduce redundancy of features. The experimental results on EmoDB and ABC databases validate the effectiveness of GSLHu-PCA.
While great success has been demonstrated in numerous tracking algorithm, some challenging problems still remain such as motion, shape deformation and occlusion. In this paper, the proposed algorithm can work robustly to overcome the occlusion and fast movement in real-world scenarios. A discriminative model based on the Gaussian superpixel model is constructed to descript the change of the target and the background. The calculation of the superpixel's weight uses the special information and a penalize factor. Furthermore, the tracking result is selected from the candidates generated under the framework of particle filter. The candidate with the highest score would be set to the tracking result. Besides, the update strategy which updates according to the trend of candidate's score is adaptive in order to suit for the change of the target. The experimental results demonstrate that the proposed algorithm performs more stable compared with several state-of-the-art algorithms when dealing with occlusion and fast movement.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.