The electroencephalogram (EEG) has great attraction in emotion recognition studies due to its resistance to deceptive actions of humans. This is one of the most significant advantages of brain signals in comparison to visual or speech signals in the emotion recognition context. A major challenge in EEG-based emotion recognition is that EEG recordings exhibit varying distributions for different people as well as for the same person at different time instances. This nonstationary nature of EEG limits the accuracy of it when subject independency is the priority. The aim of this study is to increase the subject-independent recognition accuracy by exploiting pretrained state-of-the-art Convolutional Neural Network (CNN) architectures. Unlike similar studies that extract spectral band power features from the EEG readings, raw EEG data is used in our study after applying windowing, pre-adjustments and normalization. Removing manual feature extraction from the training system overcomes the risk of eliminating hidden features in the raw data and helps leverage the deep neural network’s power in uncovering unknown features. To improve the classification accuracy further, a median filter is used to eliminate the false detections along a prediction interval of emotions. This method yields a mean cross-subject accuracy of 86.56% and 78.34% on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED) for two and three emotion classes, respectively. It also yields a mean cross-subject accuracy of 72.81% on the Database for Emotion Analysis using Physiological Signals (DEAP) and 81.8% on the Loughborough University Multimodal Emotion Dataset (LUMED) for two emotion classes. Furthermore, the recognition model that has been trained using the SEED dataset was tested with the DEAP dataset, which yields a mean prediction accuracy of 58.1% across all subjects and emotion classes. Results show that in terms of classification accuracy, the proposed approach is superior to, or on par with, the reference subject-independent EEG emotion recognition studies identified in literature and has limited complexity due to the elimination of the need for feature extraction.
Multimodal emotion recognition has gained traction in affective computing research community to overcome the limitations posed by the processing a single form of data and to increase recognition robustness. In this study, a novel emotion recognition system is introduced, which is based on multiple modalities including facial expressions, galvanic skin response (GSR) and electroencephalogram (EEG). This method follows a hybrid fusion strategy and yields a maximum one-subject-out accuracy of 81.2% and a mean accuracy of 74.2% on our bespoke multimodal emotion dataset (LUMED-2) for 3 emotion classes: sad, neutral and happy. Similarly, our approach yields a maximum one-subject-out accuracy of 91.5% and a mean accuracy of 53.8% on the Database for Emotion Analysis using Physiological Signals (DEAP) for varying numbers of emotion classes, 4 in average, including angry, disgust, afraid, happy, neutral, sad and surprised. The presented model is particularly useful in determining the correct emotional state in the case of natural deceptive facial expressions. In terms of emotion recognition accuracy, this study is superior to, or on par with, the reference subject-independent multimodal emotion recognition studies introduced in the literature.
The quality assessment of impaired stereoscopic video is a key element in designing and deploying advanced immersive media distribution platforms. A widely accepted quality metric to measure impairments of stereoscopic video is, however, still to be developed. As a step toward finding a solution to this problem, this paper proposes a full reference stereoscopic video quality metric to measure the perceptual quality of compressed stereoscopic video. A comprehensive set of subjective experiments is performed with 14 different stereoscopic video sequences, which are encoded using both the H.264 and high efficiency video coding compliant video codecs, to develop a subjective test results database of 116 test stimuli. The subjective results are analyzed using statistical techniques to uncover different patterns of subjective scoring for symmetrically and asymmetrically encoded stereoscopic video. The subjective result database is subsequently used for training and validating a simple but effective stereoscopic video quality metric considering heuristics of binocular vision. The proposed metric performs significantly better than state-of-the-art stereoscopic image and video quality metrics in predicting the subjective scores. The proposed metric and the subjective result database will be made publicly available, and it is expected that the proposed metric and the subjective assessments will have important uses in advanced 3D media delivery systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.