Abstract-Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state of the art methods accordingly. We also present the important datasets and the bench-marking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.
Human motion recognition is one of the most important branches of human-centered research activities. In recent years, motion recognition based on RGB-D data has attracted much attention. Along with the development in artificial intelligence, deep learning techniques have gained remarkable success in computer vision. In particular, convolutional neural networks (CNN) have achieved great success for image-based tasks, and recurrent neural networks (RNN) are renowned for sequence-based problems. Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data. In this paper, a detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based and RGB+D-based.As a survey focused on the application of deep learning to RGB-D-based motion recognition, we explicitly discuss the advantages and limitations of existing techniques. Particularly, we highlighted the methods of encoding spatialtemporal-structural information inherent in video sequence, and discuss potential directions for future research. body parts, in contrast with the few body parts that involved in gesture. An activity is composed by a sequence of actions. An interaction is a type of motion performed by two actors; one actor is human while the other may be human or an object. This implies that the interaction category will include human-human or human-object interaction."Hugging each other" and "playing guitar" are examples of these two kinds of interaction, respectively. Group activity is the most complex type of activity, and it may be a combination of gestures, actions and interactions. Necessarily, it involves more than two humans and from zero to multiple objects. Examples of group activities would include "two teams playing basketball" and "group meeting".Early research on human motion recognition was dominated by the analysis of still images or videos [2,144,132,99,44,176]. Most of these efforts used color and texture cues in 2D images for recognition. However, the task remains challenging due to problems posed by background clutter, partial occlusion, view-point, lighting changes, execution rate and biometric variation. This challenge remains even with current deep learning approaches [49,4].With the recent development of cost-effective RGB-D sensors, such as Microsoft Kinect TM and Asus Xtion TM , RGB-D-based motion recognition has attracted much attention. This is largely because the extra dimension (depth) is insensitive to illumination changes and includes rich 3D structural information of the scene. Additionally, 3D positions of body joints can be estimated from depth maps [114]. As a consequence, several methods based on RGB-D data have been proposed and the approach has proven to be a promising direction for human motion analysis.Several survey papers have summarized the research on human motion recognition...
A common way to model multiclass classification problems is to design a set of binary classifiers and to combine them. Error-Correcting Output Codes (ECOC) represent a successful framework to deal with these type of problems. Recent works in the ECOC framework showed significant performance improvements by means of new problem-dependent designs based on the ternary ECOC framework. The ternary framework contains a larger set of binary problems because of the use of a "do not care" symbol that allows us to ignore some classes by a given classifier. However, there are no proper studies that analyze the effect of the new symbol at the decoding step. In this paper, we present a taxonomy that embeds all binary and ternary ECOC decoding strategies into four groups. We show that the zero symbol introduces two kinds of biases that require redefinition of the decoding design. A new type of decoding measure is proposed, and two novel decoding strategies are defined. We evaluate the state-of-the-art coding and decoding strategies over a set of UCI Machine Learning Repository data sets and into a real traffic sign categorization problem. The experimental results show that, following the new decoding strategies, the performance of the ECOC design is significantly improved.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.