Abstract: It is common for human beings to use gestures as a means of expression, as a complement to speech, or as a self-contained communication mode. In the field of Human-Computer Interaction, this behavior can be adopted to build alternative interfaces aimed at easing the relationship between the human and the computational element. Various gesture recognition techniques are currently described in the technical literature; however, validation studies of these techniques are usually performed…
“…Since SVMs [29] have gained much attention in recent times due to their powerful generalization capabilities as gesture classifiers [16], [18], we evaluate different feature learning schemes using SVMs. The following approaches are evaluated in this paper using our dataset: (i) the authors in [30], [31], [32] use Hu invariant moments for feature learning from images of different objects and gestures; (ii) unsupervised feature learning is applied by the authors in [33] using the Spatial Pyramid (generally referred to as Bag of Features or Bag of Words (BoW)), a combination of SIFT and k-means; (iii) shape properties of objects such as roundness, form factor, compactness, eccentricity, perimeter, and solidity are used by the authors in [31], [34]; (iv) skeletonization has been proposed by the authors in [35], [36] for gesture recognition tasks such as counting the number of fingers; (v) the Pyramid of Histograms of Oriented Gradients (PHOG) [37], a variant of the well-known HOG descriptor [38], gained popularity for its vectorized HOG feature learning approach; (vi) the Fast Fourier Transform (FFT) has been used by the authors in [39] to represent the shape of the hand contour in the frequency domain; (vii) Tiled CNNs [40] are supervised feature learners and classifiers able to learn complex invariances such as scale and rotation invariance.…”
Section: A. Existing Approaches
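As an illustration of approach (i) in the excerpt above, the following is a minimal sketch of Hu-moment feature extraction using OpenCV; the input file name, the Otsu thresholding step, and the log-scaling of the moments are illustrative assumptions, not details taken from the cited papers [30], [31], [32].

```python
import cv2
import numpy as np

def hu_moment_features(binary_image: np.ndarray) -> np.ndarray:
    """Return the 7 Hu invariant moments of a binary silhouette.

    Log-scaling (an assumption here) is commonly used to compress the
    moments' large dynamic range before classification, e.g. with an SVM.
    """
    moments = cv2.moments(binary_image, binaryImage=True)
    hu = cv2.HuMoments(moments).flatten()
    # Signed log transform; the epsilon guards against log(0).
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

# Hypothetical usage: threshold a grayscale hand image, then extract.
img = cv2.imread("hand.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
features = hu_moment_features(mask)  # 7-D, scale/rotation invariant
```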
“…Recently, different research efforts on 2D appearance model-based methods for gesture recognition have emerged [9], [10], [11], [12], [13], [14], [15], amongst which supervised and unsupervised learning techniques such as Neural Networks (NNs), Support Vector Machines (SVMs) and Nearest-Neighbor classifiers [16], [17], [18] have gained popularity. However, feature learning is not part of such classification schemes and must be performed separately to compute features such as edges, gradients, pixel intensities and object shape.…”
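The separation this excerpt describes, with features computed first and the classifier trained afterwards, can be sketched as follows. This is a minimal sketch assuming scikit-image's HOG descriptor as the hand-crafted feature and scikit-learn's SVM as the classifier; the arrays are placeholder data, not a real gesture dataset.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def extract_hog(images: np.ndarray) -> np.ndarray:
    """Feature step: one HOG descriptor per grayscale image."""
    return np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for im in images])

# Placeholder dataset: (N, H, W) grayscale gestures and (N,) class ids.
train_imgs = np.random.rand(20, 64, 64)
train_labels = np.random.randint(0, 6, 20)

clf = SVC(kernel="rbf")                          # classification step
clf.fit(extract_hog(train_imgs), train_labels)   # features fed in separately
pred = clf.predict(extract_hog(train_imgs[:2]))
```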
Abstract: Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a state-of-the-art big and deep neural network (NN) combining convolution and max-pooling (MPCNN) for supervised feature learning and classification of hand gestures given to mobile robots by humans wearing colored gloves. The hand contour is retrieved by color segmentation, then smoothed by morphological image processing, which eliminates noisy edges. Our big and deep MPCNN classifies 6 gesture classes with 96% accuracy, nearly three times better than the nearest competitor. Experiments with mobile robots using an ARM 11 533 MHz processor achieve real-time gesture recognition performance.
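A minimal sketch of the preprocessing stage this abstract outlines (color segmentation of the glove followed by morphological smoothing of the hand contour), assuming OpenCV; the HSV range and the input frame are hypothetical and would need tuning to the actual glove color.

```python
import cv2

frame = cv2.imread("frame.png")                  # hypothetical input frame
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Segment the colored glove (example range for a red glove; tune per glove).
mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))

# Morphological opening removes speckle noise, closing fills small holes,
# smoothing the contour before it reaches the classifier.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# Keep the largest external contour as the hand.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
hand = max(contours, key=cv2.contourArea) if contours else None
```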
“…However, HAR is restricted to the small part of the environment where the HAR sensor has been placed. Absolute anchoring approaches are principally vision-based methods and can be further categorized by how they model variation in time: direct classification methods classify image features without using temporal information, so HAR is usually performed on each frame individually, as in [12], [13], [14] and [15]; temporal state-space methods, by contrast, treat time as an explicit dimension, where every observation corresponds to an image representation at a given time, as in [16], [17], [18] and [19].…”
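The distinction this excerpt draws can be summarized schematically; here `extract`, `clf`, and `seq_model` are hypothetical stand-ins for a feature extractor, a per-frame classifier, and a sequence model (e.g., an HMM decoder), none of which come from the cited works.

```python
import numpy as np

def direct_classification(frames, extract, clf):
    """Per-frame HAR: each frame is labeled independently, no temporal model."""
    return [clf.predict(extract(f).reshape(1, -1))[0] for f in frames]

def temporal_classification(frames, extract, seq_model):
    """Temporal state-space HAR: the whole observation sequence is scored."""
    observations = np.stack([extract(f) for f in frames])  # shape (T, D)
    # `most_likely_activity` is a hypothetical method, e.g. Viterbi decoding
    # in an HMM; it consumes the full sequence rather than single frames.
    return seq_model.most_likely_activity(observations)
```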
Human activity recognition from movement-related signals or image sequences is a challenging problem in computer vision. Human activities can be decoded from a variety of communication channels, but it has been shown that the head plays a prominent role in emphasizing the message being communicated. Recognizing activities from head movements is well suited to this task because the head has a nearly constant shape and appearance during communication. The spatiotemporal segmentation of head movements can also be performed by analyzing their trajectories. In this study, we present a general model for the description and recognition of head movements. The basic idea is extended by introducing a human activity database to support better decisions during recognition. The proposed approach takes into consideration facial regions that encode essential information about head movements. The essence of head movements is extracted from a motion history image representation and aligned by dynamic time warping. The efficiency of our system is also demonstrated by the recognition of head-drawn letters.
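The two core ingredients named in this abstract, motion history images and dynamic time warping, can be sketched as follows; the frame-differencing motion detector and the parameter values are assumptions, since the abstract does not specify them.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=30.0, diff_thresh=25):
    """Decay the motion history image (MHI) one step, then stamp new motion.

    Pixels that moved in the latest frame get the full weight tau; older
    motion fades linearly, so the MHI encodes both where and how recently
    motion occurred.
    """
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    mhi = np.maximum(mhi - 1.0, 0.0)     # older motion fades out
    mhi[motion > diff_thresh] = tau      # fresh motion gets full weight
    return mhi

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping between 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```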