Extracting and recognizing complex human movements such as sign language gestures from video sequences is a challenging task. In this paper, this difficult problem is approached with Indian Sign Language (ISL) videos. A new segmentation algorithm is developed by fusing features from the discrete wavelet transform (DWT) and local binary patterns (LBP). A 2D point cloud is formed from the fused features, representing the local hand shapes in consecutive video frames. We validate the proposed feature extraction model against state-of-the-art features such as HOG, SIFT and SURF for each sign video on the same ANN classifier, and find that the fused Haar-LBP features represent sign video data better than HOG, SIFT and SURF, owing to the combination of global and local features in the proposed feature matrix. The extracted features are input to an artificial neural network (ANN) classifier with labels forming the corresponding words. The proposed ANN classifier is tested against state-of-the-art classifiers such as AdaBoost, support vector machines (SVM) and other ANN methods on different features extracted from the ISL dataset. The classifiers were evaluated for accuracy in identifying the signs. The ANN classifier achieved a recognition rate of 92.79% with the maximum number of training instances, far exceeding existing work on sign language using other features and ANN classifiers on our ISL dataset.
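The fusion idea above can be sketched as follows: a one-level 2-D Haar approximation supplies the global component and per-pixel LBP codes supply the local component, and the two are concatenated into one feature vector. This is a minimal illustrative sketch, not the paper's actual implementation; the function names and the toy 4x4 "frame" are assumptions.

```python
# Hedged sketch: fusing a one-level 2-D Haar DWT approximation (global)
# with local binary pattern (LBP) codes (local), as the abstract describes.
# All names and the toy 4x4 frame below are illustrative assumptions.

def haar_dwt2_approx(img):
    """One-level 2-D Haar approximation (LL band): average of each 2x2 block."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

def lbp_codes(img):
    """Basic 8-neighbour LBP for interior pixels (bits assigned clockwise)."""
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            codes.append(code)
    return codes

def fused_features(img):
    """Concatenate global (Haar LL) and local (LBP) descriptors."""
    ll = [v for row in haar_dwt2_approx(img) for v in row]
    return ll + lbp_codes(img)

frame = [[10, 10, 50, 50],
         [10, 10, 50, 50],
         [90, 90, 30, 30],
         [90, 90, 30, 30]]
vec = fused_features(frame)  # 4 Haar LL coefficients + 4 LBP codes
```

In practice such vectors, one per frame, would form the 2D point cloud that feeds the ANN; here the point is only that the descriptor mixes a coarse global summary with fine local texture.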
Enrichment of knowledge becomes possible only through a literature search on any topic of interest. This is true even in the case of facial expression recognition, so we attempt a broad look at the origin and development of facial expression recognition in the recent past. It developed into a topic of interest only after the contributions made by Ekman and Friesen. Though many expressions are yet to be recognized, significant contributions have been made by eminent scholars to identify the six primary emotions, viz. happiness, sadness, fear, disgust, surprise and anger. In this paper, we attempt to bring to light some of the important contributions on the subject.
GV (Generative Video) is a framework for the analysis and synthesis of video sequences. In GV, the operational units are not the actual frames of the original sequence but world images, which carry the non-redundant information about the video sequence, together with ancillary data. The world images and the ancillary data form the generative video representation: the information needed to regenerate the original video sequence. A two-step iterative algorithm is used to obtain the generative video parameters. The first step estimates the background texture for a fixed object template; the second step estimates the object template for a fixed background, where the solution is given by a simple binary test evaluated at each pixel. The algorithm converges in a few iterations, typically three to five.
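The two-step alternation above can be sketched in miniature. This is a hedged toy version under strong simplifying assumptions (1-D "frames", a template that is static across frames, a fixed deviation threshold); the real GV formulation registers the template per frame, and all names here are illustrative.

```python
# Hedged sketch of the two-step iterative GV estimation loop.
# Simplifications (assumptions, not the paper's method): frames are 1-D,
# the template is static across frames, and the binary test is a fixed
# deviation threshold against the current background estimate.

def estimate_background(frames, template):
    """Step 1: with the template fixed, average the uncovered pixels per site.
    Sites covered by the template get a 0.0 placeholder (no observation)."""
    n = len(frames[0])
    bg = []
    for p in range(n):
        vals = [f[p] for f in frames if not template[p]]
        bg.append(sum(vals) / len(vals) if vals else 0.0)
    return bg

def estimate_template(frames, bg, thresh=10.0):
    """Step 2: binary test at each pixel -- mark sites that ever deviate
    from the background estimate by more than the threshold."""
    n = len(frames[0])
    return [any(abs(f[p] - bg[p]) > thresh for f in frames) for p in range(n)]

def generative_video_fit(frames, iters=5):
    """Alternate the two steps until the template stops changing."""
    template = [False] * len(frames[0])
    for _ in range(iters):  # typically converges in three to five iterations
        bg = estimate_background(frames, template)
        new_template = estimate_template(frames, bg)
        if new_template == template:
            break
        template = new_template
    return bg, template

frames = [[0, 0, 100, 0],
          [0, 100, 0, 0]]
bg, tpl = generative_video_fit(frames)  # tpl marks the moving-object sites
```

The structure mirrors the abstract: each pass holds one unknown fixed while solving for the other, and the template update reduces to an independent binary decision at every pixel.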