Original citation: Fan, Xijian and Tjahjadi, Tardi (2015). A spatial-temporal framework based on histogram of gradients and optical flow for facial expression recognition in video sequences. Pattern Recognition, 48 (11).
“…The tables show that the framework using the simple fusion strategy of two features performs better than using each feature separately, and that the proposed fusion strategy achieves the best performance. In Table 6, we compare the proposed feature with the method of Eskil et al. [43], the static method of Lucey et al. [32] and our previous work [44], which shows that the fused feature achieves an average recognition rate of 88.30% for all seven facial expressions and outperforms the other methods. Thus, we can also conclude that the combination of the two dynamic features improves the recognition rate.…”
Section: Results (mentioning)
confidence: 96%
“…Thus, we can also conclude that the combination of two dynamic features improves the recognition rate. We also conducted an experiment on the MMI dataset, comparing the proposed framework with the method that uses LBP and SVM [37], and the methods in [45] and [44] that are evaluated using the same classification strategy of 10-fold cross-validation. The average recognition rates are shown in Table 7.…”
Section: Results (mentioning)
confidence: 99%
“…The table shows that the proposed framework outperforms all the other five methods. The result for LBP was obtained using different samples to those used in [37], and using the same classification strategy introduced in [45], which is also used in [44] and the proposed method. Although CK+ and MMI are two of the most widely used datasets for evaluating facial expression recognition methods, they are both collected in strictly controlled settings with near-frontal poses, consistent illumination and posed expressions.…”
A dynamic descriptor facilitates robust recognition of facial expressions in video sequences. The two main current approaches to the recognition are basic emotion recognition and recognition based on facial action coding system (FACS) action units. In this paper we focus on basic emotion recognition and propose a spatiotemporal feature based on the local Zernike moment in the spatial domain and motion change frequency. We also design a dynamic feature comprising the motion history image and entropy. To recognise a facial expression, a weighting strategy based on the latter feature and a sub-division of the image frame is applied to the former to enhance the dynamic information of the facial expression, followed by the application of the classical support vector machine. Experiments on the CK+ and MMI datasets using a leave-one-out cross-validation scheme demonstrate that the integrated framework achieves better performance than using each descriptor separately. Compared with six state-of-the-art methods, the proposed framework demonstrates superior performance.
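The abstract above combines a motion history image (MHI) with an entropy measure to weight face sub-regions by how much dynamic information they carry. As a rough illustration of those two ingredients only — the threshold, decay, and histogram settings below are placeholder values, not the paper's parameters — a minimal NumPy sketch might look like:

```python
import numpy as np

def motion_history_image(frames, threshold=0.1, decay=1.0 / 8):
    """Accumulate a motion history image over a grayscale frame
    sequence: pixels that moved recently are bright, older motion
    fades linearly.  `threshold` and `decay` are illustrative
    values, not the paper's settings."""
    mhi = np.zeros_like(frames[0], dtype=float)
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr - prev) > threshold  # binary motion mask
        mhi = np.where(moving, 1.0, np.clip(mhi - decay, 0.0, 1.0))
    return mhi

def region_entropy(mhi, bins=16):
    """Shannon entropy of the MHI intensity histogram: a rough proxy
    for how much dynamic information a face sub-region carries."""
    hist, _ = np.histogram(mhi, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```

In the framework described above, an entropy-derived weight per sub-region would then scale the corresponding spatial features before SVM classification; the exact weighting formula is given in the paper, not here.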
“…In [19], Xijian and Tjahjadi extended the spatial pyramid histogram of gradients to three-dimensional facial features. They captured both the spatial and motion information of facial expressions by integrating the extracted features with dense optical flow.…”
Human facial expression is an important means of non-verbal communication and conveys far more information visually than vocally. Facial expression recognition plays a vital role in human-machine interaction, yet recognising facial expressions by machine remains a difficult task. Face detection, feature extraction and expression classification are the three main stages in the process of Facial Expression Recognition (FER). This survey mainly covers recent work on FER techniques. It especially focuses on performance, including the efficiency and accuracy of face detection, feature extraction and classification methods. Povzetek: This paper presents a comparative study of facial expression recognition techniques.
“…Works that exploit video data focus on the importance of the temporal evolution of the input face. The system proposed by Fan and Tjahjadi [3] processes four sub-regions of the face: forehead, eyes/eyebrows, nose and mouth. They used an extension of the spatial pyramid histogram of gradients and dense optical flow to extract spatial and dynamic features from video sequences, and adopted a multi-class SVM-based classifier with one-to-one strategy to recognise facial expressions.…”
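The quoted passage mentions a multi-class SVM with a one-to-one (i.e. one-versus-one) strategy: one binary classifier is trained per pair of expression classes, and the class collecting the most pairwise votes wins. A minimal sketch of that voting step, with a hypothetical `decision_fns` interface standing in for trained binary SVMs:

```python
from itertools import combinations

def one_vs_one_predict(decision_fns, classes, x):
    """One-versus-one multi-class voting as commonly paired with
    binary SVMs.  `decision_fns[(a, b)]` is a stand-in for a trained
    pairwise classifier: it returns > 0 to vote for class a,
    otherwise class b.  This interface is illustrative, not the
    paper's implementation."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        winner = a if decision_fns[(a, b)](x) > 0 else b
        votes[winner] += 1
    return max(votes, key=votes.get)
```

With K expression classes this requires K(K-1)/2 pairwise classifiers; ties between vote counts would need an extra rule (e.g. comparing decision values), which is omitted here.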
Recognizing facial expressions from static images or video sequences is a widely studied but still challenging problem. The recent progress obtained by deep neural architectures, or by ensembles of heterogeneous models, has shown that integrating multiple input representations leads to state-of-the-art results. In particular, the appearance and the shape of the input face, or the representations of some face parts, are commonly used to boost the quality of the recognizer. This paper investigates the application of Convolutional Neural Networks (CNNs) with the aim of building a versatile recognizer of expressions in static images that can be further applied to video sequences. We first study the importance of different face parts in the recognition task, focussing on appearance and shape-related features. Then we cast the learning problem in the Semi-Supervised setting, exploiting video data where only a few frames are supervised. The unsupervised portion of the training data is used to enforce three types of coherence, namely temporal coherence, coherence among the predictions on the face parts, and coherence between the appearance- and shape-based representations. Our experimental analysis shows that coherence constraints can improve the quality of the expression recognizer, thus offering a suitable basis to profitably exploit unsupervised video sequences. Finally we present some examples with occlusions where the shape-based predictor performs better than the appearance-based one.
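One of the coherence terms described above, temporal coherence, penalises predictions that change abruptly between neighbouring frames of an unsupervised clip. The sketch below uses an assumed squared-difference form to convey the idea; it is not the paper's exact loss:

```python
import numpy as np

def temporal_coherence_penalty(probs):
    """Mean squared difference between class-probability predictions
    on consecutive frames of a clip.  `probs` has shape
    (num_frames, num_classes); smooth prediction sequences score
    near zero, abrupt changes score high.  An illustrative form of
    the temporal-coherence idea, not the paper's loss."""
    diffs = probs[1:] - probs[:-1]
    return float((diffs ** 2).sum() / max(len(probs) - 1, 1))
```

During semi-supervised training, a term like this on unlabeled frames would be added to the supervised loss, encouraging the CNN to produce stable expression predictions across a video.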