LBVCNN: Local Binary Volume Convolutional Neural Network for Facial Expression Recognition From Image Sequences

Kumawat, Sudhakar; Verma, Manisha; Raman, Shanmuganathan

doi:10.1109/cvprw.2019.00030

Cited by 36 publications

(31 citation statements)

References 40 publications

“…The average accuracy of 10 runs for seven-class and eight-class are reported. Among the many previous works, some works such as STRNN [42], LBVCNN [41], TPOEM [38], PHRNN-MSCNN [39], and SAANet [43] used image sequence as their experimental data, while others used a static image. Although Specific preprocessing [16], ALAW [22], Feature loss [28], OAENet [35], and S-DSRN [23] used seven expressions, contempt expression is replaced with neural.…”

Section: Methods (mentioning)

confidence: 99%

“…To overcome the limitation that traditional LBP can lose the neighboring pixels related to different scales that can affect the texture of facial images, Yasmin et al [30] proposed a new extended LBP method based on the bitwise “AND” operation of two rotational kernels to extract facial features. In view of satisfactory performance of the LBP operator, the CNNs that integrate advantages of the LBP have been developed [41, 55, 56]. Lyons et al [27] used a multiscale, multiorientation set of Gabor filters to code facial expression images through comparing the similarity space derived from semantic ratings of the images by human observers with the one derived from Gabor representation; authors believed that the latter shows a significant degree of psychological plausibility.…”

Section: Related Work (mentioning)

confidence: 99%

“…Facial expressions can be divided into six basic emotions, namely, anger (An); disgust (Di); fear (Fe); happiness (Ha); sadness (Sa); surprise (Su); and one neutral (Ne) emotion [9], contempt (Co), was subsequently added as one of the basic emotions [10]. Recognition of these emotions can be categorized into image-based [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37] and video-based [38, 39, 40, 41, 42, 43] approaches. Image-based approaches only use information about the static input image to determine the category of facial expression; on the other hand, except when the spatial features extracted from a static image are available, video-based approaches can also use temporal information of a dynamic image sequence to capture the temporal changes of facial appearance when some facial expression occurs.…”

Section: Introduction (mentioning)

confidence: 99%

“…FER can also be divided into the traditional method [15, 27, 30, 31, 32, 38, 40], deep learning method [16, 17, 18, 20, 21, 23, 24, 25, 26, 35, 36, 39, 41, 42, 43], or a combination of the two [11, 12, 22, 28, 29, 33, 37]. Traditional FER systems usually involve facial representation and expression classification.…”

Section: Introduction (mentioning)

confidence: 99%

Patch Attention Layer of Embedding Handcrafted Features in CNN for Facial Expression Recognition

Liang, Liu, et al. 2021, Sensors

Recognizing facial expression has attracted much more attention due to its broad range of applications in human–computer interaction systems. Although facial representation is crucial to final recognition accuracy, traditional handcrafted representations only reflect shallow characteristics and it is uncertain whether the convolutional layer can extract better ones. In addition, the policy that weights are shared across a whole image is improper for structured face images. To overcome such limitations, a novel method based on patches of interest, the Patch Attention Layer (PAL) of embedding handcrafted features, is proposed to learn the local shallow facial features of each patch on face images. Firstly, a handcrafted feature, Gabor surface feature (GSF), is extracted by convolving the input face image with a set of predefined Gabor filters. Secondly, the generated feature is segmented as nonoverlapped patches that can capture local shallow features by the strategy of using different local patches with different filters. Then, the weighted shallow features are fed into the remaining convolutional layers to capture high-level features. Our method can be carried out directly on a static image without facial landmark information, and the preprocessing step is very simple. Experiments on four databases show that our method achieved very competitive performance (Extended Cohn–Kanade database (CK+): 98.93%; Oulu-CASIA: 97.57%; Japanese Female Facial Expressions database (JAFFE): 93.38%; and RAF-DB: 86.8%) compared to other state-of-the-art methods.

“…Although these approaches are effective methods in extracting spatial information, they fail to capture morphological and contextual variations in the expression process. Recent methods aim to solve this problem by using massive datasets to obtain more efficient features of FER [9][10][11][12][13][14][15]. Some researchers use multimodal fusion to recognize emotions, such as voices, expressions, and actions [16].…”

Section: Related Work (mentioning)

confidence: 99%

Facial Expression Recognition Based on Multi-Features Cooperative Deep Convolutional Network

Zhang et al. 2021, Applied Sciences

This paper addresses the problem of Facial Expression Recognition (FER), focusing on unobvious facial movements. Traditional methods often cause overfitting problems or incomplete information due to insufficient data and manual selection of features. Instead, our proposed network, which is called the Multi-features Cooperative Deep Convolutional Network (MC-DCN), maintains focus on the overall feature of the face and the trend of key parts. The processing of video data is the first stage. The method of ensemble of regression trees (ERT) is used to obtain the overall contour of the face. Then, the attention model is used to pick up the parts of face that are more susceptible to expressions. Under the combined effect of these two methods, the image which can be called a local feature map is obtained. After that, the video data are sent to MC-DCN, containing parallel sub-networks. While the overall spatiotemporal characteristics of facial expressions are obtained through the sequence of images, the selection of keys parts can better learn the changes in facial expressions brought about by subtle facial movements. By combining local features and global features, the proposed method can acquire more information, leading to better performance. The experimental results show that MC-DCN can achieve recognition rates of 95%, 78.6% and 78.3% on the three datasets SAVEE, MMI, and edited GEMEP, respectively.

Attention-Based Global-Local Graph Learning for Dynamic Facial Expression Recognition

Xie, Li, Guo, et al. 2023, Lecture Notes in Computer Science
