Facial expression recognition (FER) is a challenging problem in computer vision. Although extensive research has improved FER performance in recent years, there is still room for improvement. A common formulation of FER classifies a given face image into one of seven emotion categories: angry, disgust, fear, happy, neutral, sad, and surprise. In this paper, we propose a simple multi-layer perceptron (MLP) classifier that determines whether the current classification result is reliable. If the result is judged unreliable, we use the given face image as a query to search for similar images; in particular, facial action units are used to retrieve images with similar facial expressions. Another MLP is then trained to predict the final emotion category by aggregating the classification output vectors of the query image and its retrieved similar images. Experimental results on the FER2013 dataset demonstrate that our proposed method further improves the performance of state-of-the-art networks.
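The pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the reliability MLP is replaced by a simple max-probability threshold, the aggregation MLP by a mean over output vectors, and retrieval by Euclidean nearest neighbours over facial action unit (AU) vectors; `reliability_threshold` and `k` are hypothetical parameters.

```python
import numpy as np

def retrieve_similar(query_aus, gallery_aus, k=3):
    """Return indices of the k gallery images whose facial action unit
    (AU) vectors are closest to the query's, by Euclidean distance."""
    dists = np.linalg.norm(gallery_aus - query_aus, axis=1)
    return np.argsort(dists)[:k]

def predict_with_retrieval(query_probs, query_aus, gallery_probs, gallery_aus,
                           reliability_threshold=0.8, k=3):
    """Reliability-gated prediction: keep the base classifier's decision
    when it is confident; otherwise aggregate the output vectors of the
    query and its AU-retrieved neighbours. The confidence gate and the
    mean aggregation stand in for the paper's two learned MLPs."""
    if query_probs.max() >= reliability_threshold:
        # Classification result deemed reliable; use it directly.
        return int(np.argmax(query_probs))
    # Unreliable: retrieve images with similar facial expressions via AUs.
    idx = retrieve_similar(query_aus, gallery_aus, k)
    # Aggregate the 7-dimensional output vectors of query and neighbours.
    agg = np.mean(np.vstack([query_probs, gallery_probs[idx]]), axis=0)
    return int(np.argmax(agg))
```

For example, a low-confidence output such as `[0.30, 0.25, 0.09, ...]` falls below the threshold, so the prediction is taken from the aggregated vector, which confident neighbours can flip toward the correct class.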
INDEX TERMS: Convolutional neural networks, facial expression recognition, image retrieval, facial action units.