Existing methods for facial expression recognition (FER) are mainly trained in a setting where all expression classes are fixed in advance. In real applications, however, expression classes are becoming increasingly fine-grained and arrive incrementally. To handle sequential expression classes, these models can be fine-tuned or retrained, but this often results in poor performance or a large consumption of computing resources. To address these problems, we develop an Incremental Facial Expression Recognition Network (IExpressNet), which can learn a competitive multi-class classifier at any time with lower computing-resource requirements. Specifically, IExpressNet consists of two novel components. First, we construct an exemplar set by dynamically selecting representative samples from old expression classes; the exemplar set and the samples of the new expression classes together constitute the training set. Second, we design a novel center-expression-distilled loss. For facial expressions in the wild, the center-expression-distilled loss enhances the discriminative power of the deeply learned features and prevents catastrophic forgetting. Extensive experiments are conducted on two large-scale in-the-wild FER datasets, RAF-DB and AffectNet. The results demonstrate the superiority of the proposed method compared with state-of-the-art incremental learning approaches. CCS CONCEPTS • Information systems → Multimedia information systems; • Human-centered computing → Human computer interaction (HCI).
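The abstract does not give the exact formulation of the center-expression-distilled loss. As a rough illustrative sketch only, assuming (from the name) that it combines a center-loss term, which pulls deep features toward per-class centers, with a knowledge-distillation term, which keeps the new model's predictions close to the old model's softened predictions to limit catastrophic forgetting, the two components could look like this; all function names and the temperature parameter `T` here are hypothetical:

```python
import numpy as np

def center_loss(features, labels, centers):
    # Center-loss term: mean squared distance of each deep feature
    # to the center of its ground-truth expression class.
    return np.mean(np.sum((features - centers[labels]) ** 2, axis=1))

def distillation_loss(new_logits, old_logits, T=2.0):
    # Distillation term: soft cross-entropy between the old model's
    # temperature-softened predictions and the new model's predictions.
    def softmax(x, T):
        z = x / T
        e = np.exp(z - np.max(z, axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)
    p_old = softmax(old_logits, T)
    log_p_new = np.log(softmax(new_logits, T) + 1e-12)
    return -np.mean(np.sum(p_old * log_p_new, axis=1))
```

In an incremental step, the total objective would plausibly be the classification loss on the combined exemplar-plus-new-class training set, weighted with these two terms; the actual weighting used by IExpressNet is not specified in the abstract.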
Existing methods for facial expression recognition (FER) are mainly trained in a setting where multi-class data are available; as a result, they cannot detect alien expressions that are absent during training. To address this problem, we develop a Hierarchical Spatial One-Class Facial Expression Recognition Network (HS-OCFER), which constructs the decision boundary of a given expression class (called the normal class) by training on one-class data only. Specifically, HS-OCFER consists of three novel components. First, hierarchical bottleneck modules are proposed to enrich the representational power of the model and extract a detailed feature hierarchy from different levels. Second, multi-scale spatial regularization with facial geometric information is employed to guide feature extraction toward emotional facial representations and prevent the model from overfitting to extraneous disturbing factors. Third, compact intra-class variation is adopted to separate the normal class from alien classes in the decision space. Extensive evaluations on four typical FER datasets, covering both laboratory and in-the-wild scenarios, show that our method consistently outperforms state-of-the-art One-Class Classification (OCC) approaches.
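The abstract does not detail how compact intra-class variation is enforced. As a minimal illustrative sketch, assuming a hypersphere-style formulation in the spirit of common one-class objectives (e.g. Deep SVDD), one could penalize the spread of normal-class embeddings around their center and classify a test embedding by whether it falls inside the learned hypersphere; the function names and the fixed-radius decision rule below are assumptions for illustration, not HS-OCFER's actual method:

```python
import numpy as np

def compactness_loss(features):
    # Penalize intra-class variation: mean squared distance of each
    # normal-class embedding to the center of the current batch.
    center = features.mean(axis=0)
    return np.mean(np.sum((features - center) ** 2, axis=1))

def is_normal(feature, center, radius):
    # One-class decision rule: an embedding inside the hypersphere of
    # the given radius around the normal-class center is accepted as
    # the normal expression; everything outside is treated as alien.
    return np.sum((feature - center) ** 2) <= radius ** 2
```

In practice the radius (or an equivalent score threshold) would be chosen on held-out normal-class data, since no alien-class samples are available during training.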