Abstract: This paper is aimed at creating extremely small and fast convolutional neural networks (CNN) for the problem of facial expression recognition (FER) from frontal face images. To this end, we employed the popular knowledge distillation (KD) method and identified two major shortcomings with its use: 1) a fine-grained grid search is needed for tuning the temperature hyperparameter and 2) to find the optimal size-accuracy balance, one needs to search for the final network size (i.e. the compression rate). On the ot…
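The temperature hyperparameter mentioned in the abstract can be illustrated with a minimal sketch of a temperature-scaled distillation loss. This assumes the commonly used formulation (KL divergence between softened teacher and student outputs, scaled by T²); it is not taken from the paper, and the function names `softmax` and `distillation_loss` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: this is the widely used textbook form of the loss,
    not necessarily the exact loss used in the paper.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # softened student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]  # made-up logits for illustration only
    student = [2.0, 1.5, 0.5]
    # A coarse sweep; the abstract notes that a fine-grained search is needed in practice.
    for t in (1.0, 2.0, 4.0, 8.0):
        print(t, distillation_loss(student, teacher, t))
```

Because the loss depends on the temperature through both the softened targets and the T² factor, each candidate temperature requires retraining the student, which is what makes the grid search costly.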
“…Facial Expression. The proposed system uses MicroExpNet [14] to extract the facial expressions of the coachees. This is a small and fast convolutional neural network designed for facial expression recognition, which is obtained by distilling a heavy and accurate neural network.…”
Section: Multimodal Input (mentioning)
confidence: 99%
“…This is a small and fast convolutional neural network designed for facial expression recognition, which is obtained by distilling a heavy and accurate neural network. Çugu et al. [14] reported that the network achieved over 95.0% classification accuracy for the eight expressions of "neutral," "anger," "contempt," "disgust," "fear," "happy," "sadness," and "surprise" under the real-time conditions. Therefore, we decided to use this network to meet requirement (2).…”
Executive coaching has been drawing more and more attention for developing corporate managers. While conversing with managers, coach practitioners are also required to understand internal states of coachees through objective observations. In this paper, we present REsCUE, an automated system to aid coach practitioners in detecting unconscious behaviors of their clients. Using an unsupervised anomaly detection algorithm applied to multimodal behavior data such as the subject's posture and gaze, REsCUE notifies behavioral cues for coaches via intuitive and interpretive feedback in real-time. Our evaluation with actual coaching scenes confirms that REsCUE provides the informative cues to understand internal states of coachees. Since REsCUE is based on the unsupervised method and does not assume any prior knowledge, further applications beside executive coaching are conceivable using our framework.
CCS CONCEPTS: • Human-centered computing → Computer supported cooperative work; HCI design and evaluation methods; • Information systems → Multimedia and multimodal retrieval.
* These authors contributed equally and are ordered alphabetically.
† Also with University of Tsukuba, Japan.
Figure 1: REsCUE detects the behavioral cues of the coachee and notifies the coach in real-time to help the coach understand the internal states of the coachee.
“…Our SD-CNN method is compared with the recent state-of-the-art FER methods, including DeRL [12], FN2EN [18], FMPN [28], VGG-face [8], MicroExpNet [34], GoogLeNet [17], MultiAttention [24], DSAE [26], GCNet [40], DynamicMTL [41], IA-gen [20], CompactCNN [30], DTAGN(Joint) [29], CPPN [27], DPND [10], PPDN [11], and FAN [31]. Table 2 reports our experimental results and shows the comparisons with these methods.…”
Section: Results (mentioning)
confidence: 99%
“…Our SD-CNN is also compared with the recent state-of-the-art methods, including FN2EN [18], DeRL [12], GoogLeNet (fine-tuned) [17], VGG-face (fine-tuned) [8], GCNet [40], DynamicMTL [41], MicroExpNet [34], MultiAttention [24], IA-gen [20], PPDN [11], DPND [10], DTAGN(Joint) [29], and CompactCNN [30], on the Oulu-CASIA dataset. As shown in Table 3, our method achieves the highest accuracy of 91.3%, which outperforms the state-of-the-art static image-based method (i.e., the DynamicMTL [41]) by 1.7% and also surpasses the state-of-the-art sequence-based method (i.e., the CompactCNN [30]) by 2.7%.…”
Section: Results (mentioning)
confidence: 99%
“…According to the Facial Action Coding System (FACS) [33], each of the six facial expressions can be represented by a vector of empirical action unit intensities that: . In our implementation, empirical action unit intensities are calculated from the EmotioNet dataset [34]. To make the generated facial expressions more diverse, we sample an action unit vector for each subject from the empirical action unit intensities using the Gaussian distribution.…”
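The sampling step described in the statement above can be sketched as follows. This is only an illustration under stated assumptions: each action unit is drawn from an independent Gaussian centered on its empirical intensity, the standard deviations and the clipping range of 0 to 5 are hypothetical, and the function name `sample_action_units` is not from the cited work.

```python
import random

def sample_action_units(mean, std, rng, low=0.0, high=5.0):
    """Draw one action unit intensity vector for a subject.

    mean: empirical intensity per action unit (e.g. estimated from a dataset)
    std:  assumed per-unit standard deviation (hypothetical, not from the source)
    rng:  a random.Random instance, so that sampling is reproducible
    Values are clipped to [low, high]; the range is an assumption.
    """
    return [min(high, max(low, rng.gauss(m, s))) for m, s in zip(mean, std)]

if __name__ == "__main__":
    rng = random.Random(0)
    empirical = [1.0, 2.5, 0.0, 3.0]  # made-up intensities for illustration
    spread = [0.5, 0.5, 0.5, 0.5]     # made-up standard deviations
    # One sampled vector per subject adds diversity to the generated expressions.
    for subject in range(3):
        print(subject, sample_action_units(empirical, spread, rng))
```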
Facial expression recognition (FER) is a challenging problem due to the intra-class variation caused by subject identities. In this paper, a self-difference convolutional network (SD-CNN) is proposed to address the intra-class variation issue in FER. First, the SD-CNN uses a conditional generative adversarial network to generate the six typical facial expressions for the same subject in the testing image. Second, six compact and light-weighted difference-based CNNs, called DiffNets, are designed for classifying facial expressions. Each DiffNet extracts a pair of deep features from the testing image and one of the six synthesized expression images, and compares the difference between the deep feature pair. In this way, any potential facial expression in the testing image has an opportunity to be compared with the synthesized “Self”—an image of the same subject with the same facial expression as the testing image. As most of the self-difference features of the images with the same facial expression gather tightly in the feature space, the intra-class variation issue is significantly alleviated. The proposed SD-CNN is extensively evaluated on two widely-used facial expression datasets: CK+ and Oulu-CASIA. Experimental results demonstrate that the SD-CNN achieves state-of-the-art performance with accuracies of 99.7% on CK+ and 91.3% on Oulu-CASIA, respectively. Moreover, the model size of the online processing part of the SD-CNN is only 9.54 MB (1.59 MB ×6), which enables the SD-CNN to run on low-cost hardware.