Face recognition performance has improved remarkably in the last decade. Much of this success can be attributed to the development of deep learning techniques such as convolutional neural networks (CNNs). While CNNs have pushed the state-of-the-art forward, their training process requires a large amount of clean and correctly labelled training data. If a CNN is intended to tolerate facial pose, then we face an important question: should this training data be diverse in its pose distribution, or should face images be normalized to a single pose in a pre-processing step? To address this question, we evaluate a number of facial landmarking algorithms and a popular frontalization method to understand their effect on facial recognition performance. Additionally, we introduce a new, automatic, single-image frontalization scheme that exceeds the performance of the reference frontalization algorithm for video-to-video face matching on the Point and Shoot Challenge (PaSC) dataset. Additionally, we investigate failure modes of each frontalization method on different facial yaw using the CMU Multi-PIE dataset. We assert that the subsequent recognition and verification performance serves to quantify the effectiveness of each pose correction scheme.
The Point-and-Shoot Face Recognition Challenge (PaSC) is a performance evaluation challenge including 1401 videos of 265 people acquired with handheld cameras and depicting people engaged in activities with non-frontal head pose. This report summarizes the results from a competition using this challenge problem. In the Video-to-video Experiment a person in a query video is recognized by comparing the query video to a set of target videos. Both target and query videos are drawn from the same pool of 1401 videos. In the Still-to-video Experiment the person in a query video is to be recognized by comparing the query video to a larger target set consisting of still images. Algorithm performance is characterized by verification rate at a false accept rate of 0.01 and associated receiver operating characteristic (ROC) curves. Participants were provided eye coordinates for video frames. Results were submitted by 4 institutions: (i) Advanced Digital Science Center, Singapore; (ii) CPqD, Brasil; (iii) Stevens Institute of Technology, USA; and (iv) University of Ljubljana, Slovenia. Most competitors demonstrated video face recognition performance superior to the baseline provided with PaSC. The results represent the best performance to date on the handheld video portion of the PaSC.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.