2015 IEEE International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2015.55

An End-to-End System for Unconstrained Face Verification with Deep Convolutional Neural Networks

Abstract: Over the last five years, methods based on Deep Convolutional Neural Networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems. This has been made possible due to the availability of large annotated datasets, a better understanding of the non-linear mapping between input images and class labels, as well as the affordability of GPUs. In this paper, we present the design details of a deep learning system for unconstrained face recognition, including modules for …


Cited by 76 publications (51 citation statements)
References 28 publications
“…Note that, we obtain 99.7% on VR@FAR=0.1% using the 6K pair-matching scores of the standard protocol. [Footnote 6: Results computed from the features publicly provided by the authors.] [Table fragment: [24] N N 0.805 0.604; Face-Aug-Pose-Syn [23] N N 0.886 0.725; Deep Multipose [1] N N 0.787 –; Pose Aware FR [22] N N 0.826 0.652; TPE [28] N N 0.871 0.766; All-In-One [25] N N 0.893 0.787]…”
Section: Results and Evaluation
Confidence: 99%
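The VR@FAR figure quoted above is the verification rate measured at a fixed false-accept rate (here 0.1%), computed from genuine and impostor pair-matching scores. A minimal generic sketch of that computation follows; it is not the evaluation code of any cited paper, and the function name and threshold convention are assumptions:

```python
import numpy as np

def vr_at_far(genuine_scores, impostor_scores, far=0.001):
    """Verification rate at a fixed false-accept rate (e.g. VR@FAR=0.1%).

    The threshold is chosen so that a fraction `far` of impostor pairs
    score above it; VR is the fraction of genuine pairs above that
    threshold.
    """
    impostor_scores = np.sort(np.asarray(impostor_scores, dtype=float))
    # Index of the score that leaves `far` of the impostor scores above it.
    idx = int(np.ceil((1.0 - far) * len(impostor_scores))) - 1
    threshold = impostor_scores[idx]
    return float(np.mean(np.asarray(genuine_scores, dtype=float) > threshold))
```

For example, with 100 impostor scores spread over [0, 0.99] and FAR=10%, the threshold lands at 0.89 and only genuine pairs scoring above it count as verified.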
“…For example, [24] aggregates local descriptors (RootSIFT [3]) extracted from face crops using Fisher Vector [26] (FV) encoding to obtain a single descriptor per face track. Since the success of deep learning in image-based face recognition [4,23,25,29,30,31,44], simple strategies for face descriptor aggregation prevailed, such as average-and max-pooling [7,25]. However, none of these strategies are trained end-to-end for face recognition as typically only the face descriptors are learnt, while aggregation is performed post hoc.…”
Section: Related Work
Confidence: 99%
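The simple post-hoc aggregation strategies this excerpt describes — average- and max-pooling of per-image face descriptors into one descriptor per track or template — can be sketched as below. This is a generic illustration under the assumption that descriptors are stacked row-wise; the trailing L2 normalization is a common convention, not something every cited paper necessarily applies:

```python
import numpy as np

def aggregate_template(descriptors, mode="avg"):
    """Pool per-image face descriptors (n x d array) into a single
    template descriptor via average- or max-pooling, then L2-normalize."""
    X = np.asarray(descriptors, dtype=float)
    pooled = X.mean(axis=0) if mode == "avg" else X.max(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled
```

As the excerpt notes, such pooling is performed after the descriptors are learned, so the aggregation step itself is not trained end-to-end.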
“…template descriptors of the same subject should be close to each other in the descriptor space, whereas those of different subjects should be far apart. Although common aggregation strategies, such as average-pooling and max-pooling, are able to aggregate face descriptors to produce a compact template representation [5,7,25] and currently achieve state-of-the-art results [5], we seek a better solution in this paper.…”
Section: Introduction
Confidence: 99%
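The "close for same subject, far apart for different subjects" criterion above is typically checked with cosine similarity between template descriptors. A tiny sketch, with the function name being an assumption:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two template descriptors: pairs of the
    same subject should score near 1, different subjects much lower."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```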
“…Interestingly, the variance on the similarity scores does not decrease as template sizes increase, rather they stay largely the same even as the mean similarity increases. Figure 5 (right) shows the effect of template size on ver- (top) Template adaptation compared with CNN encoding with metric learning using triplet similarity embedding [4,6] or Joint Bayesian embedding [21,23]. (bottom) Template adaptation compared with CNN encoding and 2D alignment [5,4].…”
Section: Negative Set Study
Confidence: 99%
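The triplet similarity embedding the last excerpt compares against learns a projection that scores an anchor more similar to a positive (same subject) than to a negative (different subject) by a margin. The following is a toy one-step SGD sketch of that idea, not the exact formulation of TPE [28] or any cited method; the hinge form, learning rate, and margin are assumptions:

```python
import numpy as np

def triplet_hinge_grad_step(W, anchor, pos, neg, lr=0.01, margin=0.1):
    """One SGD step on a triplet similarity hinge: push the projected
    anchor-positive similarity above the anchor-negative similarity
    by `margin`. Returns the updated projection and the current loss."""
    a, p, n = W @ anchor, W @ pos, W @ neg
    loss = max(0.0, margin - a @ p + a @ n)
    if loss > 0:
        # Gradient of (-a.p + a.n) w.r.t. W, with a = W @ anchor, etc.
        grad = np.outer(n - p, anchor) + np.outer(a, neg - pos)
        W = W - lr * grad
    return W, loss
```

Repeating such steps over many triplets drives same-subject pairs together and different-subject pairs apart in the embedded space.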