2017 IEEE International Conference on Computer Vision (ICCV) 2017
DOI: 10.1109/iccv.2017.180
Reconstruction-Based Disentanglement for Pose-Invariant Face Recognition

Abstract: Deep neural networks (DNNs) trained on large-scale datasets have recently achieved impressive improvements in face recognition. But a persistent challenge remains to develop methods capable of handling large pose variations that are relatively under-represented in training data. This paper presents a method for learning a feature representation that is invariant to pose, without requiring extensive pose coverage in training data. We first propose to generate non-frontal views from a single frontal face, in ord…
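The abstract describes learning a representation in which identity is separated from pose. A minimal toy sketch of that idea, using random linear maps as stand-ins for the learned encoder/decoder (all dimensions and names here are hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the actual model is a deep CNN.
D_IMG, D_ID, D_POSE = 64, 16, 4

# Random linear maps stand in for the learned encoder and decoder.
W_enc = rng.standard_normal((D_ID + D_POSE, D_IMG)) * 0.1
W_dec = rng.standard_normal((D_IMG, D_ID + D_POSE)) * 0.1

def encode(x):
    """Map an image vector to a (identity code, pose code) pair."""
    z = W_enc @ x
    return z[:D_ID], z[D_ID:]

def decode(z_id, z_pose):
    """Re-compose an image vector from the two codes."""
    return W_dec @ np.concatenate([z_id, z_pose])

x = rng.standard_normal(D_IMG)          # a face "image" as a flat vector
z_id, z_pose = encode(x)
x_hat = decode(z_id, z_pose)
recon_loss = float(np.mean((x - x_hat) ** 2))  # reconstruction objective
```

Training such a model drives `recon_loss` down while auxiliary losses (identity supervision, pose labels) push identity-specific information into `z_id` and pose-specific information into `z_pose`; only `z_id` is then used for pose-invariant recognition.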

Cited by 143 publications (98 citation statements). References 40 publications (86 reference statements).
“…Inverse graphics networks [Kulkarni et al 2015] learn an interpretable representation of images by decomposing them into shape, pose and lighting codes. Peng et al [2017] disentangle face appearance from its pose by learning a pose-invariant feature representation. Ma et al [2018] disentangle and encode background, foreground, and pose from still human images into embedding features, which are then combined to re-compose the input image.…”
Section: Related Work
confidence: 99%
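The disentangle-then-recompose idea in the statement above also enables pose transfer: encode two faces, swap one factor, and decode. A toy sketch under the same hypothetical linear-map assumptions (not the cited models' actual architectures):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy dimensions and random linear encoder/decoder.
D_IMG, D_ID, D_POSE = 64, 16, 4
W_enc = rng.standard_normal((D_ID + D_POSE, D_IMG)) * 0.1
W_dec = rng.standard_normal((D_IMG, D_ID + D_POSE)) * 0.1

def encode(x):
    z = W_enc @ x
    return z[:D_ID], z[D_ID:]          # (identity code, pose code)

def decode(z_id, z_pose):
    return W_dec @ np.concatenate([z_id, z_pose])

frontal = rng.standard_normal(D_IMG)   # frontal face of subject A
profile = rng.standard_normal(D_IMG)   # non-frontal face of subject B

id_a, _ = encode(frontal)
_, pose_b = encode(profile)

# Subject A's identity re-composed under subject B's pose.
synthesized = decode(id_a, pose_b)
```

With a trained model, this swap is what lets a single frontal face generate non-frontal views for augmentation.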
“…On the other hand, recent work shows that machine lipreading has made remarkable progress, especially with the advancement of deep learning techniques [2]- [6]. Deep learning techniques have been the cornerstone of recent success [7]- [11] in many traditionally hard machine learning tasks [12]- [17]. Generally, deep-learning-based lipreading methods follow the state-of-the-art sequential modeling solutions that have been widely applied to problems such as acoustic speech recognition [18]- [20] and neural machine translation [21]- [24].…”
Section: Introduction
confidence: 99%
“…The proposed feature-level fusion method is inspired by the disentangled face representation work of Peng et al and Tran et al [42,58,40], where the encoded feature representations are explicitly disentangled into separate parts representing different facial priors such as identity, pose and gender. Rather than leveraging supervised label information to enforce the disentangling factor in the embedded features, each encoder structure in the proposed method inherently learns to characterize the different geometric and texture information captured in the Stokes images.…”
Section: Multi-stream Feature-level Fusion Generator
confidence: 99%
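The multi-stream feature-level fusion described above can be sketched as per-stream encoders whose features are concatenated before the generator. All names, stream counts and dimensions below are hypothetical illustrations, not the cited method's actual design:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: 4 input streams (e.g. Stokes images), each encoded
# separately by its own random linear "encoder".
N_STREAMS, D_IN, D_FEAT = 4, 32, 8
encoders = [rng.standard_normal((D_FEAT, D_IN)) * 0.1
            for _ in range(N_STREAMS)]

def fuse(streams):
    """Encode each stream separately, then fuse at the feature level."""
    feats = [W @ s for W, s in zip(encoders, streams)]
    return np.concatenate(feats)       # fused feature fed to the generator

streams = [rng.standard_normal(D_IN) for _ in range(N_STREAMS)]
fused = fuse(streams)                  # shape: (N_STREAMS * D_FEAT,)
```

Concatenation keeps each encoder's features in a dedicated slice of the fused vector, which is one simple way to let different streams specialize in different geometric and texture cues.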
“…Deep learning methods, enabled by vast improvements in processing hardware coupled with the ubiquity of face data, have led to significant improvements in face recognition accuracy, particularly on unconstrained face imagery [45], [5], [46]. Although these methods address many challenges, and have even achieved human-expert-level performance on challenging databases with low resolution, pose variation and illumination variation to some extent [58], [42], [4], [8], [45], they are specifically designed for recognizing face images collected in the visible spectrum. Hence, they often do not perform well on face images captured from other domains such as thermal [49], [76], [17], [18], infrared [27], [37], [63] or millimeter wave [11], [12], due to significant phenomenological differences as well as a lack of sufficient training data.…”
Section: Introduction
confidence: 99%