Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors

Dong, Xuanyi; Yu, Shoou-I; Weng, Xinshuo; Wei, Shih-En; Yang, Yi; Sheikh, Yaser

doi:10.1109/cvpr.2018.00045

Cited by 194 publications

(186 citation statements)

References 37 publications

Supporting

Mentioning

177

Contrasting

“…Similar work [25,26] propagates pose results temporally using optical flow to encourage time consistency of the estimated bodies. Apart from its application in warping between frames, the structural information existing in optical flow alone has been used for pose estimation [27] or in conjunction with an image stream [28,29].…”

Section: Related Work (mentioning)

confidence: 99%

Learning Multi-human Optical Flow

et al. 2020

The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data they use does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single- and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

“…Linear regression based methods learn a function that maps the input face image to the normalized landmark coordinates [44,7]. Heatmap regression based methods produce one heatmap for each landmark, where the coordinate is the location of the highest response on this heatmap [41,11,9,30,5]. All above algorithms can be readily integrated into our framework, serving as different student detectors.…”

Section: Supervised Facial Landmark Detection (mentioning)

confidence: 99%

“…The 300-W dataset [35] annotates 68 landmarks from five facial landmark datasets, i.e., LFPW, AFW, HELEN, XM2VTS, and IBUG. Following the common settings [11,9,27], we regard all the training samples from LFPW, HELEN and the full set of AFW as the training set, in which there is 3148 training images. The common test subset consists of 554 test images from LFPW and HELEN.…”

Section: Datasets (mentioning)

confidence: 99%

Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection

Dong, Yang 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Self Cite

Facial landmark detection aims to localize the anatomically defined points of human faces. In this paper, we study facial landmark detection from partially labeled facial images. A typical approach is to (1) train a detector on the labeled images; (2) generate new training samples using this detector's predictions as pseudo labels for unlabeled images; (3) retrain the detector on the labeled samples and some of the pseudo-labeled samples. In this way, the detector can learn from both labeled and unlabeled data to become robust. In this paper, we propose an interaction mechanism between a teacher and two students to generate more reliable pseudo labels for unlabeled data, which are beneficial to semi-supervised facial landmark detection. Specifically, the two students are instantiated as dual detectors. The teacher learns to judge the quality of the pseudo labels generated by the students and filters out unqualified samples before the retraining stage. In this way, the student detectors get feedback from their teacher and are retrained on premium data that they generated themselves. Since the two students are trained on different samples, a combination of their predictions will be more robust as the final prediction than either prediction alone. Extensive experiments on the 300-W and AFLW benchmarks show that the interactions between teacher and students contribute to better utilization of the unlabeled data and achieve state-of-the-art performance.

“…More recently, PCD-CNN [9] uses head pose information to drive the training process. CPM+SBR [5] employs landmark registration to regularize training. SAN [4] uses adversarial networks to convert images from different styles to an aggregated style, upon which regression is performed.…”

Section: Related Work (mentioning)

confidence: 99%

DeCaFA: Deep Convolutional Cascade for Face Alignment in the Wild

Dapogny, Bailly, Cord 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Face alignment is an active computer vision domain that consists of localizing a number of facial landmarks that vary across datasets. State-of-the-art face alignment methods consist either of end-to-end regression or of refining the shape in a cascaded manner, starting from an initial guess. In this paper, we introduce DeCaFA, an end-to-end deep convolutional cascade architecture for face alignment. DeCaFA uses fully-convolutional stages to keep full spatial resolution throughout the cascade. Between each cascade stage, DeCaFA uses multiple chained transfer layers with spatial softmax to produce landmark-wise attention maps for each of several landmark alignment tasks. Weighted intermediate supervision, together with efficient feature fusion between the stages, makes it possible to progressively refine the attention maps in an end-to-end manner. We show experimentally that DeCaFA significantly outperforms existing approaches on the 300W, CelebA and WFLW databases. In addition, we show that DeCaFA can learn fine alignment with reasonable accuracy from very few images using coarsely annotated data.

