Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach

Han, Hu; Jain, Anil K.; Wang, Fang; Shan, Shiguang; Chen, Xilin

doi:10.1109/tpami.2017.2738004

Cited by 228 publications

(160 citation statements)

References 50 publications

Supporting

Mentioning

154

Contrasting


“…Finally, we note that our goal is not to develop a state of the art facial attribute classification scheme. Nevertheless, results obtained by training an lSVM on embeddings transferred from a face recognition network are only 2.4% lower than the best scores reported by DMTL 2018 [22] (last column of Table 1). The effort involved in developing a state of the art face recognition network can be substantial.…”

Section: Results In (mentioning)

confidence: 65%

Transferability and Hardness of Supervised Classification Tasks

Tran, Nguyen, Hassner (2019)

2019 IEEE/CVF International Conference on Computer Vision (ICCV)


We propose a novel approach for estimating the difficulty and transferability of supervised classification tasks. Unlike previous work, our approach is solution-agnostic and does not require or assume trained models. Instead, we estimate these values using an information-theoretic approach: treating training labels as random variables and exploring their statistics. When transferring from a source to a target task, we consider the conditional entropy between two such variables (i.e., label assignments of the two tasks). We show analytically and empirically that this value is related to the loss of the transferred model. We further show how to use this value to estimate task hardness. We test our claims extensively on three large-scale data sets, CelebA (40 tasks), Animals with Attributes 2 (85 tasks), and Caltech-UCSD Birds 200 (312 tasks), together representing 437 classification tasks. We provide results showing that our hardness and transferability estimates are strongly correlated with empirical hardness and transferability. As a case study, we transfer a learned face recognition model to CelebA attribute classification tasks, showing state-of-the-art accuracy for tasks estimated to be highly transferable.
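The conditional entropy mentioned in this abstract can be illustrated with a small empirical estimate. The sketch below is not the authors' implementation; it only assumes that each training example carries one source-task label and one target-task label, and the function name `conditional_entropy` is hypothetical. It computes H(Y | Z) in nats from label counts, where Z is the source label and Y is the target label.

```python
import math
from collections import Counter


def conditional_entropy(target_labels, source_labels):
    """Empirical conditional entropy H(Y | Z) in nats.

    Y is the target-task label and Z is the source-task label of the
    same example. This is an illustrative estimator, not the paper's code.
    """
    if len(target_labels) != len(source_labels):
        raise ValueError("label sequences must have the same length")
    n = len(target_labels)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    # Count joint occurrences (z, y) and marginal occurrences of z.
    joint = Counter(zip(source_labels, target_labels))
    marginal = Counter(source_labels)

    entropy = 0.0
    for (z, y), count in joint.items():
        p_zy = count / n                    # empirical P(Z=z, Y=y)
        p_y_given_z = count / marginal[z]   # empirical P(Y=y | Z=z)
        entropy -= p_zy * math.log(p_y_given_z)
    return entropy


# Target labels fully determined by source labels: entropy is zero.
h_same = conditional_entropy([0, 1, 0, 1], [0, 1, 0, 1])

# Target labels independent of source labels and uniform: entropy is log 2.
h_indep = conditional_entropy([0, 1, 0, 1], [0, 0, 1, 1])
```

In this toy setting, a lower value means the source labels say more about the target labels, which matches the abstract's claim that the quantity is related to how well a model transfers.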


“…Following [24], Chen et al [3] utilized ranking-CNN for age estimation, in which there were a series of basic binary CNNs, aggregating to the final estimation. Han et al [9] used multiple attributes for multi-task learning. Gao et al [6] used KL divergence to measure the similarity between the estimated and groundtruth distributions for age.…”

Section: Related Work (mentioning)

confidence: 99%

C3AE: Exploring the Limits of Compact Model for Age Estimation

Zhang, Liu, et al. (2019)

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


Age estimation is a classic learning problem in computer vision. Many larger and deeper CNNs have been proposed with promising performance, such as AlexNet, VggNet, GoogLeNet and ResNet. However, these models are not practical for embedded/mobile devices. Recently, MobileNets and ShuffleNets have been proposed to reduce the number of parameters, yielding lightweight models. However, their representation has been weakened because of the adoption of depth-wise separable convolution. In this work, we investigate the limits of compact models for small-scale images and propose an extremely Compact yet efficient Cascade Context-based Age Estimation model (C3AE). This model possesses only 1/9 and 1/2000 of the parameters of MobileNets/ShuffleNets and VggNet, respectively, while achieving competitive performance. In particular, we redefine the age estimation problem using a two-points representation, which is implemented by a cascade model. Moreover, to fully utilize the facial context information, a multi-branch CNN is proposed to aggregate multi-scale context. Experiments are carried out on three age estimation datasets. State-of-the-art performance among compact models has been achieved by a relatively large margin.


“…Registration-based face analysis. Despite significant advances in deep learning, automatic face analysis tasks, such as smile detection (a comparative review is provided in Table 3), attribute prediction [12,11,43] and valence-arousal estimation [32], still face major challenges caused by occlusions and variances of head pose, scale, and illumination. These challenges are the main reason why every state-of-the-art approach to face analysis requires a pre-normalisation step involving face detection and registration (rotation, scaling, and 2D/3D transformation).…”

Section: Related Work (mentioning)

confidence: 99%

“…Several recent approaches address the problem of facial attributes prediction [11,43]. Some propose to use successful face-specific feature representations [64], modelling class distributions [8] and balancing attributes [37], indirect guiding the categorisation of similar features [52,43], or direct grouping the relevant attributes [12,11]. The best performances (more than 90% accuracy) are obtained by specifically designing a model structure that utilises the relations between relevant attributes [12,11].…”

Section: Related Work (mentioning)

confidence: 99%

Registration-free Face-SSD: Single shot analysis of smiles, facial attributes, and affect in the wild

Jang, Güneş, Patras (2019)

Computer Vision and Image Understanding


In this paper, we present a novel single shot face-related task analysis method, called Face-SSD, for detecting faces and for performing various face-related (classification / regression) tasks including smile recognition, face attribute prediction and valence-arousal estimation in the wild. Face-SSD uses a Fully Convolutional Neural Network (FCNN) to detect multiple faces of different sizes and recognise / regress one or more face-related classes. Face-SSD has two parallel branches that share the same low-level filters, one branch dealing with face detection and the other one with face analysis tasks. The outputs of both branches are spatially aligned heatmaps that are produced in parallel; therefore Face-SSD does not require that face detection, facial region extraction, size normalisation, and facial region processing are performed in subsequent steps. Our contributions are threefold: 1) Face-SSD is the first network to perform face analysis without relying on pre-processing such as face detection and registration in advance: it is a simple, single FCNN architecture that simultaneously performs face detection and face-related task analysis, which are conventionally treated as separate consecutive tasks; 2) Face-SSD is a generalised architecture that is applicable to various face analysis tasks without modifying the network structure, in contrast to designing task-specific architectures; and 3) Face-SSD achieves real-time performance (21 FPS) even when detecting multiple faces and recognising multiple classes in a given image (300 × 300). Experimental results show that Face-SSD achieves state-of-the-art performance in various face analysis tasks by reaching a recognition accuracy of 95.76% for smile detection, 90.29% for attribute prediction, and Root Mean Square (RMS) error of 0.44 and 0.39 for valence and arousal estimation.

… recent studies design specific architectures for each individual face analysis task. Although some works propose unified frameworks for handling multiple face-related tasks [56,3,35], several open issues remain yet to be explored:
• Unconstrained conditions: Most of the existing approaches require a detected and normalised face input.
• Scalability: Most methods design separate networks for different tasks. However, networks that are specifically designed to maximise the performance for certain tasks cannot be easily adapted to do other types of face analysis tasks.
• Real-time performance: Existing methods do not achieve real-time performance because they require time-…
