Robust facial landmark detection and tracking across poses and expressions for in-the-wild monocular video

Liu, Shuang; Zhang, Yongqiang; Yang, Xiaosong; Shi, Daming; Zhang, Jianjun

doi:10.1007/s41095-016-0068-y

Cited by 13 publications

(7 citation statements)

References 55 publications

(93 reference statements)


“…2D model‐based methods include popular AAM‐based methods: AAM‐1 and AAM‐2. A recent 3D robust method is also compared. These methods are also trained on the same three databases.…”

Section: Experiments and Results (mentioning)

confidence: 99%

“…Usually, given L labeled 2D landmarks and the corresponding predefined 3D landmarks on the 3D face mesh, the pose matrix {R, t}, identity coefficients u, expression coefficients e, and displacements D = {d_k} can be solved by minimizing the Huber loss applied to the re‐projection error between the 2D landmarks and the 3D landmarks:

\arg\min_{P} \sum_{k=1}^{L} \| d_{k} \|_{\epsilon}^{2}

where P = {R, t, u, e, D} and d_k is computed on the basis of the definition above:

d_{k} = s_{k} - \Pi\left( \mathbf{R} \left( C_{r} \times_{2} u^{T} \times_{3} e^{T} \right) + \mathbf{t} \right)^{(v_{k})}

In the case that 2D landmarks have been detected, a nonlinear trust-region optimization method such as a sparse variant of the Levenberg–Marquardt algorithm can be used to solve for the pose and bilinear parameters, as Shuang et al do. A reliable landmark detector is clearly necessary, but popular detectors are usually 2D‐based and cannot capture large out-of-plane variations of pose and expression.…”
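The energy in the statement above can be illustrated with a small sketch. It is not the cited authors' implementation: the pinhole form of the projection, the `focal` parameter, the per-landmark slicing of the core tensor, the exact Huber form, and all function names are assumptions made for illustration. The sketch only evaluates the energy for given parameters; a solver such as Levenberg–Marquardt would then minimize it.

```python
import math

def bilinear_vertex(core, u, e):
    """Contract one landmark's slice of the core tensor with the weights.

    core[c][i][j] is coordinate c (x, y, z) for identity basis i and
    expression basis j (hypothetical layout, one slice per landmark v_k).
    """
    return [
        sum(core[c][i][j] * u[i] * e[j]
            for i in range(len(u)) for j in range(len(e)))
        for c in range(3)
    ]

def project(R, t, p, focal=1.0):
    """Apply rotation R and translation t, then a pinhole projection.

    The pinhole model with focal length `focal` is an assumed form of
    the projection operator; the quoted text does not spell it out.
    """
    q = [sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3)]
    return [focal * q[0] / q[2], focal * q[1] / q[2]]

def huber(r, eps):
    """Huber penalty: quadratic for r <= eps, linear beyond (assumed form)."""
    return 0.5 * r * r if r <= eps else eps * (r - 0.5 * eps)

def fitting_energy(landmarks_2d, cores, R, t, u, e, eps=1.0):
    """Sum the Huber penalty of the re-projection residuals d_k."""
    total = 0.0
    for s_k, core_k in zip(landmarks_2d, cores):
        proj = project(R, t, bilinear_vertex(core_k, u, e))
        d_k = (s_k[0] - proj[0], s_k[1] - proj[1])
        total += huber(math.hypot(d_k[0], d_k[1]), eps)
    return total
```

Because the penalty grows only linearly for residuals above `eps`, a single badly detected landmark contributes far less to the energy than it would under a plain squared error, which is the usual motivation for a robust loss here.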

Section: Supervised Coordinate Descent Methods With a 3D Bilinear Model (mentioning)

confidence: 99%

“…Yang et al trained a linear PCA model on a public 3D face database offline, then located 2D landmarks with a variant of ASM and solved for the coefficients of the 3D PCA components based on perspective projection when tracking online. Similar to this work, Shuang et al used a popular 2D landmark detector to localize the facial landmarks and solved for the pose, expression coefficients and identity coefficients based on perspective projection and a pre‐trained 3D bilinear model. Cao et al proposed a 3D cascaded regression‐based method for user‐specific face tracking and animation.…”

Section: Related Work (mentioning)

confidence: 99%


Supervised coordinate descent method with a 3D bilinear model for face alignment and tracking

Zhang, Liu, Yang, et al. 2017

Computer Animation & Virtual

Self Cite


Face alignment and tracking play important roles in facial performance capture. Existing data-driven methods for monocular videos suffer from large variations of pose and expression. In this paper we propose an efficient and robust method for this task by introducing a novel supervised coordinate descent method (SCDM) with a 3D bilinear representation. Instead of directly learning the mapping between the whole parameter set and image features with a cascaded regression framework, as current methods do, we learn the mappings for individual sets of parameters separately, step by step, by means of coordinate descent. Since different parameters make different contributions to the displacement of facial landmarks, our method is more discriminative than current whole-parameter cascaded regression methods. Benefiting from a 3D bilinear model learned from public databases, the proposed method can handle out-of-plane head pose changes and extreme expressions better than other 2D-based methods. We present reliable face tracking results under various head poses and facial expressions on challenging video sequences collected online. The experimental results show that our method outperforms state-of-the-art data-driven methods.


“…In our experiments the landmark detector in [21] achieved the best trade-off between efficiency and accuracy, but for applications where real-time performance is not a priority the landmark detector could be swapped for more robust ones such as those in [40,41]. To reduce redundancy, only representative landmarks are chosen, as described in [24]. The landmarks in frame i are denoted as S_i, and the 3D parametric model from [6] is represented as:…”

Section: Parametric Model Fitting (mentioning)

confidence: 99%

Realtime Dynamic 3D Facial Reconstruction for Monocular Video In-the-Wild

Liu, Wang, Yang, et al. 2017

2017 IEEE International Conference on Computer Vision Workshops (ICCVW)

Self Cite


With the increasing number of videos recorded using 2D mobile cameras, techniques for recovering dynamic 3D facial models from these monocular videos have become a necessity for many image and video editing applications. While methods based on parametric 3D facial models can reconstruct the 3D shape in dynamic environments, large structural changes are ignored. Structure-from-motion methods can reconstruct these changes but assume the object to be static. To address this problem we present a novel method for realtime dynamic 3D facial tracking and reconstruction from videos captured in uncontrolled environments. Our method can track the deforming facial geometry and reconstruct external objects that protrude from the face, such as glasses and hair. It also allows users to move around and perform facial expressions freely without degrading the reconstruction quality.


“…Song et al [17] proposed a half-face dictionary integration algorithm for representation-based classification; the strength of this method is that it is able to successfully construct the dual-column (row) half-face training matrix, while quantifying the integrated learning atoms that exert influence on signal reconstruction. The use of virtual face images [18,19] has also been proven beneficial to a number of face analysis tasks such as face recognition [20,21] and facial landmark detection [22,23,24,25,26]. The facial symmetry property has also been widely used to quickly locate the candidate samples in face detection, alignment and classification [27,28,29].…”

mentioning

confidence: 99%

Fast SRC using quadratic optimisation in downsized coefficient solution subspace

Song, Luo, et al. 2019

Signal Processing
