Head2Head: Video-based Neural Head Synthesis

Koujan, Mohammad Rami; Doukas, Michail Christos; Roussos, Anastasios; Zafeiriou, Stefanos

doi:10.1109/fg47880.2020.00048

Cited by 52 publications

(25 citation statements)

References 44 publications

(81 reference statements)


“…Video-based 3D reconstruction. The video fitting approach followed in Head2Head [15] to estimate the 3D facial geometry is based on a set of sparse landmarks extracted from the entire input sequence. This method has three main drawbacks: 1) the fidelity of the 3D reconstruction relies heavily on the accuracy of extracted landmarks which are also sparse (68 in total), 2) it might require a large number of frames with enough reconstruction cues (various rotations) to produce good accuracy, 3) it makes a quite strong assumption in the initialisation stage about the rigidity of the face to estimate the camera parameters.…”

Section: D Facial Recovery (mentioning)

confidence: 99%

“…We use the dense 3D vertices (∼5K) estimated by our trained DenseFaceReg on each video frame to: 1) estimate the camera parameters, 2) generate the 3DMM identity and expression coefficients by projecting the dense shape onto the 3DMM bases. For all our experiments in this work, we use the same 3DMMs utilised in [15]. The analysis-by-synthesis approach, which is used by many state-of-the-art approaches [3], [4], estimates a lot of parameters (e.g.…”

Section: D Facial Recovery (mentioning)

confidence: 99%
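The projection step mentioned in the statement above can be illustrated with a minimal sketch. It assumes a linear 3DMM of the form shape = mean + identity basis × identity coefficients + expression basis × expression coefficients, with orthonormal basis vectors, so that each coefficient is a plain inner product. The function names, the toy four-dimensional "shape", and the orthonormality assumption are illustrative only and are not taken from the cited work.

```python
def project_onto_basis(shape, mean, basis):
    """Coefficients c_k = <basis_k, shape - mean>.

    Assumes the basis vectors are orthonormal (an assumption of this sketch).
    """
    centered = [s - m for s, m in zip(shape, mean)]
    return [sum(b * c for b, c in zip(vec, centered)) for vec in basis]


def reconstruct(mean, basis, coeffs):
    """Rebuild a shape as mean + sum_k coeffs[k] * basis[k]."""
    out = list(mean)
    for vec, c in zip(basis, coeffs):
        for i, b in enumerate(vec):
            out[i] += c * b
    return out


# Toy data: a flattened "dense shape" with four coordinates (hypothetical).
mean_shape = [0.0, 0.0, 0.0, 0.0]
identity_basis = [[1.0, 0.0, 0.0, 0.0]]    # one identity component
expression_basis = [[0.0, 1.0, 0.0, 0.0]]  # one expression component
dense_shape = [2.0, -1.0, 0.5, 0.0]

identity_coeffs = project_onto_basis(dense_shape, mean_shape, identity_basis)
expression_coeffs = project_onto_basis(dense_shape, mean_shape, expression_basis)

# Reconstruction from both sets of coefficients; whatever the bases cannot
# represent (here the third coordinate) is left as residual.
model_shape = reconstruct(
    mean_shape,
    identity_basis + expression_basis,
    identity_coeffs + expression_coeffs,
)
```

With real, non-orthonormal bases a least-squares solve would replace the inner products; the sketch only shows the shape of the computation.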

“…The Generator is applied sequentially, producing the output frames one after the other, until the entire output sequence has been created. Similar to Head2Head [15], the Generator consists of two identical encoders, operating in parallel, as well as a decoder. The first encoder receives the concatenated NMFC and eye images $X_{t-2:t}$, while the second is given the two previously generated frames $\tilde{Y}_{t-2:t-1}$.…”

Section: Deep Video Rendering Neural Network (mentioning)

confidence: 99%
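The sequential generation described in the statement above can be sketched as a simple loop: each output frame is produced from a window of three conditioning inputs (frames t−2 to t) and the two previously generated frames (t−2 to t−1). The sketch below uses plain numbers in place of images and a toy stand-in for the network. The padding at the start of the sequence (repeating the first conditioning frame and using a blank previous output) is an assumption of this sketch, as are all function names; the quoted text does not specify how the first frames are handled.

```python
def generate_sequence(cond_frames, generator, blank):
    """Produce one output per conditioning frame, one after the other.

    cond_frames: per-frame conditioning inputs (stand-ins for NMFC + eye images)
    generator:   callable(window, previous) -> output frame (hypothetical)
    blank:       placeholder used where no previous output exists yet (assumption)
    """
    outputs = []
    for t in range(len(cond_frames)):
        # Conditioning window for frames t-2..t; early indices are clamped to 0.
        window = [cond_frames[max(i, 0)] for i in range(t - 2, t + 1)]
        # The two previously generated frames t-2..t-1, blank if unavailable.
        previous = [outputs[i] if i >= 0 else blank for i in range(t - 2, t)]
        outputs.append(generator(window, previous))
    return outputs


def toy_generator(window, previous):
    """Stand-in for the two-encoder/decoder network: mixes the current
    conditioning value with the most recent generated value."""
    return window[-1] + 0.5 * previous[-1]


frames = generate_sequence([1.0, 2.0, 3.0], toy_generator, blank=0.0)
```

The point of the sketch is the data flow: every call after the first depends on earlier outputs, which is why the frames cannot be generated in parallel.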

“…Please refer to the Supplementary Material for more details on the dataset. We base our Generator's architecture on our previous work [15]. All discriminators have the same architecture, adopted by pix2pixHD…”

Section: Dataset and Implementation Details (mentioning)

confidence: 99%

“…Our recently proposed Head2Head [15] model overcomes the aforementioned limitations, as it combines the benefits of conditioning synthesis on 3D facial shapes with the advantages of a sequential, video-based, neural renderer. In this paper, we extend the work of Head2Head in the following directions:…”

Section: Introduction (mentioning)

confidence: 99%
