“…In the context of entire human bodies, many approaches formulate this task as an image-to-image mapping problem. Given an appearance, these methods map the body pose, in the form of renderings of a skeleton [Chan et al 2019; Kappel et al 2020; Kratzwald et al 2017; Li et al 2019; Pumarola et al 2018; Shysheya et al 2019; Siarohin et al 2018; Zhu et al 2019], projections of a dense human model [Grigor'ev et al 2019; Liu et al 2020b, 2019b; Neverova et al 2018; Prokudin et al 2021; Raj et al 2021a; Sarkar et al 2020], or joint position heatmaps [Aberman et al 2019; Ma et al 2017, 2018], to realistic human images. To better preserve the appearance of the reference image in the generated image, some methods [Liu et al 2020b; Sarkar et al 2020] first map the person's appearance from screen space to UV space and then feed the rendering of the person in the target pose, textured with the UV texture map, into an image-to-image translation network.…”
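
The screen-space-to-UV step described in the last sentence can be illustrated with a minimal, dependency-free sketch. It is not the implementation of any of the cited methods. It assumes that per-pixel UV coordinates in [0, 1] and a foreground mask are already available for both the reference and the target pose (for example, from a dense body-surface estimator), and it uses nearest-texel lookup. All function and variable names (`to_texel`, `image_to_uv_texture`, `render_from_uv_texture`) are hypothetical.

```python
def to_texel(u, v, size):
    """Map UV coordinates in [0, 1] to the nearest (row, col) texel index."""
    col = min(size - 1, max(0, round(u * (size - 1))))
    row = min(size - 1, max(0, round(v * (size - 1))))
    return row, col


def image_to_uv_texture(image, uv, mask, size):
    """Scatter foreground pixels of the reference image into a UV texture.

    image: H x W nested list of RGB tuples (screen space)
    uv:    H x W nested list of (u, v) tuples, assumed given per pixel
    mask:  H x W nested list of bools, True where the person is visible
    Texels that no pixel maps to stay None (unobserved surface regions).
    """
    texture = [[None] * size for _ in range(size)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if mask[y][x]:
                r, c = to_texel(uv[y][x][0], uv[y][x][1], size)
                texture[r][c] = pixel
    return texture


def render_from_uv_texture(texture, uv, mask, background=(0, 0, 0)):
    """Gather texels for every foreground pixel of the target pose."""
    size = len(texture)
    rendered = []
    for y, uv_row in enumerate(uv):
        out_row = []
        for x, (u, v) in enumerate(uv_row):
            texel = None
            if mask[y][x]:
                r, c = to_texel(u, v, size)
                texel = texture[r][c]
            # Background pixels and unobserved texels fall back to a fill colour.
            out_row.append(texel if texel is not None else background)
        rendered.append(out_row)
    return rendered


# Toy 2 x 2 example: the "target pose" is the reference flipped horizontally.
image = [[(10, 0, 0), (20, 0, 0)], [(30, 0, 0), (40, 0, 0)]]
src_uv = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
full_mask = [[True, True], [True, True]]

texture = image_to_uv_texture(image, src_uv, full_mask, size=2)
tgt_uv = [row[::-1] for row in src_uv]
rendered = render_from_uv_texture(texture, tgt_uv, full_mask)
```

In the pipeline the passage describes, an image like `rendered` would be the input to the image-to-image translation network, which is then expected to fill in unobserved regions and add realistic detail. That network is outside the scope of this sketch.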