Human pose and shape estimation have developed rapidly, where a skinned multi-person linear (SMPL) approach performs excellent recently. However, the prior template of the human body in the SMPL model is fixed, thus a deviation may be resulted in the reconstructed body shape if a human body acts sharp movements such as sporting or dancing. To address this problem, we propose a parallel-branch network including a designed spatial-temporal (ST) branch and a SMPL branch. The ST branch essentially performs the 2D-to-3D lifting for more accurate joint prediction, by the designed spatial transformer and temporal transformer. The 3D joints from the ST branch are used to supervise the 3D joints from the SMPL branch and further correct the deviation of the SMPL model. Experiments on some popular benchmarks like 3DPW and MPI-INF-3DHP show that our method has better performance than other methods with video input. Our code is available at https://automation.seu.edu.cn/ wcx/list.htm
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.