Multi-person pose estimation has been gaining considerable interest due to its use in several real-world applications, such as activity recognition, motion capture, and augmented reality. Although the improvement of the accuracy and speed of multi-person pose estimation techniques has been recently studied, limitations still exist in balancing these two aspects. In this paper, a novel knowledge distilled lightweight top-down pose network (KDLPN) is proposed that balances computational complexity and accuracy. For the first time in multi-person pose estimation, a network that reduces computational complexity by applying a “Pelee” structure and shuffles pixels in the dense upsampling convolution layer to reduce the number of channels is presented. Furthermore, to prevent performance degradation because of the reduced computational complexity, knowledge distillation is applied to establish the pose estimation network as a teacher network. The method performance is evaluated on the MSCOCO dataset. Experimental results demonstrate that our KDLPN network significantly reduces 95% of the parameters required by state-of-the-art methods with minimal performance degradation. Moreover, our method is compared with other pose estimation methods to substantiate the importance of computational complexity reduction and its effectiveness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.