Multiperson pose estimation is an important and complex problem in computer vision. It is commonly framed as the detection of human skeleton joints and, in recent years, has been solved with networks that regress joint heat maps. The key to accurate pose estimation is learning robust and discriminative feature maps. Although current methods have made significant progress through interlayer fusion and intralevel fusion of feature maps, few works combine the two. In this paper, we propose a multistage polymerization network (MPN) for multiperson pose estimation. The MPN continuously learns rich underlying spatial information by fusing features within each layer. It also adds hierarchical connections between feature maps of the same resolution for interlayer fusion, reusing low-level spatial information and refining high-level semantic information to obtain accurate keypoint representations. In addition, we observe a lack of connection between the low-level and high-level information in the output. To solve this problem, we propose an effective shuffled attention mechanism (SAM). The shuffle promotes cross-channel information exchange between pyramid feature maps, while the attention balances the low-level and high-level representations of the output features. As a result, the relationship between the spatial and channel dimensions of the feature map is further strengthened. We evaluate the proposed method on public datasets, and the experimental results show that it performs better than current methods.
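The abstract does not specify how the SAM is implemented, so the following is only a minimal NumPy sketch of the two ideas it names: a channel shuffle that mixes channels across groups, and an attention-style gate that weighs low-level against high-level features. The function names `channel_shuffle` and `gated_fusion`, the group-interleaving scheme, and the sigmoid gate computed from globally pooled features are all assumptions made for illustration, not the authors' design.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Interleave channels across groups; x has shape (C, H, W)."""
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # Split channels into (groups, C // groups), swap those two axes,
    # then flatten back so neighbouring channels come from different groups.
    x = x.reshape(groups, c // groups, h, w)
    x = x.transpose(1, 0, 2, 3)
    return x.reshape(c, h, w)


def gated_fusion(low, high):
    """Blend low- and high-level maps with a per-channel gate in (0, 1)."""
    # Hypothetical gate: global average pooling of the summed features,
    # squashed by a sigmoid. A learned layer would normally sit here.
    pooled = (low + high).mean(axis=(1, 2))   # shape (C,)
    gate = 1.0 / (1.0 + np.exp(-pooled))      # shape (C,)
    gate = gate[:, None, None]                # broadcast over H and W
    return gate * low + (1.0 - gate) * high


# Usage: shuffle six channels in two groups, then fuse two feature maps.
low = np.random.rand(6, 4, 4)
high = np.random.rand(6, 4, 4)
fused = gated_fusion(channel_shuffle(low, groups=2), high)
```

With six channels and two groups, the shuffle reorders channels 0..5 as 0, 3, 1, 4, 2, 5, so each adjacent pair draws from both groups. The gate is a convex combination, so the fused map always lies between the two inputs at every position.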
Image matching, a fundamental computer vision method, underpins many more complex vision applications. Advances in computing hardware and vision theory have accelerated the general adoption of feature-based image registration technologies. As current research in this field remains limited, this paper gives an overview of its relevant aspects. It first introduces the research background, research achievements, and applications of image feature detection and matching in different fields. The main body examines the classical detection algorithms of recent decades and the more recent machine learning algorithms, led by deep learning, and then discusses the latest advances in feature points, local features, global features, matching, and optimization, together with the advantages and disadvantages of each algorithm. Finally, the paper summarizes its findings and offers an outlook on future work.