Abstract. Plane-based calibration consists of recovering the internal parameters of a camera from views of a planar pattern with a known geometric structure. Existing direct algorithms use a problem formulation based on the properties of basis vectors; they minimize algebraic distances and may require a 'good' choice of system normalization. Our contribution is to recast this problem in a more intuitive geometric framework. A solution can be obtained by intersecting circles, called Centre Circles, whose parameters are computed from the world-to-image homographies. The Centre Circle is the locus of the camera centre when planar figures are in perspective correspondence, in accordance with a theorem of Poncelet. An interesting aspect of our formulation, using the Centre Circle constraint, is that the cost function can easily be transformed into a sum of squared Euclidean distances. Simulations on synthetic data and an application with real images confirm the strengths of our method.
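The circle-intersection step can be illustrated with a short, self-contained sketch. The routine below is a generic two-circle intersection in the plane; the centres and radii of actual Centre Circles would be derived from the world-to-image homographies, which this sketch does not compute, so the function name and the inputs in the usage example are purely illustrative.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the intersection points of two circles (0 or 2 points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    # No proper intersection: separate, nested, or coincident circles.
    if d > r1 + r2 or d < abs(r1 - r2) or d == 0:
        return []
    # Distance from c1 to the chord through the intersection points.
    a = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    # Half-length of that chord.
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    # Midpoint of the chord, then offset perpendicular to the centre line.
    mx = x1 + a * (x2 - x1) / d
    my = y1 + a * (y2 - y1) / d
    ox = h * (y2 - y1) / d
    oy = h * (x2 - x1) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]
```

With several views, each view contributes one such circle, and the camera centre is estimated where the circles (approximately) meet; minimizing the sum of squared Euclidean distances to the circles gives the least-squares variant described in the abstract.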
The computer vision community is currently focused on solving action recognition problems in real videos, which contain thousands of samples and many challenges. In this process, deep convolutional neural networks (D-CNNs) have played a significant role in advancing the state of the art in vision-based action recognition systems. Recently, the introduction of residual connections into a more traditional CNN model, in a single architecture called the Residual Network (ResNet), has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets to human action recognition using skeletal data provided by depth sensors. First, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images capture the spatio-temporal evolution of 3D motions in skeleton sequences and can be learned efficiently by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD, and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks while requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin: 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset. (Huy-Hieu Pham)

Human action recognition (HAR) is a challenging task due to many obstacles such as viewpoint, occlusion, or lighting conditions (Poppe, 2010). Traditional studies on HAR mainly focus on the use of handcrafted local features, such as Cuboids (Dollár et al., 2005) or HOG/HOF (Laptev et al., 2008), that are provided by 2D cameras.
These approaches typically recognize human actions based on the appearance and movements of human body parts in videos. Another approach is to use Genetic Programming (GP) to generate spatio-temporal descriptors of motion. However, one of the major limitations of 2D data is the absence of 3D structure from the scene.
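As a concrete illustration of the skeleton-to-image step described above, the sketch below encodes a sequence of 3D joint coordinates as an RGB image by normalizing each coordinate axis independently to [0, 255], a common scheme in the skeleton-based D-CNN literature. The exact mapping used in the paper may differ; the function name and the row/column layout here are assumptions.

```python
import numpy as np

def skeleton_to_rgb(seq):
    """Encode a skeleton sequence as an RGB image.

    seq: array of shape (frames, joints, 3) holding x, y, z joint
    coordinates. Each coordinate axis is normalized independently to
    [0, 255], so that rows index joints, columns index frames, and the
    three color channels carry the x, y, z evolution of each joint.
    """
    seq = np.asarray(seq, dtype=np.float64)
    lo = seq.min(axis=(0, 1), keepdims=True)   # per-axis minimum
    hi = seq.max(axis=(0, 1), keepdims=True)   # per-axis maximum
    norm = (seq - lo) / np.maximum(hi - lo, 1e-12)
    img = np.round(255 * norm).astype(np.uint8)
    # (frames, joints, 3) -> (joints, frames, 3): one row per joint.
    return img.transpose(1, 0, 2)
```

An image built this way places a joint's entire trajectory along one pixel row, so ordinary 2D convolutions in a ResNet can pick up both spatial (across-joint) and temporal (across-frame) patterns.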