We propose a noniterative solution for the Perspective-n-Point (PnP) problem, which can robustly retrieve the optimum by solving a seventh order polynomial. The central idea consists of three steps: 1) to divide the reference points into 3-point subsets in order to obtain a series of fourth order polynomials, 2) to compute the sum of the squares of the polynomials so as to form a cost function, and 3) to find the roots of the derivative of the cost function in order to determine the optimum. The advantages of the proposed method are as follows: First, it can stably deal with the planar case, the ordinary 3D case, and the quasi-singular case, and it is as accurate as the state-of-the-art iterative algorithms with much less computational time. Second, it is the first noniterative PnP solution that can achieve more accurate results than the iterative algorithms when no redundant reference points can be used (n ≤ 5). Third, large-size point sets can be handled efficiently because its computational complexity is O(n).
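Steps 2 and 3 of the abstract above can be illustrated with a minimal sketch. It assumes the fourth order polynomials from step 1 are already available as coefficient arrays in a single unknown; how those coefficients are derived from the 3-point subsets is not covered here. The function name, the coefficient layout, and the tolerance are hypothetical, and NumPy's general-purpose root finder stands in for whatever solver the authors actually use.

```python
import numpy as np
from numpy.polynomial import Polynomial


def minimize_sum_of_squared_quartics(quartics, imag_tol=1e-8):
    """Return (x, cost(x)) at the global minimum of sum_i f_i(x)^2.

    quartics: iterable of length-5 coefficient sequences, lowest degree
    first (an assumed layout, not taken from the paper).
    """
    # Step 2: the cost function is the sum of squared quartics (degree 8).
    cost = Polynomial([0.0])
    for coeffs in quartics:
        f = Polynomial(coeffs)
        cost = cost + f * f

    # Step 3: stationary points are the roots of the degree-7 derivative.
    stationary = np.asarray(cost.deriv().roots())

    # Keep only the (numerically) real roots; imag_tol is an assumed threshold.
    real = stationary[np.abs(stationary.imag) < imag_tol].real

    # The global minimum is the real stationary point with the lowest cost.
    best = real[np.argmin(cost(real))]
    return float(best), float(cost(best))


# Illustrative input only: three quartics built to share the root x = 2.
example = [
    (Polynomial([-2.0, 1.0]) * Polynomial(g)).coef
    for g in ([1.0, 0.0, 0.0, 1.0], [3.0, 1.0, 0.0, 1.0], [5.0, 0.0, 2.0, 1.0])
]
x_opt, c_opt = minimize_sum_of_squared_quartics(example)
```

Because the cost is built by adding one squared polynomial per subset, the work grows linearly with the number of subsets, while the final root-finding step has a fixed size.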
Pose estimation, tracking, and action recognition of articulated objects from depth images are important and challenging problems, which are normally considered separately. In this paper, a unified paradigm based on Lie group theory is proposed, which enables us to collectively address these related problems. Our approach is also applicable to a wide range of articulated objects. Empirically, it is evaluated on lab animals, including mouse and fish, as well as on the human hand. On these applications, it is shown to deliver competitive results compared to the state of the art and to non-trivial baselines, including convolutional neural networks and regression forest methods. Moreover, new sets of annotated depth data of articulated objects are created which, together with our code, are made publicly available.
In this paper we deal with the problem of camera pose estimation from a set of 2D/3D line correspondences, also known as the Perspective-n-Line (PnL) problem. We carry out our study by comparing PnL with the well-studied Perspective-n-Point (PnP) problem, and our contributions are three-fold: (1) We provide a complete 3D configuration analysis for P3L, which includes the well-known P3P problem as well as several existing analyses as special cases. (2) By exploring the similarity between PnL and PnP, we propose a new subset-based PnL approach as well as a series of linear-formulation-based PnL approaches inspired by their PnP counterparts. (3) The proposed linear-formulation-based methods can be easily extended to handle line and point features simultaneously.
This paper aims to tackle the practically very challenging problem of efficient and accurate hand pose estimation from single depth images. A dedicated two-step regression forest pipeline is proposed: given an input hand depth image, step one mainly involves estimating the 3D location and in-plane rotation of the hand using a pixelwise regression forest. This is utilized in step two, which delivers the final hand estimation by a similar regression forest model based on the entire hand image patch. Moreover, our estimation is guided by internally executing a 3D hand kinematic chain model. For an unseen test image, the kinematic model parameters are estimated by a proposed dynamically weighted scheme. As a combined effect of these proposed building blocks, our approach is able to deliver more precise estimation of hand poses. In practice, our approach runs at 15.6 frames per second (FPS) on an average laptop when implemented on the CPU, which is further sped up to 67.2 FPS when running on a GPU. In addition, we introduce and make publicly available a data-glove annotated depth image dataset covering various hand shapes and gestures, which enables us to conduct quantitative analyses on real-world hand images. The effectiveness of our approach is verified empirically on both synthetic and the annotated real-world datasets for hand pose estimation, as well as on related applications including part-based labeling and gesture classification. In addition to empirical studies, the consistency property of our approach is also theoretically analyzed.
The perspective-three-point problem (P3P) is a classical problem in computer vision. The existing direct solutions of P3P have at least three limitations: (1) numerical instability when using different vertex permutations, (2) degeneration in the geometric singularity case, and (3) dependence on particular equation solvers. A new direct solution of P3P is presented to deal with these limitations. The main idea is to reduce the number of unknown parameters by using a geometric constraint we call the "perspective similar triangle" (PST). The PST method achieves high stability in the permutation problem and in the presence of image noise, and does not rely on particular equation solvers. Furthermore, reliable results can be retrieved even in the "danger cylinder", a typical kind of geometric singularity of P3P, where all existing direct solutions degenerate significantly.