Yinqiang ZHENG†a), Nonmember, Shigeki SUGIMOTO†b), and Masatoshi OKUTOMI†c), Members
SUMMARY We propose an accurate and scalable solution to the perspective-n-point problem, referred to as ASPnP. Our main idea is to estimate the orientation and position parameters by directly minimizing a properly defined algebraic error. By using a novel quaternion representation of the rotation, our solution is immune to any parametrization degeneracy. To obtain the global optimum, we use the Gröbner basis technique to solve the polynomial system derived from the first-order optimality condition. The main advantages of our proposed solution lie in accuracy and scalability. Extensive experimental results, with both synthetic and real data, demonstrate that our proposed solution has better accuracy than the state-of-the-art noniterative solutions. More importantly, by exploiting vectorization operations, the computational cost of our ASPnP solution is almost constant, independent of the number of point correspondences n in the wide range from 4 to 1000. In our experimental settings, the ASPnP solution takes about 4 milliseconds, making it well suited for real-time applications with a drastically varying number of 3D-to-2D point correspondences.
key words: pose estimation, perspective-n-point, Gröbner basis, global optimum
Introduction
The perspective-n-point problem (PnP) aims to estimate the orientation and position, or rotation and translation, of a calibrated perspective camera by using n known 3D reference points and their corresponding 2D image projections. It has widespread applications in robot localization [1], hand-eye calibration [2], augmented reality [3] and so on. Considering these various application scenarios, a desirable PnP solution should be accurate, efficient and generally applicable, i.e., capable of handling both planar and non-planar cases with either a few or even hundreds of 3D-to-2D correspondences.
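To make the problem statement concrete, the following minimal sketch shows the forward model that a PnP solver inverts: given a rotation and translation, 3D reference points are mapped to 2D projections. It rests on several assumptions not taken from the text: image coordinates are normalized (the calibration has already been applied), the rotation is built from the standard unit-quaternion formula rather than the quaternion representation proposed in this paper, and the function names `quat_to_rot`, `project` and `reprojection_rmse` are hypothetical.

```python
import math


def quat_to_rot(w, x, y, z):
    """Standard rotation matrix of a quaternion (normalized internally)."""
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def project(R, t, X):
    """Map a 3D point X to normalized image coordinates (u, v)."""
    # Camera-frame coordinates: Xc = R * X + t
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective division by the depth
    return (xc[0] / xc[2], xc[1] / xc[2])


def reprojection_rmse(R, t, points3d, points2d):
    """Root-mean-square distance between observed and predicted projections."""
    total = 0.0
    for X, (u, v) in zip(points3d, points2d):
        pu, pv = project(R, t, X)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(points3d))


# Illustrative data (made up): n = 4 reference points seen from a known pose.
R_true = quat_to_rot(0.9, 0.1, -0.2, 0.3)
t_true = [0.1, -0.2, 5.0]
points3d = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
points2d = [project(R_true, t_true, X) for X in points3d]
```

A PnP solver receives only `points3d` and `points2d` and must recover the pose; the true pose reproduces the observations with zero error, while any other pose generally does not.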
Related Works
As the minimal case, P3P (n = 3) has been thoroughly investigated in the literature [4], [5]. However, it has up to four possible solutions, and this multiplicity makes a P3P solver very sensitive to noise. In practice, it is usually used together with a robust estimation method, such as RANSAC [6], to remove outliers.

To improve robustness to noise, cases with four or more correspondences should be considered. There are some specialized algorithms [7] restricted to the slightly redundant n = 4 (P4P) or n = 5 (P5P) cases only. However, the application of these specialized P4P and P5P solutions is limited, since the number of point correspondences might differ from one frame to the next, even within a specific application scenario. As a result, a large body of existing work has sought greater flexibility, so as to be applicable to the general n ≥ 4 cases. Quan and Lan [8] proposed a linear solution by combining all the constraints from three-point subsets, whose computational complexity is O(n^5). Ansar and Daniilidis [9] developed linear solutions, but with complexity O(n^8), for general PnP ...