Abstract: The estimation of the projective structure of a scene from image correspondences can be formulated as the minimization of the mean-squared distance between predicted and observed image points with respect to the projection matrices, the scene point positions, and their depths. Since these unknowns are not independent, constraints must be chosen to ensure that the optimization process is well posed. This paper examines three plausible choices, and shows that the first one leads to the Sturm-Triggs projective factorization algorithm, while the other two lead to new provably-convergent approaches. Experiments with synthetic and real data are used to compare the proposed techniques to the Sturm-Triggs algorithm and bundle adjustment.
* This work was done while Y. Omori was visiting the Beckman Institute. He is now with the ...

Introduction

Let us consider $n$ fixed points $P_1, \ldots, P_n$ observed by $m$ perspective cameras. Given some fixed world coordinate system, we can write

$$z_{ij}\,\mathbf{p}_{ij} = \mathcal{M}_i \mathbf{P}_j$$

for $i = 1, \ldots, m$ and $j = 1, \ldots, n$, where $\mathbf{p}_{ij} = (u_{ij}, v_{ij}, 1)^T$ and $z_{ij}$ denote respectively the (homogeneous) coordinate vector of the projection of $P_j$ into image number $i$, expressed in the corresponding camera's coordinate frame, and the depth of $P_j$ relative to that frame; $\mathcal{M}_i$ is the $3 \times 4$ projection matrix associated with this camera in the world coordinate frame, and $\mathbf{P}_j$ is the homogeneous coordinate vector of the point $P_j$ in that frame.

We address the problem of reconstructing both the matrices $\mathcal{M}_i$ ($i = 1, \ldots, m$) and the vectors $\mathbf{P}_j$ ($j = 1, \ldots, n$) from the image correspondences $\mathbf{p}_{ij}$. Of course, $z_{ij}$ is also unknown, but its value is not independent of the values of $\mathcal{M}_i$ and $\mathbf{P}_j$: indeed $z_{ij} = \mathbf{m}_{i3} \cdot \mathbf{P}_j$, where $\mathbf{m}_{i3}^T$ denotes the third row of the matrix $\mathcal{M}_i$. Faugeras [3] and Hartley et al. [7] have shown that when the internal parameters of the cameras are unknown, the camera motion and the scene structure can only be reconstructed up to an arbitrary projective transformation
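As a small numerical illustration of the projection model and of the projective ambiguity, the following sketch may help. It is not taken from the paper: the camera matrix M, the scene point P, and the transformation Q are hypothetical values chosen so that the arithmetic is exact, and the helper names (matvec, matmul, project) are introduced here for illustration only. The sketch computes the depth and image point of one scene point, then checks that replacing M by M Q and P by Q^{-1} P leaves the projection unchanged for one nonsingular 4 x 4 matrix Q.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def project(M, P):
    """Return (z, p) such that z * p = M P, with p = (u, v, 1)."""
    x, y, z = matvec(M, P)  # z is the dot product of the third row of M with P
    return z, (x / z, y / z, 1.0)


# Hypothetical 3 x 4 projection matrix and homogeneous scene point.
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 5.0]]
P = [4.0, -1.0, 3.0, 1.0]

z, p = project(M, P)

# Hypothetical nonsingular projective transformation Q and its inverse.
# The last row of Q is not (0, 0, 0, 1), so Q is not an affine map.
Q = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
Q_inv = [[0.5, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [-0.5, 0.0, 0.0, 1.0]]

M_alt = matmul(M, Q)      # transformed camera, M Q
P_alt = matvec(Q_inv, P)  # transformed point, Q^{-1} P

# Both (M, P) and (M_alt, P_alt) predict the same depth and image point.
z_alt, p_alt = project(M_alt, P_alt)
```

In this example the pairs (M, P) and (M_alt, P_alt) produce identical image data, so the image correspondences alone cannot distinguish between them.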