This paper proposes a new pose and focal length estimation method that uses two vanishing points and a known camera position. Each vanishing point determines the unit direction vector of its corresponding set of parallel lines in the camera frame, and the unit direction vector of the same lines in the world frame is given as input. These two unit direction vectors, in the camera and world frames respectively, are related solely by the rotation matrix, which contains all the information about the camera's orientation. Because there are two vanishing points, two such transformations can be obtained. Each transformation of unit direction vectors can be regarded as a transformation of 3D points whose coordinates are the components of the corresponding unit direction vectors. The key insight of this paper is that the vanishing-point problem is thereby converted into a rigid-body transformation with 3D–3D point correspondences, the usual form in the PnP (perspective-n-point) problem, which simplifies pose estimation. Furthermore, in the camera frame, the camera position and the two vanishing points define two lines, and the angle between these lines equals the angle between the corresponding two sets of parallel lines in the world frame. Using this geometric constraint, the focal length can be estimated quickly. The solutions for both pose and focal length are unique. Experiments on synthetic data and real scenarios show that the proposed method performs well in numerical stability, noise sensitivity, and computational speed, and is strongly robust to camera position noise.
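The abstract describes two geometric facts but no implementation. A minimal sketch of both, assuming a standard pinhole model with a known principal point (all function names here are illustrative, not the authors' code): the angle constraint between the two vanishing-point rays yields a quadratic in the squared focal length, and the two direction correspondences determine the rotation by aligning orthonormal triads built from each pair of directions and their cross product.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def focal_from_vps(vp1, vp2, principal_point, angle_world):
    """Estimate f from two vanishing points, given the known angle
    between the two sets of world parallel lines.

    cos(angle) = (p1.p2 + f^2) / (|[p1,f]| * |[p2,f]|), where p_i are
    the vanishing points centered on the principal point. Squaring
    gives a quadratic in x = f^2.
    """
    p1 = np.asarray(vp1, float) - principal_point
    p2 = np.asarray(vp2, float) - principal_point
    a, b, c = p1 @ p2, p1 @ p1, p2 @ p2
    k = np.cos(angle_world) ** 2
    # (a + x)^2 = k (b + x)(c + x)  ->  quadratic coefficients:
    coeffs = [1.0 - k, 2.0 * a - k * (b + c), a * a - k * b * c]
    roots = np.roots(coeffs)
    x = max(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0)
    return np.sqrt(x)

def rotation_from_two_directions(d_cam, d_world):
    """Recover R (world -> camera) from two direction correspondences
    by aligning orthonormal triads built in each frame."""
    def triad(u, v):
        e1 = normalize(u)
        e3 = normalize(np.cross(u, v))
        e2 = np.cross(e3, e1)
        return np.column_stack([e1, e2, e3])
    B_cam = triad(*d_cam)
    B_world = triad(*d_world)
    return B_cam @ B_world.T
```

With noiseless data both solutions are exact and unique, consistent with the abstract's uniqueness claim; with noisy vanishing points the quadratic and the triad alignment still give a direct, closed-form answer.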
Grasp detection is critical for robots. However, detecting object positions and the corresponding grasp positions in a stacked environment is difficult. To address this practical problem and achieve more accurate object position and grasp position detection, this paper proposes a new method called MMD (multi-stage network for multi-object grasp detection). MMD consists of two parts: a feature extractor and a multi-stage object predictor. The feature extractor is a deep convolutional neural network that generates shared feature layers as well as initial ROIs (regions of interest). The multi-stage object predictor is a multi-stage refiner that repeatedly regresses the initial ROIs to obtain more accurate object detection and grasp detection results. Ablation experiments show that MMD delivers better grasp detection performance, achieving a state-of-the-art 76.71% mAPg on the VMRD dataset. Moreover, test experiments on a Kinova robot demonstrate the feasibility of the method.
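The abstract gives no architectural details, but the multi-stage object predictor it describes follows a cascade-refinement pattern: each stage regresses deltas that correct the previous stage's boxes. An illustrative sketch of that control flow (the function names and delta parameterization are generic assumptions, not MMD's actual layers):

```python
import numpy as np

def apply_deltas(boxes, deltas):
    """Refine (x, y, w, h) boxes with standard box-regression deltas:
    the center shifts by (dx*w, dy*h) and the size scales by exp(dw, dh)."""
    x, y, w, h = boxes.T
    dx, dy, dw, dh = deltas.T
    return np.stack([x + dx * w, y + dy * h,
                     w * np.exp(dw), h * np.exp(dh)], axis=1)

def multi_stage_refine(initial_rois, stage_heads):
    """Cascade pattern: each stage head predicts deltas that refine the
    boxes produced by the previous stage."""
    rois = initial_rois
    for head in stage_heads:
        rois = apply_deltas(rois, head(rois))
    return rois
```

In MMD the stage heads would be learned network branches operating on shared features; the sketch only shows how repeated regression over the initial ROIs progressively tightens the detections.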