This work introduces the Spacecraft Pose Network (SPN) for on-board estimation of the pose, i.e., the relative position and attitude, of a known non-cooperative spacecraft using monocular vision. In contrast to other state-of-the-art pose estimation approaches for spaceborne applications, the SPN method requires no hand-engineered features and only a single grayscale image to determine the pose of the spacecraft relative to the camera. The SPN method uses a Convolutional Neural Network (CNN) with three branches to solve the problems of relative attitude and relative position estimation. The first branch of the CNN builds on a state-of-the-art object detection algorithm to detect a 2D bounding box around the target spacecraft in the input image. The region inside the 2D bounding box is then used by the other two branches of the CNN to determine the relative attitude, first classifying the input region into discrete coarse attitude labels and then regressing to a finer estimate. The SPN method then uses a novel Gauss-Newton algorithm to estimate the relative position from the constraints imposed by the detected 2D bounding box and the estimated relative attitude. The secondary contribution of this work is the generation of the Spacecraft PosE Estimation Dataset (SPEED), which is used to train and evaluate the performance of the SPN method. SPEED consists of synthetic as well as actual camera images of a mock-up of the Tango spacecraft from the PRISMA mission. The synthetic images are created by fusing OpenGL-based renderings of the spacecraft's 3D model with actual images of the Earth captured by the Himawari-8 meteorological satellite. The actual camera images are captured using a seven-degree-of-freedom robotic arm, which positions and orients a vision-based sensor with respect to a full-scale mock-up of the Tango spacecraft. Custom illumination devices simulate the Earth albedo and sunlight with high fidelity to emulate the illumination conditions present in space. The SPN method, trained only on synthetic images, produces degree-level relative attitude errors and centimeter-level relative position errors when evaluated on actual camera images not used during training.
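The position-estimation step described above admits a compact illustration. The following is a minimal sketch, not the paper's implementation: it assumes a pinhole camera with intrinsics K, a set of body-frame model points, an attitude estimate R taken as given (in SPN it would come from the attitude branches), and a detected 2D bounding box, and it recovers the translation t by Gauss-Newton iterations on the bounding-box residual. All function names and the finite-difference Jacobian are illustrative assumptions; the paper's actual Gauss-Newton formulation is described only at the level of this abstract.

```python
# Hypothetical sketch (not the authors' code): recover the relative position t
# from a detected 2D bounding box and a given attitude R, via Gauss-Newton.
import numpy as np

def project(points_body, R, t, K):
    """Project 3D body-frame points into the image (pinhole camera model)."""
    p_cam = points_body @ R.T + t           # body frame -> camera frame
    p_img = p_cam @ K.T                     # apply camera intrinsics
    return p_img[:, :2] / p_img[:, 2:3]     # perspective division

def bbox_of(points_2d):
    """Axis-aligned bounding box [x_min, y_min, x_max, y_max] of 2D points."""
    return np.concatenate([points_2d.min(axis=0), points_2d.max(axis=0)])

def estimate_position(model_pts, R, K, bbox_detected, t0, iters=20):
    """Gauss-Newton on the 4-vector bounding-box residual w.r.t. t."""
    t = t0.astype(float)
    for _ in range(iters):
        # Residual: projected model bounding box vs. detected bounding box.
        r = bbox_of(project(model_pts, R, t, K)) - bbox_detected
        # Finite-difference Jacobian of the residual w.r.t. the 3 components of t
        # (an analytic Jacobian would normally be used; this keeps the sketch short).
        J = np.zeros((4, 3))
        eps = 1e-4
        for j in range(3):
            dt = np.zeros(3); dt[j] = eps
            r_pert = bbox_of(project(model_pts, R, t + dt, K)) - bbox_detected
            J[:, j] = (r_pert - r) / eps
        # Gauss-Newton step: least-squares solution of J * step = -r.
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        t += step
        if np.linalg.norm(step) < 1e-8:
            break
    return t

# Toy usage with an invented cube model and a synthetic "detection";
# all numbers here are placeholders, not values from the paper.
model_pts = np.array([[x, y, z] for x in (-0.4, 0.4)
                                for y in (-0.4, 0.4)
                                for z in (-0.4, 0.4)])
K = np.array([[1000.0, 0.0, 512.0],
              [0.0, 1000.0, 512.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                                # attitude assumed already estimated
t_true = np.array([0.1, -0.2, 5.0])
bbox = bbox_of(project(model_pts, R, t_true, K))
t_est = estimate_position(model_pts, R, K, bbox, t0=np.array([0.0, 0.0, 10.0]))
```

The initial guess t0 must place the model in front of the camera (positive depth) for the projection to be well defined; in practice the detected box size gives a reasonable first depth estimate.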