We present a novel deep convolutional network pipeline, LO-Net, for real-time lidar odometry estimation. Unlike most existing lidar odometry (LO) methods, which rely on individually designed feature selection, feature matching, and pose estimation stages, LO-Net can be trained in an end-to-end manner. With a new mask-weighted geometric constraint loss, LO-Net can effectively learn feature representations for LO estimation and can implicitly exploit the sequential dependencies and dynamics in the data. We also design a scan-to-map module, which uses the geometric and semantic information learned in LO-Net, to improve the estimation accuracy. Experiments on benchmark datasets demonstrate that LO-Net outperforms existing learning-based approaches and achieves accuracy similar to that of the state-of-the-art geometry-based approach, LOAM.
This paper proposes a new end-to-end trainable matching network based on receptive fields, RF-Net, to compute sparse correspondences between images. Building an end-to-end trainable matching framework is desirable but challenging. The very recent approach, LF-Net, successfully embeds the entire feature extraction pipeline into a jointly trainable pipeline and produces state-of-the-art matching results. This paper introduces two modifications to the structure of LF-Net. First, we propose to construct receptive feature maps, which lead to more effective keypoint detection. Second, we introduce a general loss function term, neighbor mask, to facilitate training patch selection. This results in improved stability in descriptor training. We trained RF-Net on the open dataset HPatches and compared it with other methods on multiple benchmark datasets. Experiments show that RF-Net outperforms existing state-of-the-art methods.

... to make them optimally cooperate with each other is, hence, more desirable. However, training such a network is difficult because the two subcomponents have individually different objectives to optimize. Few successful end-to-end matching pipelines have been reported in the literature. LIFT [29] is probably the first notable design towards this goal. However, LIFT relies on the output of the SIFT detector to initialize the training, and hence its detector behaves similarly to the SIFT detector. The recent network SuperPoint [5] achieves this end-to-end training, but its detector needs to be pre-trained on synthetic image sets, and the whole network is trained using images under synthesized affine transformations. The more recent LF-Net [18] is inspired by Q-learning and uses a Siamese architecture to train the entire network without the help of any hand-crafted method. In this paper, we develop an end-to-end matching network with enhanced detector and descriptor training modules, which we elaborate as follows.