Three-dimensional (3D) object detection is an important research topic in 3D computer vision, with significant applications in fields such as autonomous driving, robotics, and human–computer interaction. However, low detection precision remains a pressing problem. To address it, we present a framework for 3D object detection in point clouds. Specifically, a purpose-designed backbone network fuses low-level and high-level features so that the information carried by both is exploited. Moreover, the two-dimensional (2D) Generalized Intersection over Union (GIoU) is extended to 3D and used as part of the loss function in our framework. Experiments on Car, Cyclist, and Pedestrian detection were conducted on the KITTI benchmark. Results measured by average precision (AP) demonstrate the effectiveness of the proposed network.
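To make the idea of extending GIoU from 2D to 3D concrete, the following is a minimal sketch for axis-aligned boxes. It is not the authors' implementation: the abstract does not say how box orientation is handled or how the term is weighted in the loss, so the box layout `(x_min, y_min, z_min, x_max, y_max, z_max)`, the function names, and the `1 - GIoU` loss form (the usual choice for the 2D case) are all assumptions made for illustration.

```python
from typing import Sequence

# Assumed layout: (x_min, y_min, z_min, x_max, y_max, z_max), axis-aligned.
Box = Sequence[float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; zero if any side has non-positive length."""
    dx = max(0.0, box[3] - box[0])
    dy = max(0.0, box[4] - box[1])
    dz = max(0.0, box[5] - box[2])
    return dx * dy * dz


def giou_3d(a: Box, b: Box) -> float:
    """Generalized IoU of two axis-aligned 3D boxes, a value in (-1, 1]."""
    # Intersection: the overlap interval along each axis (may be empty).
    inter = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
             min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    inter_vol = box_volume(inter)
    union_vol = box_volume(a) + box_volume(b) - inter_vol

    # Smallest axis-aligned box that encloses both inputs.
    hull = (min(a[0], b[0]), min(a[1], b[1]), min(a[2], b[2]),
            max(a[3], b[3]), max(a[4], b[4]), max(a[5], b[5]))
    hull_vol = box_volume(hull)

    if union_vol <= 0.0 or hull_vol <= 0.0:
        raise ValueError("both boxes are degenerate (zero volume)")

    iou = inter_vol / union_vol
    # Penalize the empty space in the enclosing box that neither input covers.
    return iou - (hull_vol - union_vol) / hull_vol


def giou_3d_loss(pred: Box, target: Box) -> float:
    """Assumed loss form: 1 - GIoU, so a perfect match gives zero loss."""
    return 1.0 - giou_3d(pred, target)
```

Unlike plain IoU, this quantity still varies when the boxes do not overlap, because the enclosing-box term shrinks as the boxes move closer together. That property is what makes GIoU usable as a regression signal.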
With the rapid development of range image acquisition techniques, 3D computer vision has become a popular research area. It has numerous applications in domains including robotics, biometrics, remote sensing, entertainment, civil construction, and medical treatment. Recently, a large number of algorithms have been proposed to address specific problems in 3D computer vision. Meanwhile, several benchmark datasets have been released to stimulate research in this area. The availability of benchmark datasets plays a significant role in technological progress. In this paper, we first introduce several major 3D acquisition techniques. We also present an overview of popular topics in 3D computer vision, including 3D object modeling, 3D model retrieval, 3D object recognition, 3D face recognition, RGB-D vision, and 3D remote sensing. Moreover, we present a contemporary summary of the existing benchmark datasets in 3D computer vision. This paper can therefore serve as a handbook for those working in related areas.
Recognizing 3D objects from point clouds in the presence of significant clutter and occlusion is a highly challenging task. In this paper, we present a coarse-to-fine 3D object recognition algorithm. During offline training, each model is represented by a set of multi-scale local surface features. During online recognition, a set of keypoints is first detected in each scene. The local surfaces around these keypoints are then encoded with multi-scale feature descriptors. These scene features are matched against all model features to generate recognition hypotheses, which comprise model hypotheses and pose hypotheses. Finally, the hypotheses are verified to produce recognition results. The proposed algorithm was tested on two standard datasets, with rigorous comparisons to state-of-the-art algorithms. Experimental results show that our algorithm is fully automatic, highly effective, and very robust to occlusion and clutter. It achieved the best recognition performance on both datasets, demonstrating its superiority over existing algorithms.
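The matching step described above can be illustrated with a small sketch that turns scene-to-model feature matches into ranked model hypotheses. This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the descriptor, the distance measure, or how hypotheses are scored, so the brute-force nearest-neighbour search, the Euclidean distance, the `max_dist` threshold, the vote counting, and all names below are hypothetical. Pose hypotheses and the verification stage are omitted.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Descriptor = Sequence[float]


def nearest_model_feature(
    scene_desc: Descriptor,
    model_library: Dict[str, List[Descriptor]],
) -> Tuple[str, int, float]:
    """Return (model name, feature index, distance) of the closest model feature."""
    best = None
    for name, descriptors in model_library.items():
        for idx, model_desc in enumerate(descriptors):
            dist = math.dist(scene_desc, model_desc)  # Euclidean distance (assumed)
            if best is None or dist < best[2]:
                best = (name, idx, dist)
    if best is None:
        raise ValueError("model library contains no features")
    return best


def model_hypotheses(
    scene_descs: List[Descriptor],
    model_library: Dict[str, List[Descriptor]],
    max_dist: float,
) -> List[Tuple[str, int]]:
    """Rank candidate models by how many scene features matched them."""
    votes: Counter = Counter()
    for desc in scene_descs:
        name, _, dist = nearest_model_feature(desc, model_library)
        # Discard weak matches; the threshold is an illustrative assumption.
        if dist <= max_dist:
            votes[name] += 1
    return votes.most_common()
```

With a toy library of two models, a call such as `model_hypotheses(scene, library, max_dist=0.5)` returns the models ordered by vote count. In a full pipeline each surviving match would also contribute a pose estimate, and the ranked hypotheses would then go through verification.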