Xiaozhi Chen scite author profile

This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both a LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's eye view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection, respectively. In addition, for 2D detection, our approach obtains 10.3% higher AP than the state-of-the-art on the hard data among the LIDAR-based methods.
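The bird's eye view encoding mentioned in the abstract above can be illustrated with a minimal sketch. This is not the MV3D implementation: the function name `bev_height_density`, the grid ranges, and the cell size are all assumptions chosen for illustration, and the abstract does not specify which per-cell features the paper uses. The sketch simply projects points onto a ground-plane grid and keeps the maximum height and the point count per cell.

```python
def bev_height_density(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.5):
    """Project (x, y, z) points onto a ground-plane grid.

    Returns a dict mapping (row, col) -> (max_height, point_count).
    Ranges and cell size are illustrative assumptions, not values from the paper.
    """
    grid = {}
    for x, y, z in points:
        # Skip points that fall outside the region covered by the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        row = int((x - x_range[0]) / cell)
        col = int((y - y_range[0]) / cell)
        max_h, count = grid.get((row, col), (float("-inf"), 0))
        grid[(row, col)] = (max(max_h, z), count + 1)
    return grid


if __name__ == "__main__":
    cloud = [(1.0, 0.0, 0.2), (1.1, 0.1, 1.5), (100.0, 0.0, 0.0)]
    for cell_index, features in bev_height_density(cloud).items():
        print(cell_index, features)
```

A sparse dictionary is used here only to keep the example dependency-free; a dense array would be the more usual input to a convolutional network.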

Monocular 3D Object Detection for Autonomous Driving

Chen

et al. 2016

3D Object Proposals Using Stereo Imagery for Accurate Object Class Detection

Chen

Kundu

Zhu

et al. 2018

IEEE Trans. Pattern Anal. Mach. Intell.

544

628

The goal of this paper is to perform 3D object detection in the context of autonomous driving. Our method aims at generating a set of high-quality 3D object proposals by exploiting stereo imagery. We formulate the problem as minimizing an energy function that encodes object size priors, placement of objects on the ground plane, and several depth-informed features that reason about free space, point cloud densities, and distance to the ground. We then exploit a CNN on top of these proposals to perform object detection. In particular, we employ a convolutional neural net (CNN) that exploits context and depth information to jointly regress to 3D bounding box coordinates and object pose. Our experiments show significant performance gains over existing RGB and RGB-D object proposal methods on the challenging KITTI benchmark. When combined with the CNN, our approach outperforms all existing results in object detection and orientation estimation tasks for all three KITTI object classes. Furthermore, we also experiment with the setting where LIDAR information is available, and show that using both LIDAR and stereo leads to the best result.

Stereo R-CNN Based 3D Object Detection for Autonomous Driving

Chen

Shen

2019

501

335

We propose a 3D object detection method for autonomous driving by fully exploiting the sparse and dense, semantic and geometry information in stereo imagery. Our method, called Stereo R-CNN, extends Faster R-CNN for stereo inputs to simultaneously detect and associate objects in left and right images. We add extra branches after the stereo Region Proposal Network (RPN) to predict sparse keypoints, viewpoints, and object dimensions, which are combined with 2D left-right boxes to calculate a coarse 3D object bounding box. We then recover the accurate 3D bounding box by a region-based photometric alignment using left and right RoIs. Our method does not require depth input or 3D position supervision, yet it outperforms all existing fully supervised image-based methods. Experiments on the challenging KITTI dataset show that our method outperforms the state-of-the-art stereo-based method by around 30% AP on both 3D detection and 3D localization tasks. Code has been released at https://github.com/HKUST-Aerial-Robotics/Stereo-RCNN.
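As a rough illustration of why paired left-right 2D boxes carry 3D information, the sketch below applies standard rectified-stereo geometry: depth equals focal length times baseline divided by disparity. It is not the Stereo R-CNN method, which per the abstract also uses keypoints, viewpoints, and object dimensions. The function names, the box format `(u_min, v_min, u_max, v_max)`, and the camera parameters are assumptions made for this example.

```python
def depth_from_disparity(u_left, u_right, focal_px, baseline_m):
    """Depth (metres) of a point seen at column u_left / u_right in a rectified pair."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def coarse_center_3d(left_box, right_box, focal_px, baseline_m, cx, cy):
    """Back-project the centre of a matched left/right box pair to camera coordinates.

    Boxes are (u_min, v_min, u_max, v_max) in pixels; (cx, cy) is the principal point.
    This treats the box centre as a single 3D point, which is only a coarse proxy.
    """
    u_left = (left_box[0] + left_box[2]) / 2.0
    u_right = (right_box[0] + right_box[2]) / 2.0
    v = (left_box[1] + left_box[3]) / 2.0
    z = depth_from_disparity(u_left, u_right, focal_px, baseline_m)
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z


if __name__ == "__main__":
    # Hypothetical camera: 700 px focal length, 0.5 m baseline.
    print(coarse_center_3d((600, 150, 700, 250), (565, 150, 665, 250),
                           focal_px=700.0, baseline_m=0.5, cx=600.0, cy=200.0))
```

Because depth is inversely proportional to disparity, a one-pixel box error matters far more for distant objects than for near ones, which is one reason a refinement step after the coarse estimate is useful.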

Acoustic emission method for tool condition monitoring based on wavelet analysis

Chen

2006

Int J Adv Manuf Technol

Geometry-based Distance Decomposition for Monocular 3D Object Detection

Shi

Chen

et al. 2021

Multi-View 3D Object Detection Network for Autonomous Driving

Chen

Wan

et al. 2016

Preprint

On the Over-Smoothing Problem of CNN Based Disparity Estimation

Chen

Cheng

2019
