2021 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv48922.2021.00369
VENet: Voting Enhancement Network for 3D Object Detection

Abstract: Hough voting, as demonstrated in VoteNet, is effective for 3D object detection, where voting is a key step. In this paper, we propose a novel VoteNet-based 3D detector with vote enhancement to improve detection accuracy in cluttered indoor scenes. It addresses a limitation of current voting schemes: votes from neighboring objects and background have a significant negative impact. Before voting, we replace the classic MLP with the proposed Attentive MLP (AMLP) in the backbone network to get…
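The abstract's core backbone change is replacing the classic MLP with an Attentive MLP (AMLP) before voting. The excerpt does not give the AMLP design, so the following is only a minimal sketch of one plausible attentive-MLP layer, in which a learned per-channel attention branch reweights the output of a standard fully connected layer. All names (`amlp_layer`, the weight arguments) are illustrative, not the authors' implementation.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def linear(x, w, b):
    # w: out_dim x in_dim weight matrix, b: out_dim bias vector
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def amlp_layer(x, w, b, w_att, b_att):
    """Hypothetical attentive MLP layer: the classic MLP output is
    reweighted by channel-attention scores computed from the same input."""
    feat = relu(linear(x, w, b))             # classic MLP branch
    att = softmax(linear(x, w_att, b_att))   # per-channel attention weights
    return [f * a for f, a in zip(feat, att)]

# With zero attention weights the softmax is uniform (0.5 per channel),
# so the MLP output [1.0, 2.0] is scaled to [0.5, 1.0]:
out = amlp_layer([1.0, 2.0],
                 [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],   # MLP branch
                 [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])   # attention branch
```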

Cited by 40 publications (11 citation statements). References 45 publications (42 reference statements).
“…The results are summarized in Table 2. With the same backbone network of a standard PointNet++, our approach achieves 70.2 mAP@0.25 and 54.2 mAP@0.5 using 66 rays and 256 object candidates, which is 2.5 and 3.3 better than previous best methods [42,7] using the same backbones. With stronger backbones and more sampled object candidates just like [21], i.e., 2× more channels and 512 candidates, our approach is also improved dramatically, achieving 70.6 mAP@0.25 and 55.2 mAP@0.5, which is still 1.5 and 2.4 better than [21].…”
Section: Comparison With State-of-the-art Methods
confidence: 93%
“…To evaluate point-based GHA for object detection, we experiment on ScanNet detection dataset [11], which contains 1,513 indoor scans with annotated bounding boxes, split into 1,201 scenes for training and 312 for validation. Following prior works [36,6,70,59], we report the mean average precision, specifically mAP 25 and mAP 50 , computed at a 0.25 and 0.5 IoU threshold respectively. Setting.…”
Section: Point-GHA: 3D Object Detection
confidence: 99%
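The mAP 25 and mAP 50 metrics cited above count a predicted box as correct when its 3D IoU with a ground-truth box exceeds 0.25 or 0.5 respectively. As a quick illustration of the threshold (not the papers' evaluation code), here is 3D IoU for axis-aligned boxes:

```python
def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes,
    each given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:          # no overlap along this axis
            return 0.0
        inter *= hi - lo

    def vol(box):
        return ((box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2]))

    return inter / (vol(a) + vol(b) - inter)

# Two unit cubes overlapping by half along x: intersection 0.5,
# union 1.5, so IoU = 1/3 -- a match under the 0.25 threshold
# (mAP@0.25) but a miss under the stricter 0.5 threshold (mAP@0.5).
iou = iou_3d((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1.5, 1, 1))
```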
“…Another technical branch is point-based methods. Point clouds have emerged as a great powerful representation for 3D deep learning tasks, such as classification [15][16][17][18][19][20], semantic segmentation [21][22][23], point cloud normal estimation [24], 3D reconstruction [25][26][27], and 3D object detection [28][29][30][31]. Most of these works adopt raw point clouds to extract expressive representations based on pioneering work PointNet/PointNet++ [15,16].…”
Section: Introductionmentioning
confidence: 99%