2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00466

PointPainting: Sequential Fusion for 3D Object Detection

Cited by 599 publications (391 citation statements). References 23 publications.
“…Then, they removed the 3D convolutional module and processed the pseudo-image into a high-level representation with 2D convolutional blocks alone. PointPainting [16] was an effective sequential fusion method that used semantic segmentation predictions from RGB images to enhance point-cloud features. These one-stage 3D detection heads adopted a set of predefined 3D anchor boxes.…”
Section: Related Work
mentioning confidence: 99%
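The painting step these statements describe — projecting each lidar point into the image and appending the segmentation scores of the pixel it lands on — can be sketched roughly as follows. This is a minimal sketch, assuming a calibrated (3, 4) lidar-to-image projection matrix and per-pixel softmax scores from an arbitrary segmentation network; the function and variable names are illustrative, not taken from the paper's code:

```python
import numpy as np

def paint_points(points, seg_scores, proj_matrix):
    """Append per-pixel class scores to each lidar point ("painting").

    points:      (N, 4) lidar points (x, y, z, reflectance)
    seg_scores:  (H, W, C) softmax scores from an image segmentation net
    proj_matrix: (3, 4) lidar-to-image projection matrix
    Returns (N, 4 + C) painted points.
    """
    H, W, C = seg_scores.shape
    # Homogeneous lidar coordinates, projected onto the image plane
    hom = np.hstack([points[:, :3], np.ones((len(points), 1))])  # (N, 4)
    uvw = hom @ proj_matrix.T                                    # (N, 3)
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
    # Nearest-pixel lookup, clamped to the image bounds
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    # Concatenate each point's class scores onto its raw features
    return np.hstack([points, seg_scores[v, u]])
```

Because the fusion is sequential, any segmentation network can produce `seg_scores`; the painted points then feed an unmodified lidar detector expecting 4 + C input channels.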
“…In the bird's eye view evaluation, the VFE detection results achieved (91.58, 85.83, 80.54) on the three difficulty levels, respectively. Although our one-stage anchor-free network does not need any prior information about anchor boxes during training or prediction, it achieves the same performance as anchor-based one-stage 3D detectors such as SCNet [39], SECOND [4], and PointPainting [16].…”
Section: Experiments on the KITTI Test Set
mentioning confidence: 99%
“…If the raw sensor data are not well aligned in the early stage, the resulting feature dislocation leads to heavy performance degradation. Using the coordinate relationship between the two sensors, PointPainting [5] and PI-RCNN [6] project image semantic segmentation into point-cloud space via a projection matrix. Although this early fusion lets the network handle the aligned two-modality information as a whole without modality-specific adjustment, early-stage fusion also conveys noise from one modality to the other.…”
Section: Introduction
mentioning confidence: 99%
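As a concrete illustration of the projection matrix mentioned above, the lidar-to-image mapping is typically composed from the camera intrinsics and the lidar-to-camera extrinsics. This is a hedged sketch of the standard pinhole construction P = K [R | t]; the symbols K, R, t are conventional calibration notation, not identifiers from either paper:

```python
import numpy as np

def lidar_to_image_matrix(K, R, t):
    """Compose a (3, 4) matrix mapping homogeneous lidar coordinates
    to homogeneous pixel coordinates: P = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(points_xyz, P):
    """Project (N, 3) lidar points to (N, 2) pixel coordinates."""
    hom = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    uvw = hom @ P.T
    # Perspective divide by depth
    return uvw[:, :2] / uvw[:, 2:3]
```

Misalignment in K, R, or t shifts every painted score to the wrong points, which is exactly the feature-dislocation risk the quoted passage raises for early fusion.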
“…Then, PointNets can be applied for 3D bounding box estimation, but the overall procedure relies heavily on the performance of the 2D detector. PointPainting [30] projects the pixel-wise semantic features from an image-based semantic segmentation model onto the corresponding points in the point cloud to boost the performance of 3D object detection.…”
mentioning confidence: 99%
“…It can be observed that the main disadvantage of dense point-pixel fusion methods such as [30] is that they incur a considerable amount of redundant computation. Meanwhile, a BEV-image fusion method allows deep learning-based fusion of feature maps captured from individual viewpoints, but with geometric information losses.…”
mentioning confidence: 99%