2022
DOI: 10.1007/978-3-031-19839-7_23
CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection

Abstract: Robust 3D object detection is critical for safe autonomous driving. Camera and radar sensors are synergistic as they capture complementary information and work well under different environmental conditions. Fusing camera and radar data is challenging, however, as each of the sensors lacks information along a perpendicular axis, that is, depth is unknown to camera and elevation is unknown to radar. We propose the camera-radar matching network CramNet, an efficient approach to fuse the sensor readings from camer…
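The abstract's core idea — a camera pixel knows its viewing direction but not its depth, so radar features sampled at candidate depths along that ray can resolve the ambiguity via attention — can be illustrated with a minimal sketch. This is not CramNet's actual architecture; all function and variable names below are illustrative assumptions, and the real method operates on full feature maps with learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ray_constrained_cross_attention(cam_feat, radar_feats):
    """Fuse one camera-pixel feature with radar features sampled along
    that pixel's viewing ray (the axis where depth is unknown).

    cam_feat:    (d,)   feature at one camera pixel (the query)
    radar_feats: (k, d) radar features at k candidate depths along the
                 ray (keys and values)
    Returns a (d,) fused feature: a similarity-weighted average over
    the ray samples, so attention effectively picks the depth where
    the radar response agrees with the camera feature.
    """
    d = cam_feat.shape[0]
    scores = radar_feats @ cam_feat / np.sqrt(d)  # (k,) per-depth similarity
    weights = softmax(scores)                     # attention over ray samples
    return weights @ radar_feats                  # weighted sum of radar features
```

In the full setting, this lookup would run for every foreground pixel, with separate learned query/key/value projections; the sketch keeps only the ray-constrained attention pattern itself.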

Cited by 25 publications (11 citation statements).
References 54 publications (102 reference statements).
“…For the proposed method, the most relevant baseline to compare against is FFT-RadNet. This is because the FFT-RadNet method is specifically designed for the RADIal dataset: a MIMO pre-encoder module (a dilated convolution layer) is designed to fully exploit the Doppler division multiplexing (DDM) scheme used on this radar, while other methods such as those in [19,12,36,17,26,16,24] are designed for other types of radar data, such as radar intensity maps and radar radio-frequency images, so they are unlikely to be competitive on the RADIal dataset without major changes. Furthermore, to highlight the effects of our novel techniques, the proposed ADCNet follows FFT-RadNet, with a novel learnable SP module incorporated into the neural network.…”
Section: Baseline Methods
confidence: 99%
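The citing work above attributes FFT-RadNet's fit to the RADIal dataset to a dilated-convolution pre-encoder. As a hypothetical one-dimensional sketch (the real pre-encoder operates on 2-D radar tensors, and this function name is illustrative, not from either paper), a dilated convolution spaces its kernel taps apart, widening the receptive field without adding parameters — useful for de-interleaving a DDM-multiplexed signal:

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """1-D dilated convolution with valid padding: kernel taps are
    spaced `dilation` samples apart, so a k-tap kernel spans
    (k - 1) * dilation + 1 input samples."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return np.array([
        sum(kernel[j] * signal[i + j * dilation] for j in range(k))
        for i in range(len(signal) - span + 1)
    ])

# With dilation=2, a two-tap sum kernel pairs each sample with the
# one two steps ahead, skipping the interleaved sample between them.
out = dilated_conv1d(np.arange(6.0), np.array([1.0, 1.0]), 2)
# -> [2. 4. 6. 8.]
```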
“…There is a growing body of work attacking the AV perception problem by means of low-level radar data [19,12,36,17,26,16,24]. CramNet [12] proposes a method for fusing radar Range-Azimuth (RA) images with camera images using the attention mechanism. In [19], a graph convolution network is developed to work with radar RA data.…”
Section: Related Work
confidence: 99%
“…RGB-RF Fusion. To enhance robustness in dark or adverse weather conditions (Qian et al. 2021), several studies have investigated the fusion of RGB and RF modalities (Long et al. 2021b,a; Bijelic et al. 2020; Nabati and Qi 2021; Cheng, Xu, and Liu 2021; Hwang et al. 2022). Most of these studies have focused on outdoor sensing applications, such as autonomous driving (Nabati and Qi 2021; Cheng, Xu, and Liu 2021; Dong et al. 2021; Hwang et al. 2022), where RGB images and 2D RF bird's-eye-view (BEV) images have been spatially fused.…”
Section: Related Work
confidence: 99%
“…LiDAR-camera fusion is arguably the most common and well-studied modality fusion configuration [19,29,32,43,44]. There is also work on camera-radar fusion [13]. Fusion methods generally combine information at input level (early fusion), feature level, or decision level (late fusion) [30].…”
Section: Related Work
confidence: 99%