2021
DOI: 10.1109/lra.2021.3062324
OmniDet: Surround View Cameras Based Multi-Task Visual Perception Network for Autonomous Driving

Abstract: Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle. This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment. It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection. We demonstrate that the jointly trained model…


Cited by 64 publications (28 citation statements)
References 40 publications
“…For the next generation, a unified CNN model with high synergies would be the likely path. We have recently published an initial prototype OmniDet [114] showing joint modelling of reconstruction and recognition. Figure 16 illustrates its high-level architecture with cross-links shown across the different tasks.…”
Section: Synergies In Next Generation (mentioning)
Confidence: 99%
“…Fig. 16: Overview of our next generation unified multi-task visual perception framework. Refer to our OmniDet paper [114] for more details.…”
(mentioning)
Confidence: 99%
“…Segmentation of panoramic data, which is often captured through distortion-pronounced fisheye lenses [38], [39], [40] or multiple surround-view cameras [41], [42], [43], is challenging as it entails a set of hard tasks such as distortion elimination, camera synchronization and calibration, and data fusion, resulting in higher latency and complexity. Yang et al. introduce the PASS [7] and DS-PASS [44] frameworks, which naturally mitigate the effect of distortions by using a single-shot panoramic annular lens system, but come at a high memory and computation cost, as they require separating the panorama into multiple partitions for prediction, each resembling a narrow-FoV pinhole image.…”
Section: B. Semantic Segmentation For 360° Panoramic Images (mentioning)
Confidence: 99%
“…We start with the OmniDet [18] motion segmentation network, using a two-stream RGB-only network. The network consists of two ResNet18 streams with shared weights and a motion segmentation decoder with deconv layers for upsampling to the higher-resolution output.…”
Section: A. Baseline Architecture (mentioning)
Confidence: 99%