Environment perception remains one of the key tasks in autonomous driving for which solutions have yet to reach maturity. Multi-modal approaches benefit from the complementary physical properties of each sensor technology, boosting overall performance. However, the added complexity introduced by data fusion is not trivial to manage, with design decisions heavily influencing the balance between result quality and latency. In this paper we present a novel real-time, 360° enhanced perception component based on low-level fusion between the geometry provided by LiDAR 3D point clouds and the semantic scene information obtained from multiple RGB cameras of several types. This multi-modal, multi-sensor scheme enables better range coverage and improved detection and classification quality with increased robustness. Semantic, instance and panoptic segmentations of the 2D data are computed using efficient deep-learning-based algorithms, while the 3D point clouds are segmented using a fast, traditional voxel-based solution. Finally, the fusion, obtained through point-to-image projection, yields a semantically enhanced 3D point cloud that enables improved perception through 3D detection refinement and 3D object classification. The planning and control systems of the vehicle receive the individual sensors' perception together with the enhanced one, as well as the semantically enhanced 3D points. The developed perception solutions were successfully integrated into an autonomous vehicle software stack as part of the UP-Drive project.
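The core fusion step described above is a standard point-to-image projection. As an illustrative sketch only (the abstract publishes no code), the fragment below assigns a semantic class to each LiDAR point by transforming it into a camera frame and sampling the 2D segmentation map; the names `T_cam_lidar` (extrinsic transform), `K` (camera intrinsics) and `seg_map` (per-pixel class ids) are our own assumptions, not the paper's.

```python
import numpy as np

def label_points_with_semantics(points_lidar, T_cam_lidar, K, seg_map):
    """Assign a semantic label to each LiDAR point by projecting it
    into a camera image and sampling the 2D segmentation map.

    points_lidar: (N, 3) points in the LiDAR frame
    T_cam_lidar:  (4, 4) LiDAR-to-camera extrinsic transform
    K:            (3, 3) camera intrinsic matrix
    seg_map:      (H, W) per-pixel semantic class ids
    """
    # Transform points from the LiDAR frame to the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera.
    in_front = pts_cam[:, 2] > 0.1

    # Perspective projection with the camera intrinsics.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]

    h, w = seg_map.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Points that project outside the image keep the "no label" marker -1.
    labels = np.full(len(points_lidar), -1, dtype=int)
    labels[valid] = seg_map[v[valid], u[valid]]
    return labels
```

With several cameras, as in the 360° setup described, this projection would be repeated per camera and the per-point labels merged, e.g. preferring the camera with the smallest viewing angle to the point.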
The European UP-Drive project addresses transportation-related challenges by providing key contributions that enable fully automated vehicle navigation and parking in complex urban areas, resulting in a safer, more inclusive, affordable and environmentally friendly transportation system. For this purpose, the project consortium developed a prototype electric vehicle equipped with camera and LiDAR sensors that is capable of autonomously driving around the city and finding available parking spots. Within UP-Drive, we created an accurate, robust and redundant multi-modal environment perception system that provides 360° coverage around the vehicle. This paper summarizes the project's work on surround-view semantic perception using fisheye cameras and narrow field-of-view semantic virtual cameras. Deep-learning-based semantic, instance and panoptic segmentation networks that satisfy the accuracy and efficiency requirements have been developed and integrated into the final prototype. The UP-Drive automated vehicle has been successfully demonstrated in urban areas after extensive experiments and numerous field tests.
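One plausible reading of the "narrow field-of-view semantic virtual cameras" is that perspective views are resampled from the fisheye imagery so that segmentation networks trained on pinhole images can be reused. The sketch below shows such a remapping with OpenCV's fisheye model; the calibration inputs (`K_fish`, `D_fish`, the virtual intrinsics `K_virt` and the view rotation `R`) are hypothetical, and the paper may use a different construction.

```python
import cv2
import numpy as np

def make_virtual_pinhole_view(fisheye_img, K_fish, D_fish, K_virt, R, out_size):
    """Remap a fisheye image into a narrow field-of-view pinhole
    ('virtual camera') view.

    K_fish, D_fish: fisheye intrinsics and distortion coefficients
    K_virt:         intrinsics of the desired virtual pinhole camera
    R:              (3, 3) rotation selecting the virtual view direction
    out_size:       (width, height) of the virtual image
    """
    # Precompute the per-pixel lookup maps from virtual view to fisheye image.
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K_fish, D_fish, R, K_virt, out_size, cv2.CV_16SC2)
    # Sample the fisheye image through the maps to build the virtual view.
    return cv2.remap(fisheye_img, map1, map2, interpolation=cv2.INTER_LINEAR)
```

Since the lookup maps depend only on the calibration, they can be computed once and reused per frame, which matters for the real-time constraint mentioned above.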
The driving environment is complex and dynamic, and the driver's attention is continuously challenged; computer-based assistance achieved by processing image and sensor data can therefore increase traffic safety. While active sensors and stereovision have the advantage of obtaining 3D data directly, monocular vision is easy to set up and can benefit from the increasing computational power of smart mobile devices, and from the fact that almost all of them come with an embedded camera. Several driving assistance applications are available for mobile devices, but they mostly target simple scenarios and a limited range of obstacle shapes and poses. This paper presents a technique for generic, shape-independent, real-time obstacle detection on mobile devices, based on a dynamic, free-form 3D representation of the environment: the particle-based occupancy grid. Images acquired in real time from the smart mobile device's camera are processed by removing the perspective effect and segmenting the resulting bird's-eye-view image to identify candidate obstacle areas, which are then used to update the occupancy grid. The tracked occupancy grid cells are grouped into obstacles represented as cuboids with position, size, orientation and speed. The easy-to-set-up system reliably detects most obstacles in urban traffic, and its measurement accuracy is comparable to that of a stereovision system.
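The "removing the perspective effect" step is classic inverse perspective mapping: a homography warps the road plane into a top-down, bird's-eye view in which anything violating the flat-ground assumption stands out as a candidate obstacle area. A minimal sketch is shown below, assuming the four road-plane point correspondences (`src_pts` to `dst_pts`) come from an offline calibration; the names and values are illustrative, not taken from the paper.

```python
import cv2
import numpy as np

def bird_eye_view(frame, src_pts, dst_pts, out_size):
    """Warp the road plane of a forward-facing camera frame into a
    top-down bird's-eye view via a plane-to-plane homography.

    src_pts:  (4, 2) pixel coordinates of a road-plane quadrilateral
    dst_pts:  (4, 2) where those points should land in the top-down view
    out_size: (width, height) of the output image
    """
    H = cv2.getPerspectiveTransform(src_pts.astype(np.float32),
                                    dst_pts.astype(np.float32))
    return cv2.warpPerspective(frame, H, out_size)

# Example calibration: a trapezoid on the road mapped to a rectangle,
# so equal distances on the road become equal distances in the image.
src = np.array([[420, 480], [860, 480], [1180, 700], [100, 700]])
dst = np.array([[100, 0], [400, 0], [400, 600], [100, 600]])
# top_down = bird_eye_view(frame, src, dst, (500, 600))
```

In the top-down image, metric cell sizes of the occupancy grid correspond to fixed pixel distances, which is what makes the grid update straightforward.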
This paper presents SmartCoDrive, an Android application that performs driving assistance functions: 3D lane detection and tracking, forward obstacle detection, and obstacle tracking. With this mobile application we aim to increase the adoption rate of driving assistance systems and to provide a viable and affordable solution for every driver, who will be able to use their own tablet or smartphone as a personal driving assistant. The application is deployed on a tablet equipped with dual back-facing cameras. The visual information from the two cameras, along with data received from the vehicle's Controller Area Network bus, enables a thorough understanding of the 3D environment. First, we develop the sparse 3D reconstruction algorithm. Then, using monocular vision, we perform lane-marking detection. Obstacle detection combines superpixel segmentation with the 3D information, and the tracking algorithm is based on the Kalman filter. Since the processing capabilities of mobile platforms are limited, various optimizations are carried out to obtain a real-time implementation. The Android application can be used in urban traffic characterized by low speeds and short-to-medium distances to obstacles.
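The abstract states only that tracking is based on the Kalman filter; a common instantiation for this kind of obstacle tracking is a constant-velocity filter over the obstacle's ground-plane position. The sketch below assumes that state layout and the noise values shown, all of which are our own choices rather than the paper's.

```python
import numpy as np

class ObstacleTracker:
    """Constant-velocity Kalman filter for one obstacle.

    State is [x, z, vx, vz]: lateral/longitudinal ground-plane position
    and its velocity. Only the position is measured per frame.
    """
    def __init__(self, x0, z0, dt=0.033):
        self.x = np.array([x0, z0, 0.0, 0.0])   # state estimate
        self.P = np.eye(4)                       # state covariance
        self.F = np.eye(4)                       # constant-velocity motion
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                    # observe position only
        self.Q = np.eye(4) * 0.01                # process noise (assumed)
        self.R = np.eye(2) * 0.25                # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, meas_x, meas_z):
        z = np.array([meas_x, meas_z])
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x
```

Per frame, `predict()` extrapolates each tracked obstacle and `update()` corrects it with the position measured from the superpixel/3D detection stage.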