Video compression is of utmost importance in automated vehicles and advanced driver assistance systems, where the sensor suite generates vast amounts of video data per second that must be transmitted and processed to support robust situational awareness.
The objective of this paper is to demonstrate that video compression can be optimised for the perception system that will utilise the data. We consider the deployment of deep neural networks for object (i.e. vehicle) detection on compressed video camera data extracted
from the KITTI MoSeg dataset. Preliminary results indicate that re-training the neural network with M-JPEG-compressed videos can improve detection performance on both compressed and uncompressed transmitted data, increasing recall and precision by up to 4% with respect to re-training with
uncompressed data.
<p>Whilst Deep Neural Networks have been developing swiftly, most research has focused on RGB images, a format traditionally optimised for human vision. However, RGB data is a heavily processed and interpolated version of the raw sensor data: the sensor collects one value per pixel, whereas an RGB image for human viewing contains three values per pixel, for red, green and blue. This processing, carried out by the ISP (Image Signal Processor), requires computational resources, time and power, and triples the amount of output data. This work investigates Deep Neural Network based detection using, for training and evaluation, Bayer data generated in different ways from a benchmark automotive dataset (the KITTI dataset). A Deep Neural Network (DNN) is deployed both in unmodified form and modified to accept single-channel images such as Bayer frames. Eleven re-trained versions of the DNN are produced and cross-evaluated across the different data formats. The results demonstrate that the selected DNN achieves the same accuracy when evaluating RGB or Bayer data, without significant degradation in perception (the variation in Average Precision is <1%). Moreover, neither the colour filter array position nor the colour correction matrix appears to contribute significantly to DNN performance. This work demonstrates that Bayer data can be used for object detection in automotive applications without significant performance loss, and that the processing currently performed by the ISP can be avoided, allowing for more efficient sensing-perception systems. </p>
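The Bayer representation described above keeps one colour sample per pixel, which is why it carries a third of the data of an RGB image. The abstract does not describe how its Bayer frames were generated; a minimal sketch of simulating a Bayer mosaic from an RGB frame, assuming an RGGB colour filter array layout (one of several possible positions the work evaluates):

```python
import numpy as np

def rgb_to_bayer_rggb(rgb):
    """Simulate an RGGB Bayer mosaic from an H x W x 3 RGB frame.

    Each output pixel keeps only the one colour channel that an RGGB
    sensor would sample at that position, so the result is a single
    H x W plane: one third of the RGB data volume.
    """
    h, w, _ = rgb.shape
    bayer = np.empty((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites (even row, even col)
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites (even row, odd col)
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites (odd row, even col)
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites (odd row, odd col)
    return bayer
```

Shifting which channel is sampled at which site gives the other colour filter array positions (GRBG, GBRG, BGGR), which the abstract reports as not significantly affecting DNN performance.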