Intelligent driver assistance systems are becoming increasingly popular in modern passenger vehicles. A crucial component of intelligent vehicles is the ability to detect vulnerable road users (VRUs) for an early and safe response. However, standard imaging sensors perform poorly in conditions of strong illumination contrast, such as approaching a tunnel or at night, due to their dynamic range limitations. In this paper, we focus on the use of high-dynamic-range (HDR) imaging sensors in vehicle perception systems and the subsequent need for tone mapping of the acquired data into a standard 8-bit representation. To our knowledge, no previous studies have evaluated the impact of tone mapping on object detection performance. We investigate the potential for optimizing HDR tone mapping to achieve a natural image appearance while facilitating object detection by state-of-the-art detectors designed for standard dynamic range (SDR) images. Our proposed approach relies on a lightweight convolutional neural network (CNN) that tone maps HDR video frames into a standard 8-bit representation. We introduce a novel training approach called detection-informed tone mapping (DI-TM) and evaluate its effectiveness and robustness in various scene conditions, as well as its performance relative to an existing state-of-the-art tone mapping method. The results show that the proposed DI-TM method achieves the best detection performance in challenging dynamic range conditions, while both methods perform well in typical, non-challenging conditions. In challenging conditions, our method improves the detection F2 score by 13%. Compared to SDR images, the increase in F2 score is 49%.
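The abstract reports results as an F2 score without defining it. The sketch below assumes the standard F-beta definition with beta = 2, which weights recall more heavily than precision; this suits VRU detection, where a missed pedestrian is costlier than a false alarm. The function name `f_beta` and the example counts are illustrative and do not come from the paper.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from detection counts (assumed standard definition).

    tp: true positives, fp: false positives, fn: false negatives.
    beta > 1 weights recall more than precision; beta = 2 gives F2.
    """
    b2 = beta * beta
    denom = (1.0 + b2) * tp + b2 * fn + fp
    # With no detections and no ground truth the score is undefined; return 0.
    return (1.0 + b2) * tp / denom if denom else 0.0


# Hypothetical counts: perfect precision but 4 of 10 objects missed.
f2 = f_beta(tp=6, fp=0, fn=4)            # 30 / 46
f1 = f_beta(tp=6, fp=0, fn=4, beta=1.0)  # 12 / 16
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

With these counts F2 is lower than F1, because the missed detections are penalized more strongly under beta = 2.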
Interpolation from a Color Filter Array (CFA) is the most common method for obtaining full color image data. Its success relies on the smart combination of a CFA and a demosaicing algorithm. Demosaicing, on the one hand, has been extensively studied. Algorithmic development in the past 20 years ranges from simple linear interpolation to modern neural-network-based (NN) approaches that encode the prior knowledge of millions of training images to fill in missing data in an inconspicuous way. CFA design, on the other hand, is less well studied, although it is still recognized to strongly impact demosaicing performance. This is because demosaicing algorithms are typically limited to one particular CFA pattern, impeding straightforward CFA comparison. This is starting to change with newer classes of demosaicing that may be considered generic or CFA-agnostic. In this study, by comparing the performance of two state-of-the-art generic algorithms, we evaluate the potential of modern CFA-demosaicing. We test the hypothesis that, with the increasing power of NN-based demosaicing, the influence of optimal CFA design on system performance decreases. This hypothesis is supported by the experimental results. This finding heralds the possibility of relaxing CFA requirements, providing more freedom in the CFA design choice and producing high-quality cameras.
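To make the CFA and demosaicing pairing concrete, the sketch below samples an RGB image through a CFA and then fills in the missing values by averaging same-color neighbors in a 3x3 window, which corresponds to the simple linear interpolation end of the range mentioned above. The RGGB Bayer layout and all function names are assumptions chosen for illustration; the abstract does not name a specific pattern, and this is not one of the generic algorithms the study compares.

```python
def bayer_channel(y, x):
    """Channel (0=R, 1=G, 2=B) sampled at pixel (y, x) in an assumed RGGB layout."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2


def mosaic(rgb):
    """Keep one color sample per pixel, as a CFA sensor would."""
    return [[rgb[y][x][bayer_channel(y, x)] for x in range(len(rgb[0]))]
            for y in range(len(rgb))]


def demosaic_linear(raw):
    """Estimate missing colors by averaging same-color samples in a 3x3 window.

    Requires an image of at least 2x2 pixels so every window holds all colors.
    """
    h, w = len(raw), len(raw[0])
    out = [[[0.0, 0.0, 0.0] for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sums, counts = [0.0, 0.0, 0.0], [0, 0, 0]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        c = bayer_channel(yy, xx)
                        sums[c] += raw[yy][xx]
                        counts[c] += 1
            for c in range(3):
                out[y][x][c] = sums[c] / counts[c]
            # The measured sample is kept exactly rather than averaged.
            out[y][x][bayer_channel(y, x)] = float(raw[y][x])
    return out


# A flat-colored 4x4 test image is reconstructed without error.
flat = [[(10, 20, 30) for _ in range(4)] for _ in range(4)]
restored = demosaic_linear(mosaic(flat))
print(restored[1][2])
```

Changing `bayer_channel` is enough to try a different periodic pattern with this interpolator, provided every 3x3 window still contains all three colors. That is a toy version of the CFA-agnostic property the abstract refers to.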