There is an analogy between single-chip color cameras and the human visual system, in that both systems acquire only one limited wavelength-sensitivity band per spatial location. We have exploited this analogy, defining a model that characterizes a one-color-per-spatial-position image as a coding into luminance and chrominance of the corresponding three-colors-per-spatial-position image. Luminance is defined with full spatial resolution, while chrominance contains subsampled opponent colors. Moreover, luminance and chrominance follow a particular arrangement in the Fourier domain, allowing demosaicing by spatial-frequency filtering. This model shows that visual artifacts after demosaicing are due to aliasing between luminance and chrominance and can be reduced with a pre-processing filter. This approach also gives new insight into the representation of single-color-per-spatial-location images and enables formal, controllable procedures for designing demosaicing algorithms that perform well compared with competing approaches, as demonstrated by experiments.
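The frequency-domain view described above can be sketched in code: a Bayer mosaic is treated as luminance (concentrated around DC) plus modulated chrominance (concentrated near the Nyquist frequencies), and each part is recovered by low-pass filtering. This is a minimal illustration of the idea, not the paper's algorithm; the ideal rectangular filter, the GRBG-style layout, and the 0.2 cycles/pixel cutoff are illustrative assumptions.

```python
import numpy as np

def ideal_lowpass(img, cutoff):
    """Ideal rectangular low-pass filter in the Fourier domain
    (cutoff in cycles per pixel)."""
    F = np.fft.fft2(img)
    fy = np.abs(np.fft.fftfreq(img.shape[0]))
    fx = np.abs(np.fft.fftfreq(img.shape[1]))
    keep = (fy[:, None] <= cutoff) & (fx[None, :] <= cutoff)
    return np.real(np.fft.ifft2(F * keep))

def bayer_masks(shape):
    """Indicator masks of R, G, B sample positions (GRBG-style layout)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    g = ((yy + xx) % 2 == 0).astype(float)
    r = ((yy % 2 == 0) & (xx % 2 == 1)).astype(float)
    b = ((yy % 2 == 1) & (xx % 2 == 0)).astype(float)
    return r, g, b

def demosaic_frequency(mosaic, cutoff=0.2):
    """Frequency-selection demosaicing sketch: estimate luminance with a
    low-pass filter, treat the residual as multiplexed chrominance, and
    de-multiplex it per channel by normalized low-pass interpolation."""
    lum = ideal_lowpass(mosaic, cutoff)          # luminance lives near DC
    chroma = mosaic - lum                        # modulated chrominance
    out = np.empty(mosaic.shape + (3,))
    for i, m in enumerate(bayer_masks(mosaic.shape)):
        # interpolate each channel's chrominance from its sample sites
        out[..., i] = lum + ideal_lowpass(chroma * m, cutoff) / ideal_lowpass(m, cutoff)
    return out
```

For a uniformly colored patch the sketch reconstructs the three channels exactly, since all chrominance energy sits at the mask frequencies and is fully separated from luminance by the filter.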
We present a tone-mapping algorithm derived from a model of retinal processing. Our approach offers two major improvements over existing methods. First, tone mapping is applied directly to the mosaic image captured by the sensor, analogous to the human visual system, which applies a nonlinearity to the chromatic responses captured by the cone mosaic. This reduces the number of necessary operations by a factor of 3. Second, we introduce a variation on the center/surround class of local tone-mapping algorithms, which are known to increase the local contrast of images but tend to create artifacts. Our method yields a marked improvement in contrast while avoiding halos and maintaining good global appearance. Like traditional center/surround algorithms, our method uses a weighted average of surrounding pixel values. Instead of being used directly, however, this weighted average serves as a variable in the Naka-Rushton equation, which models the photoreceptors' nonlinearity. Our algorithm provides pleasing results on various images with different scene content and dynamic range.
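The center/surround step described above can be sketched as follows: each pixel is compressed by the Naka-Rushton equation R = V / (V + sigma), with sigma taken as a weighted average of the surrounding pixels. The Gaussian surround and its width are illustrative assumptions, not the paper's exact filter.

```python
import numpy as np

def naka_rushton_tonemap(luminance, surround_sigma=5.0):
    """Local tone mapping: compress each pixel with the Naka-Rushton
    equation, using a Gaussian-weighted average of its neighborhood as
    the semi-saturation constant (the 'surround')."""
    # Separable Gaussian surround (illustrative kernel size of ~6 sigma)
    size = int(6 * surround_sigma) | 1
    ax = np.arange(size) - size // 2
    kernel = np.exp(-ax**2 / (2 * surround_sigma**2))
    kernel /= kernel.sum()
    surround = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, 'same'), 1, luminance)
    surround = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, 'same'), 0, surround)
    # Naka-Rushton compression: output lies in [0, 1)
    return luminance / (luminance + surround + 1e-8)
```

Because the surround adapts locally, bright regions are compressed more where their neighborhood is bright, which is what raises local contrast relative to a single global curve.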
We present a new algorithm that performs demosaicing and super-resolution jointly from a set of raw images sampled with a color filter array. Such a combined approach allows us to compute the alignment parameters between the images on the raw camera data, before interpolation artifacts are introduced. After image registration, a high-resolution color image is reconstructed at once from the full set of images. For this we use normalized convolution, an interpolation method for nonuniformly sampled data. Our algorithm is tested and compared to other approaches in simulations and practical experiments.
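Normalized convolution itself can be sketched in a few lines: known samples are weighted by an applicability function and a certainty map, and the weighted sum is divided by the summed weights, so missing samples simply contribute nothing. The Gaussian applicability function and the brute-force convolution below are illustrative choices, not the paper's implementation.

```python
import numpy as np

def normalized_convolution(samples, certainty, sigma=1.5):
    """Interpolate a nonuniformly sampled image by normalized convolution:
    out = (g * (c * s)) / (g * c), where g is a Gaussian applicability
    function and c is the certainty map (1 where a sample exists)."""
    size = int(6 * sigma) | 1
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

    def conv2(img):
        # Brute-force 2-D convolution with zero padding (clarity over speed)
        pad = size // 2
        p = np.pad(img, pad)
        out = np.zeros_like(img, dtype=float)
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                out[i, j] = np.sum(p[i:i + size, j:j + size] * g)
        return out

    num = conv2(samples * certainty)
    den = conv2(certainty)
    return num / np.maximum(den, 1e-8)
```

In the joint setting described above, the certainty map would mark where registered raw samples of a given color land on the high-resolution grid.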
From moonlight to bright sunshine, real-world visual scenes span a very wide range of luminance; they are said to be High Dynamic Range (HDR). Our visual system is well adapted to explore and analyze such variable visual content. It is now possible to acquire HDR content with digital cameras; however, it cannot be fully rendered on standard displays, which have only Low Dynamic Range (LDR) capabilities. Such rendering usually produces poor exposure or loss of information. It is therefore necessary to develop locally adaptive Tone Mapping Operators (TMOs) that compress HDR content to LDR while keeping as much information as possible. The human retina is known to perform such a task to overcome the limited range of values that neurons can encode. The purpose of this paper is to present a TMO inspired by retinal properties. The presented biological model allows reliable dynamic-range compression with natural color-constancy properties. Moreover, its non-separable spatio-temporal filter enhances HDR video processing with an added temporal constancy.
Background. Commonly manufactured depth sensors generate the kind of depth information that humans normally obtain from their eyes and hands. Various designs converting spatial data into sound have recently been proposed, speculating on their applicability as sensory substitution devices (SSDs). Objective. We tested such a design as a travel aid in a navigation task. Methods. Our portable device (MeloSee) converted the 2-D array of a depth image into melody in real time. Distance from the sensor was translated into sound intensity, laterality into stereo modulation, and verticality into pitch. Twenty-one blindfolded young adults navigated along four different paths during two sessions separated by a one-week interval. In some instances, a dual task required them to recognize a temporal pattern delivered through a tactile vibrator while they navigated. Results. Participants learned to use the system both on new paths and on those they had already navigated. Based on travel time and errors, performance improved from one week to the next. The dual task was achieved successfully, slightly affecting but not preventing effective navigation. Conclusions. The use of Kinect-type sensors to implement SSDs is promising, but it is restricted to indoor use and inefficient at very short range.
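The depth-to-sound mapping described in Methods can be sketched per pixel as follows; the frequency range, the log-spaced pitch scale, and the 4 m maximum range are illustrative assumptions, not the device's actual parameters.

```python
import numpy as np

def depth_to_notes(depth, max_depth=4.0, f_low=220.0, f_high=880.0):
    """Map each pixel of a depth image (in meters) to a (frequency, pan,
    intensity) triple, following the described scheme: row -> pitch,
    column -> stereo position, distance -> loudness (closer = louder)."""
    rows, cols = depth.shape
    notes = []
    for r in range(rows):
        for c in range(cols):
            # verticality -> pitch: top rows map to higher frequencies
            frac = 1.0 - r / max(rows - 1, 1)
            freq = f_low * (f_high / f_low) ** frac  # log-spaced pitch scale
            # lateral position -> stereo pan in [-1 (left), +1 (right)]
            pan = 2.0 * c / max(cols - 1, 1) - 1.0
            # distance -> intensity: nearer obstacles sound louder
            intensity = max(0.0, 1.0 - depth[r, c] / max_depth)
            notes.append((freq, pan, intensity))
    return notes
```

A real-time system would synthesize and mix these triples per frame; the sketch only makes the three mapping axes explicit.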
A recent brain imaging study (Vuilleumier, Armony, Driver and Dolan 2003, Nature Neuroscience, 6, 624-631) has shown that amygdala responses to fearful expressions are preferentially driven by intact or low spatial frequency (LSF) images of faces, rather than by high spatial frequency (HSF) images. These results suggest that LSF components, processed rapidly via magnocellular pathways within the visual system, might be efficiently conveyed to the amygdala for the rapid recognition of fearful expressions, perhaps via a subcortical pathway that activates the pulvinar and superior colliculus but bypasses any finer visual analysis of HSF cues in the striate and temporal extrastriate cortex. The purpose of this paper is to analyse the statistical properties of LSF faces compared with HSF and intact faces. The statistical analysis shows that the LSF components in faces, which are typically extracted rapidly by the visual system, provide a better source of information than HSF components for the correct categorisation of fearful expressions in faces. These results support the idea that visual pathways from the magnocellular visual neurons might be optimal, at a computational level, for the rapid classification of fearful emotional expressions in human faces.
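The LSF/HSF decomposition underlying this comparison can be sketched with an ideal low-pass filter in the Fourier domain; the circular filter shape and the cutoff value below are illustrative assumptions (studies of this kind typically specify a cutoff in cycles per face).

```python
import numpy as np

def split_spatial_frequencies(image, cutoff=8):
    """Split an image into low (LSF) and high (HSF) spatial-frequency
    components with an ideal circular filter in the Fourier domain
    (cutoff in cycles per image; illustrative value)."""
    F = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    yy, xx = np.ogrid[:rows, :cols]
    dist = np.hypot(yy - rows / 2, xx - cols / 2)  # radial frequency
    lsf = np.real(np.fft.ifft2(np.fft.ifftshift(F * (dist <= cutoff))))
    hsf = image - lsf  # residual: everything above the cutoff
    return lsf, hsf
```

By construction the two components sum back to the original image, so any information lost in one band must be present in the other, which is what the statistical comparison in the paper exploits.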
The hue discrimination curve (HDC), which characterizes performance over the entire hue circle, was determined using sinusoidally modulated spectral power distributions of 1.5 c/300 nm with fixed amplitude and twelve reference phases. To investigate the relationship between hue discrimination and appearance, observers also performed free color-naming and unique-hue tasks. The HDC consistently displayed two minima and two maxima; discrimination is optimal at the yellow/orange and blue/magenta boundaries and poorest in green and in the extra-spectral magenta colors. A linear model based on Müller zone theory correctly predicts a periodic profile, but with a phase opponency (minima/maxima 180° apart) that is inconsistent with the empirical HDC's profile.