György Dénes scite author profile

HDR image reconstruction from a single exposure using deep CNNs

Camera sensors can only capture a limited range of luminance simultaneously, and in order to create high dynamic range (HDR) images a set of different exposures is typically combined. In this paper we address the problem of predicting information that has been lost in saturated image areas, in order to enable HDR reconstruction from a single exposure. We show that this problem is well-suited for deep learning algorithms, and propose a deep convolutional neural network (CNN) that is specifically designed taking into account the challenges in predicting HDR values. To train the CNN we gather a large dataset of HDR images, which we augment by simulating sensor saturation for a range of cameras. To further boost robustness, we pre-train the CNN on a simulated HDR dataset created from a subset of the MIT Places database. We demonstrate that our approach can reconstruct high-resolution, visually convincing HDR results in a wide range of situations, and that it generalizes well to reconstruction of images captured with arbitrary and low-end cameras that use unknown camera response functions and post-processing. Furthermore, we compare to existing methods for HDR expansion, and show high-quality results also for image-based lighting. Finally, we evaluate the results in a subjective experiment performed on an HDR display. This shows that the reconstructed HDR images are visually convincing, with large improvements as compared to existing methods.

Comment: 15 pages, 19 figures, Siggraph Asia 2017. Project webpage located at http://hdrv.org/hdrcnn/, where the paper with high-quality images is available, as well as supplementary material (document, images, video and source code).
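The abstract mentions augmenting HDR data by simulating sensor saturation. As a rough illustration of that idea only (not the authors' augmentation pipeline), the sketch below scales linear HDR values by an exposure factor and clips them at a white level. The function name `simulate_saturation` and the parameters `exposure` and `white_level` are hypothetical, and camera response functions and noise are deliberately left out.

```python
def simulate_saturation(hdr_pixels, exposure=1.0, white_level=1.0):
    """Scale linear HDR values by an exposure factor and clip at the white level.

    Returns the clipped values and a mask of saturated pixels, i.e. the pixels
    whose original values a single-exposure method would have to predict.
    Hypothetical helper for illustration; not taken from the paper.
    """
    clipped = []
    saturated = []
    for value in hdr_pixels:
        exposed = value * exposure
        clipped.append(min(exposed, white_level))
        saturated.append(exposed >= white_level)
    return clipped, saturated


# A toy scanline of linear scene luminances; the last two exceed the sensor range.
scene = [0.05, 0.4, 0.9, 3.5, 120.0]
ldr, mask = simulate_saturation(scene)
print(ldr)
print(mask)
```

In this toy example the values 3.5 and 120.0 both collapse to the white level, so the difference between them is exactly the kind of information that is lost in saturated areas.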

A perceptual model of motion quality for rendering with adaptive refresh-rate and resolution

Dénes, Jindal, Mikhailiuk et al. 2020. ACM Trans. Graph.

Limited GPU performance budgets and transmission bandwidths mean that real-time rendering often has to compromise on the spatial resolution or temporal resolution (refresh rate). A common practice is to keep either the resolution or the refresh rate constant and dynamically control the other variable. But this strategy is non-optimal when the velocity of displayed content varies. To find the best trade-off between the spatial resolution and refresh rate, we propose a perceptual visual model that predicts the quality of motion given an object velocity and predictability of motion. The model considers two motion artifacts to establish an overall quality score: non-smooth (juddery) motion, and blur. Blur is modeled as a combined effect of eye motion, finite refresh rate and display resolution. To fit the free parameters of the proposed visual model, we measured eye movement for predictable and unpredictable motion, and conducted psychophysical experiments to measure the quality of motion from 50 Hz to 165 Hz. We demonstrate the utility of the model with our on-the-fly motion-adaptive rendering algorithm that adjusts the refresh rate of a G-Sync-capable monitor based on a given rendering budget and observed object motion. Our psychophysical validation experiments demonstrate that the proposed algorithm performs better than constant-refresh-rate solutions, showing that motion-adaptive rendering is an attractive technique for driving variable-refresh-rate displays.
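To make the resolution versus refresh-rate trade-off concrete, here is a minimal sketch under simplifying assumptions; it is not the perceptual model from the paper. It assumes a fixed pixel-throughput budget, a display that holds each frame for the whole frame period, and an eye that tracks the object perfectly, so that hold-type blur is roughly velocity times frame duration. The function names and the budget figure are hypothetical.

```python
def hold_blur_deg(velocity_deg_per_s, refresh_hz):
    """Approximate hold-type blur extent (visual degrees) for a tracked object.

    Assumes perfect eye tracking and a frame held for the full frame period.
    """
    return velocity_deg_per_s / refresh_hz


def resolution_scale(budget_pixels_per_s, full_res_pixels, refresh_hz):
    """Fraction of full-resolution pixels affordable per frame (capped at 1)."""
    return min(1.0, budget_pixels_per_s / (full_res_pixels * refresh_hz))


full_res = 1920 * 1080
budget = full_res * 60  # hypothetical budget: full resolution at 60 Hz

for hz in (50, 60, 120, 165):
    blur = hold_blur_deg(30.0, hz)
    scale = resolution_scale(budget, full_res, hz)
    print(f"{hz:>3} Hz: blur {blur:.3f} deg, pixel fraction {scale:.2f}")
```

Raising the refresh rate shrinks the hold-type blur but leaves fewer pixels per frame, which is the tension the abstract describes; deciding where the best quality lies requires a perceptual model such as the one the paper proposes.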

FovVideoVDP

et al. 2021

FovVideoVDP is a video difference metric that models the spatial, temporal, and peripheral aspects of perception. While many other metrics are available, our work provides the first practical treatment of these three central aspects of vision simultaneously. The complex interplay between spatial and temporal sensitivity across retinal locations is especially important for displays that cover a large field-of-view, such as Virtual and Augmented Reality displays, and associated methods, such as foveated rendering. Our metric is derived from psychophysical studies of the early visual system, which model spatio-temporal contrast sensitivity, cortical magnification and contrast masking. It accounts for physical specification of the display (luminance, size, resolution) and viewing distance. To validate the metric, we collected a novel foveated rendering dataset which captures quality degradation due to sampling and reconstruction. To demonstrate our algorithm's generality, we test it on 3 independent foveated video datasets, and on a large image quality dataset, achieving the best performance across all datasets when compared to the state-of-the-art.
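The abstract notes that the metric accounts for display size, resolution and viewing distance. One common way such quantities are combined is the angular resolution in pixels per visual degree; the sketch below shows that standard geometric calculation. It is an illustration only and is not the FovVideoVDP API; the function name and the example display dimensions are assumptions.

```python
import math


def pixels_per_degree(horizontal_resolution, display_width_m, viewing_distance_m):
    """Angular resolution at the display centre, in pixels per visual degree.

    Hypothetical helper: derives the visual angle subtended by one pixel from
    the physical pixel size and the viewing distance.
    """
    pixel_size_m = display_width_m / horizontal_resolution
    deg_per_pixel = math.degrees(2 * math.atan(pixel_size_m / (2 * viewing_distance_m)))
    return 1.0 / deg_per_pixel


# Example: a 3840-pixel-wide, 0.6 m wide display viewed from 0.6 m.
print(round(pixels_per_degree(3840, 0.6, 0.6), 1))
```

Moving the viewer farther away increases the pixels per degree, so the same pixel-level distortion covers a smaller visual angle.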

Temporal Resolution Multiplexing: Exploiting the limitations of spatio-temporal vision for more efficient VR rendering

Dénes, Maruszczyk, Ash et al. 2019. IEEE Trans. Visual. Comput. Graphics

Figure 1 (panel labels: Rendered frames, GPU / Rendering, Transmission, Decoding & display, Perceived stimulus): Our technique renders every second frame at a lower resolution to save on rendering time and data transmission bandwidth. Before the frames are displayed, the low-resolution frames are upsampled and the high-resolution frames are compensated for the lost information. When such a sequence is viewed at a high frame rate, the frames are perceived as though they were rendered at full resolution.
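The caption can be illustrated with a one-dimensional toy example. The sketch assumes static content, simple pair averaging for downsampling, pixel repetition for upsampling, and a compensation rule chosen so that the temporal average of the two displayed frames equals the full-resolution frame. These choices and the function names are assumptions made for illustration and may differ from the published method.

```python
def downsample(frame):
    """Halve the resolution by averaging adjacent pixel pairs (even length assumed)."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame), 2)]


def upsample(low_res):
    """Return to full resolution by repeating each low-resolution pixel twice."""
    return [value for value in low_res for _ in range(2)]


def trm_pair(frame):
    """Build one compensated full-resolution frame and one upsampled low-resolution frame.

    Assumed compensation rule: compensated = 2 * frame - upsampled_low, so that
    averaging the pair over time recovers the original frame.
    """
    low = upsample(downsample(frame))
    compensated = [2 * f - l for f, l in zip(frame, low)]
    return compensated, low


frame = [0.25, 0.75, 0.5, 0.5]
compensated, low = trm_pair(frame)
perceived = [(c + l) / 2 for c, l in zip(compensated, low)]
print(compensated, low, perceived)
```

Compensated values can fall outside the displayable range (for example when a bright detail sits on a dark background), which a real display pipeline would have to handle; this sketch ignores that case.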

A visual model for predicting chromatic banding artifacts

Dénes, Ash, Fang et al. 2019

Quantization of images containing low-texture regions, such as sky, water or skin, can produce banding artifacts. As the bit-depth of each color channel is decreased, smooth image gradients are transformed into perceivable, wide, discrete bands. Commonly used quality metrics cannot reliably measure the visibility of such artifacts. In this paper we introduce a visual model for predicting the visibility of both luminance and chrominance banding artifacts in image gradients spanning between two arbitrary points in a color space. The model analyzes the error introduced by quantization in the Fourier space, and employs a purpose-built spatio-chromatic contrast sensitivity function to predict its visibility. The output of the model is a detection probability, which can then be used to compute the minimum bit-depth for which banding artifacts are just noticeable. We demonstrate that the model can accurately predict the results of our psychophysical experiments.
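How quantization turns a smooth gradient into bands can be shown with a small sketch. It assumes uniform round-to-nearest quantization of a single channel in the range [0, 1] and only counts the resulting bands; it does not attempt the Fourier-domain visibility model described in the abstract. The function names are hypothetical.

```python
def quantize_level(value, bits):
    """Index of the uniform quantization level for a value in [0, 1]."""
    levels = (1 << bits) - 1
    return round(value * levels)


def band_count(start, end, width, bits):
    """Number of distinct quantized levels along a linear gradient of `width` pixels."""
    samples = [start + (end - start) * i / (width - 1) for i in range(width)]
    return len({quantize_level(s, bits) for s in samples})


# A 1024-pixel gradient from black to white at several bit-depths.
for bits in (2, 3, 8):
    bands = band_count(0.0, 1.0, 1024, bits)
    print(f"{bits} bits: {bands} bands, about {1024 / bands:.0f} pixels each")
```

At low bit-depths the gradient splits into a handful of wide bands; at higher bit-depths the bands become narrower and their steps smaller, and whether they remain visible is the question the visual model addresses.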
