Due to higher resolutions and refresh rates, as well as more photorealistic effects, real-time rendering has become increasingly challenging for video games and emerging virtual reality headsets. To meet this demand, modern graphics hardware and game engines often reduce the computational cost by rendering at a lower resolution and then upsampling to the native resolution. Following the recent advances in image and video superresolution in computer vision, we propose a machine learning approach that is specifically tailored for high-quality upsampling of rendered content in real-time applications. The main insight of our work is that in rendered content, the image pixels are point-sampled, but precise temporal dynamics are available. Our method combines this specific information that is typically available in modern renderers (i.e., depth and dense motion vectors) with a novel temporal network design that takes into account such specifics and is aimed at maximizing video quality while delivering real-time performance. By training on a large synthetic dataset rendered from multiple 3D scenes with recorded camera motion, we demonstrate high fidelity and temporally stable results in real-time, even in the highly challenging 4 × 4 upsampling scenario, significantly outperforming existing superresolution and temporal antialiasing work.
Conventional binocular head-mounted displays (HMDs) vary the stimulus to vergence with the information in the picture, while the stimulus to accommodation remains fixed at the apparent distance of the display, as created by the viewing optics. Sustained vergence-accommodation conflict (VAC) has been associated with visual discomfort, motivating numerous proposals for delivering near-correct accommodation cues. We introduce
focal surface displays
to meet this challenge, augmenting conventional HMDs with a phase-only spatial light modulator (SLM) placed between the display screen and viewing optics. This SLM acts as a dynamic freeform lens, shaping synthesized focal surfaces to conform to the virtual scene geometry. We introduce a framework to decompose target focal stacks and depth maps into one or more pairs of piecewise smooth focal surfaces and underlying display images. We build on recent developments in "optimized blending" to implement a multifocal display that allows the accurate depiction of occluding, semi-transparent, and reflective objects. Practical benefits over prior accommodation-supporting HMDs are demonstrated using a binocular focal surface display employing a liquid crystal on silicon (LCOS) phase SLM and an organic light-emitting diode (OLED) display.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.