Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space and is critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.
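The two computations this abstract centres on, integrating self-motion to track position and reading out a direct goal vector from a Euclidean metric, can be sketched in a few lines. This is an illustration of the underlying problem only, not the paper's recurrent network; the function names are hypothetical.

```python
import numpy as np

def path_integrate(start, velocities):
    # Path integration: accumulate self-motion (velocity) signals
    # over time to track current position.
    return start + np.cumsum(velocities, axis=0)

def goal_vector(position, goal):
    # Vector-based navigation: the direct displacement to the goal,
    # recoverable once the agent has a Euclidean spatial metric.
    return goal - position

start = np.zeros(2)
vels = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
path = path_integrate(start, vels)  # positions after each step
# final position is the running sum of displacements: (2, 2)
```

In the paper this integration is learned by a recurrent network rather than computed in closed form, and grid-like units emerge in the process; the sketch only fixes what "path integration" and "vector-based navigation" mean operationally.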
Scene representation, the process of converting visual sensory data into concise descriptions, is a requirement for intelligent behavior. Recent work has shown that neural networks excel at this task when provided with large, labeled datasets. However, removing the reliance on human labeling remains an important open problem. To this end, we introduce the Generative Query Network (GQN), a framework within which machines learn to represent scenes using only their own sensors. The GQN takes as input images of a scene taken from different viewpoints, constructs an internal representation, and uses this representation to predict the appearance of that scene from previously unobserved viewpoints. The GQN demonstrates representation learning without human labels or domain knowledge, paving the way toward machines that autonomously learn to understand the world around them.
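The key structural idea, folding several (image, viewpoint) observations into one scene representation that a decoder can later query, can be sketched as follows. This is a toy stand-in, not the GQN architecture: the linear encoder and the dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 12))  # toy encoder weights (hypothetical)

def encode(image_vec, viewpoint):
    # Encode one (image, viewpoint) observation into a code vector.
    return W @ np.concatenate([image_vec, viewpoint])

def scene_representation(observations):
    # Summing per-view codes yields a representation that is
    # invariant to the order in which viewpoints were observed,
    # mirroring the GQN's aggregation of multiple input views.
    return sum(encode(img, vp) for img, vp in observations)

obs = [(rng.standard_normal(9), rng.standard_normal(3)) for _ in range(4)]
r1 = scene_representation(obs)
r2 = scene_representation(obs[::-1])  # reversed viewpoint order
```

In the GQN the encoder is a deep convolutional network and the representation conditions a generative model that renders the scene from a query viewpoint; the sketch only shows the order-invariant aggregation step.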
We present a model for early vision tasks such as denoising, super-resolution, deblurring, and demosaicing. The model provides a resolution-independent representation of discrete images which admits a truly rotationally invariant prior. The model generalizes several existing approaches: variational methods, finite element methods, and discrete random fields. The primary contribution is a novel energy functional, not previously written down, which combines the discrete measurements from pixels with a continuous-domain world viewed through continuous-domain point-spread functions. The value of the functional is that simple priors (such as total variation and its generalizations) on the continuous-domain world become realistic priors on the sampled images. We show that, despite its apparent complexity, optimization of this model depends on just a few computational primitives which, although tedious to derive, can now be reused in many domains. We define a set of optimization algorithms which overcome this apparent complexity and make practical application of the model possible. New experimental results include infinite-resolution upsampling, and a method for obtaining "subpixel superpixels".
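The general shape of such an energy, a data term tying the estimate to pixel measurements plus a total-variation prior, can be illustrated with a fully discrete 1-D analogue. This is not the paper's continuous-domain functional (which involves point-spread functions and a resolution-independent representation); it is a minimal sketch of the "data term + TV prior" pattern, with a smoothed TV so plain gradient descent applies.

```python
import numpy as np

def energy(u, y, lam=1.0, eps=0.1):
    # Data term (fit to pixel measurements y) plus a smoothed
    # total-variation prior on the estimate u.
    data = 0.5 * np.sum((u - y) ** 2)
    tv = np.sum(np.sqrt(np.diff(u) ** 2 + eps ** 2))
    return data + lam * tv

def denoise(y, lam=1.0, steps=2000, lr=0.02, eps=0.1):
    # Minimize the energy by gradient descent.
    u = y.astype(float).copy()
    for _ in range(steps):
        grad = u - y                        # gradient of data term
        d = np.diff(u)
        w = d / np.sqrt(d ** 2 + eps ** 2)  # gradient of smoothed TV
        grad[:-1] -= lam * w
        grad[1:] += lam * w
        u -= lr * grad
    return u

y = np.array([0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0])  # signal with a spike
u = denoise(y)  # TV prior pulls the isolated spike down
```

The paper's contribution is precisely that priors like this, placed on a continuous-domain world rather than on the samples, remain realistic after sampling; the sketch shows only the discrete baseline being generalized.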
In this paper, a system which constructs a mosaic image of the tunnel surface with little distortion is presented. The tunnel surface is typically composed of a roughly cylindrical surface and protuberant regions containing objects such as pipes, pans and tunnel ridges. Since the true surface is neither planar nor quadric, existing mosaicing methods, which assume either homography or quadratic motion models, suffer from distortion. The proposed system obtains a sparse 3D model of the tunnel by multi-view reconstruction. A Support Vector Machine (SVM) classifier is then applied to separate image features lying on the cylindrical surface from those that do not. The reconstructed 3D points are reprojected into the images to retrieve the priors given by the SVM classifier for accurate cylindrical surface estimation. The final mosaic image is obtained by flattening the estimated textured surface onto a plane. The results suggest that the mosaic quality depends critically on the surface estimation accuracy, and that the proposed system is able to produce a mosaic image that preserves physical properties, e.g. line parallelism and straightness, which are important for tunnel inspection.
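The final flattening step, unrolling the estimated cylindrical surface onto a plane, is what preserves line parallelism and straightness, because arc length along the cylinder maps isometrically to the plane. A minimal sketch of that unrolling for a z-axis-aligned cylinder (an assumption for illustration; the paper estimates the surface from data):

```python
import numpy as np

def flatten_cylinder(points, radius):
    # Unroll 3D points lying on a cylinder (axis = z) onto a plane:
    # the planar x-coordinate is arc length (radius * angle), the
    # planar y-coordinate is height. This map is an isometry of the
    # surface, so straight lines on the surface stay straight.
    x, y, z = points.T
    theta = np.arctan2(y, x)
    return np.column_stack([radius * theta, z])

# sample points on a unit cylinder at increasing angle and height
theta = np.array([0.0, 0.5, 1.0])
pts = np.column_stack([np.cos(theta), np.sin(theta), [0.0, 1.0, 2.0]])
flat = flatten_cylinder(pts, radius=1.0)
```

For a unit cylinder the planar x-coordinate equals the angle itself, and heights carry over unchanged; a real tunnel additionally requires estimating the axis and radius, which is the role of the SVM-guided surface fit in the paper.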
Figure 1: An example of our new Roto++ tool working with a professional artist to increase productivity. The artist has already specified a number of keyframes but is not satisfied with one of the intermediate frames. Under standard baselines, correcting the erroneous curve requires moving the individual control points of the spline. Using our new shape model, we are able to provide an Intelligent Drag Tool that can generate likely shapes given the other keyframes. In our new interaction, the user simply selects the incorrect points and drags them towards the correct shape. Our shape model then proposes the new control point locations, allowing the correction to be performed in a single operation.
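One simple way such a drag interaction can work, sketched here under the assumption of a linear shape basis (e.g. learned from the artist's existing keyframes), is to solve a small least-squares problem: find the shape-model coefficients whose shape best matches the dragged points, then read off all control points. The function and variable names are illustrative, not the Roto++ API.

```python
import numpy as np

def propose_shape(basis, mean, dragged_idx, dragged_pos):
    # Fit shape-model coefficients so that the model's prediction
    # at the dragged control points matches where the user dragged
    # them, then return ALL control points of the proposed shape.
    A = basis[dragged_idx]                       # rows for dragged points
    b = dragged_pos - mean[dragged_idx]          # drag offsets from mean
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mean + basis @ coeffs

# toy 1-D model over 4 control points with a single "move together"
# mode; dragging one point should therefore move all of them
mean = np.zeros(4)
basis = np.ones((4, 1))
new = propose_shape(basis, mean, np.array([1]), np.array([2.0]))
# every control point follows the dragged one under this mode
```

The point of the sketch is the interaction: the user edits a few points, and the model propagates a plausible correction to the rest, matching the single-operation fix the caption describes.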