Shalini De Mello scite author profile

Few-Shot Adaptive Gaze Estimation

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks. Yet there is a need to lower gaze errors further to enable applications requiring higher quality. Further gains can be achieved by personalizing gaze networks, ideally with few calibration samples. However, over-parameterized neural networks are not amenable to learning from few examples as they can quickly over-fit. We embrace these challenges and propose a novel framework for Few-shot Adaptive GaZE Estimation (FAZE) for learning person-specific gaze networks with very few (≤ 9) calibration samples. FAZE learns a rotation-aware latent representation of gaze via a disentangling encoder-decoder architecture along with a highly adaptable gaze estimator trained using meta-learning. It is capable of adapting to any new person to yield significant performance gains with as few as 3 samples, yielding state-of-the-art performance of 3.18° on GazeCapture, a 19% improvement over prior art.

Dynamic Facial Analysis: From Bayesian Filtering to Recurrent Neural Network

Gu, Yang, Mello et al. 2017

Light-Weight Head Pose Invariant Gaze Tracking

Ranjan, Mello, Kautz 2018

Unconstrained remote gaze tracking using off-the-shelf cameras is a challenging problem. Recently, promising algorithms for appearance-based gaze estimation using convolutional neural networks (CNN) have been proposed. Improving their robustness to various confounding factors including variable head pose, subject identity, illumination and image quality remain open problems. In this work, we study the effect of variable head pose on machine learning regressors trained to estimate gaze direction. We propose a novel branched CNN architecture that improves the robustness of gaze classifiers to variable head pose, without increasing computational cost. We also present various procedures to effectively train our gaze network including transfer learning from the more closely related task of object viewpoint estimation and from a large high-fidelity synthetic gaze dataset, which enable our ten times faster gaze network to achieve competitive accuracy to its current state-of-the-art direct competitor.

Switchable Temporal Propagation Network

Liu, Zhong, Mello et al. 2018

Videos contain highly redundant information between frames. Such redundancy has been extensively studied in video compression and encoding, but is less explored for more advanced video processing. In this paper, we propose a learnable unified framework for propagating a variety of visual properties of video images, including but not limited to color, high dynamic range (HDR), and segmentation information, where the properties are available for only a few key-frames. Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner. We theoretically prove two essential factors for TPN: (a) by regularizing the global transformation matrix as orthogonal, the "style energy" of the property can be well preserved during propagation; (b) such regularization can be achieved by the proposed switchable TPN with bi-directional training on pairs of frames. We apply the switchable TPN to three tasks: colorizing a gray-scale video based on a few color key-frames, generating an HDR video from a low dynamic range (LDR) video and a few HDR frames, and propagating a segmentation mask from the first frame in videos. Experimental results show that our approach is significantly more accurate and efficient than the state-of-the-art methods. All the codes and models will be released to the public.

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

Liu, Kim et al. 2020. Preprint.


Shalini De Mello

Efficient Geometry-aware 3D Generative Adversarial Networks

Self-supervised Single-View 3D Reconstruction via Semantic Consistency

GroupViT: Semantic Segmentation Emerges from Text Supervision

Few-Shot Adaptive Gaze Estimation
