A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation

Cheng, Yihua; Huang, Sihao; Wang, Fei; Qian, Chen; Lu, Feng

doi:10.1609/aaai.v34i07.6636

Cited by 112 publications

(88 citation statements)

References 26 publications

(37 reference statements)

“…recent studies propose to use attention mechanism for fusing two eye features. Cheng et al [49] argue that the weights of two eye features are determined by face images due to the specific task in [49], so they assign weights with the guidance of facial features. Bao et al [50] propose a self-attention mechanism to fuse two eye features.…”

Section: A Deep Feature From Appearance (mentioning)

confidence: 99%
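
The face-guided weighting described in the statement above can be illustrated with a minimal sketch. It assumes each eye and the face are already encoded as feature vectors, and that a linear projection of the face features yields one score per eye. The names `fuse_eye_features` and `w_attn` and all dimensions are hypothetical, chosen only for illustration; they are not taken from [49] or [50].

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_eye_features(left_feat, right_feat, face_feat, w_attn):
    """Fuse two eye feature vectors with weights derived from face features.

    w_attn is a hypothetical (face_dim, 2) projection, not taken from [49].
    """
    scores = face_feat @ w_attn      # one score per eye, shape (2,)
    weights = softmax(scores)        # non-negative, sums to 1
    fused = weights[0] * left_feat + weights[1] * right_feat
    return fused, weights

rng = np.random.default_rng(0)
left_feat = rng.normal(size=4)
right_feat = rng.normal(size=4)
face_feat = rng.normal(size=8)
w_attn = np.zeros((8, 2))            # a zero projection gives equal weights
fused, weights = fuse_eye_features(left_feat, right_feat, face_feat, w_attn)
```

With a zero projection both eyes receive weight 0.5, so the fused vector is the plain average; a learned projection would shift weight toward the eye that the face features suggest is more reliable.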

Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark

Cheng, Wang, Bao et al. 2021

Preprint

Self Cite

Gaze estimation reveals where a person is looking. It is an important clue for understanding human intention. The recent development of deep learning has revolutionized many computer vision tasks, and appearance-based gaze estimation is no exception. However, the field lacks a guideline for designing deep learning algorithms for gaze estimation tasks. In this paper, we present a comprehensive review of appearance-based gaze estimation methods with deep learning. We summarize the processing pipeline and discuss these methods from four perspectives: deep feature extraction, deep neural network architecture design, personal calibration, and device and platform. Since data pre-processing and post-processing methods are crucial for gaze estimation, we also survey face/eye detection methods, data rectification methods, 2D/3D gaze conversion methods, and gaze origin conversion methods. To fairly compare the performance of various gaze estimation approaches, we characterize all the publicly available gaze estimation datasets and collect the code of typical gaze estimation algorithms. We implement these methods and set up a benchmark that converts the results of different methods into the same evaluation metrics. This paper serves not only as a reference for developing deep learning-based gaze estimation methods but also as a guideline for future gaze estimation research. Implemented methods and data processing codes are available at http://phi-ai.org/GazeHub.

“…These two rotations are aggregated into a gaze vector through a gaze transformation layer. Cheng et al [49] propose a coarse-to-fine gaze estimation method. They first use a CNN to extract facial features from face images and estimate a basic gaze direction, then they refine the basic gaze direction using eye features.…”

Section: A Deep Feature From Appearance (mentioning)

confidence: 99%
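
The coarse-to-fine idea in the statement above (a basic gaze direction from face features, then a refinement from eye features) can be sketched as a coarse estimate plus a residual. This is only an illustration under stated assumptions: gaze is represented as a (pitch, yaw) pair in radians, and both stages are plain linear maps. The function `coarse_to_fine_gaze`, the weights `w_face` and `w_eye`, and the feature sizes are hypothetical and do not reproduce the network in the cited paper.

```python
import numpy as np

def coarse_to_fine_gaze(face_feat, eye_feat, w_face, w_eye):
    """Return (coarse, refined) gaze, each a (pitch, yaw) pair in radians.

    w_face and w_eye are hypothetical linear heads used for illustration.
    """
    coarse = face_feat @ w_face      # basic gaze direction from face features
    residual = eye_feat @ w_eye      # correction predicted from eye features
    return coarse, coarse + residual

rng = np.random.default_rng(0)
face_feat = rng.normal(size=8)
eye_feat = rng.normal(size=4)
w_face = rng.normal(size=(8, 2)) * 0.1
w_eye = rng.normal(size=(4, 2)) * 0.01   # small scale: residuals are corrections
coarse, refined = coarse_to_fine_gaze(face_feat, eye_feat, w_face, w_eye)
```

Predicting a residual rather than a second full estimate keeps the eye branch focused on small corrections to the face-based prediction.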

Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark

Cheng, Wang, Bao et al. 2021

Preprint

Self Cite

“…Gaze prediction methods based on user input can be categorized as model-based and appearance-based approaches [142]. Model-based methods fit geometric eye models, detecting eye features using dedicated devices.…”

Section: Gaze Prediction (mentioning)

confidence: 99%

“…A sub-module of the asymmetric regression network (AR-Net) uses a new asymmetric strategy to estimate both eyes' 3D gaze directions, and a sub-module of the evaluation network (E-Net) evaluates the two eyes' performance to adjust the strategy adaptively during the optimization process. Furthermore, Cheng et al [142] constructed a coarse-to-fine adaptive network named CA-Net. This architecture uses face images to estimate gaze direction, and then predicts corresponding residuals from eye images to refine gaze direction (see Fig.…”

Section: Gaze Prediction (mentioning)

confidence: 99%

VR content creation and exploration with deep learning: A survey

Wang, Lyu et al. 2020

Comp. Visual Media

Virtual reality (VR) offers an artificial, computer-generated simulation of a real-life environment. It originated in the 1960s and has evolved to provide increasing immersion, interactivity, imagination, and intelligence. Because deep learning systems are able to represent and compose information at various levels in a deep hierarchical fashion, they can build very powerful models which leverage large quantities of visual media data. The intelligence of VR methods and applications has been significantly boosted by recent developments in deep learning techniques. VR content creation and exploration relates to image and video analysis, synthesis, and editing, so deep learning methods such as fully convolutional networks and generative adversarial networks are widely employed, designed specifically to handle panoramic images and video and virtual 3D scenes. This article surveys recent research that uses such deep learning methods for VR content creation and exploration. It considers the problems involved and discusses possible future directions in this active and emerging research area.

Keywords: virtual reality; deep learning; neural networks; 360° image and video virtual content

PerimetryNet: A multiscale fine grained deep network for three‐dimensional eye gaze estimation using visual field analysis

Wang, Zhou et al. 2023

Computer Animation & Virtual

Three‐dimensional gaze estimation aims to reveal where a person is looking, which plays an important role in identifying users' points of interest in terms of direction, attention, and interactions. Appearance‐based gaze estimation methods can provide relatively unconstrained gaze tracking from commodity hardware. Inspired by the medical perimetry test, we propose a multiscale framework with a visual field analysis branch to improve estimation accuracy. The model is based on feature pyramids and predicts the visual field to help gaze estimation. In particular, we analyze the effect of the multiscale component and the visual field branch on challenging benchmark datasets: MPIIGaze and EYEDIAP. Based on these studies, our proposed PerimetryNet significantly outperforms state‐of‐the‐art methods. In addition, the multiscale mechanism and visual field branch can be easily applied to existing network architectures for gaze estimation. Related code will be available at the public repository https://github.com/gazeEs/PerimetryNet.
