Remote eye tracking technology has suffered an increasing growth in recent years due to its applicability in many research areas. In this paper, a video-oculography method based on convolutional neural networks (CNNs) for pupil center detection over webcam images is proposed. As the first contribution of this work and in order to train the model, a pupil center manual labeling procedure of a facial landmark dataset has been performed. The model has been tested over both real and synthetic databases and outperforms state-of-the-art methods, achieving pupil center estimation errors below the size of a constricted pupil in more than 95% of the images, while reducing computing time by a 8 factor. Results show the importance of use high quality training data and well-known architectures to achieve an outstanding performance.
Theory shows that huge amount of labelled data are needed in order to achieve reliable classification/regression methods when using deep/machine learning techniques. However, in the eye tracking field, manual annotation is not a feasible option due to the wide variability to be covered. Hence, techniques devoted to synthesizing images show up as an opportunity to provide vast amounts of annotated data. Considering that the well-known UnityEyes tool provides a framework to generate single eye images and taking into account that both eyes information can contribute to improve gaze estimation accuracy we present U2Eyes dataset, that is publicly available. It comprehends about 6 million of synthetic images containing binocular data. Furthermore, the physiology of the eye model employed is improved, simplified dynamics of binocular vision are incorporated and more detailed 2D and 3D labelled data are provided. Additionally, an example of application of the dataset is shown as work in progress. Employing U2Eyes as training framework Supervised Descent Method (SDM) is used for eyelids segmentation. The model obtained as result of the training process is then applied on real images from GI4E dataset showing promising results.
In this paper, we evaluate a synthetic framework to be used in the field of gaze estimation employing deep learning techniques. The lack of sufficient annotated data could be overcome by the utilization of a synthetic evaluation framework as far as it resembles the behavior of a real scenario. In this work, we use U2Eyes synthetic environment employing I2Head datataset as real benchmark for comparison based on alternative training and testing strategies. The results obtained show comparable average behavior between both frameworks although significantly more robust and stable performance is retrieved by the synthetic images. Additionally, the potential of synthetically pretrained models in order to be applied in user's specific calibration strategies is shown with outstanding performances.
Subject calibration has been demonstrated to improve the accuracy in high-performance eye trackers. However, the true weight of calibration in off-the-shelf eye tracking solutions is still not addressed. In this work, a theoretical framework to measure the effects of calibration in deep learning-based gaze estimation is proposed for low-resolution systems. To this end, features extracted from the synthetic U2Eyes dataset are used in a fully connected network in order to isolate the effect of specific user’s features, such as kappa angles. Then, the impact of system calibration in a real setup employing I2Head dataset images is studied. The obtained results show accuracy improvements over 50%, probing that calibration is a key process also in low-resolution gaze estimation scenarios. Furthermore, we show that after calibration accuracy values close to those obtained by high-resolution systems, in the range of 0.7∘, could be theoretically obtained if a careful selection of image features was performed, demonstrating significant room for improvement for off-the-shelf eye tracking systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.