Despite the prevalence of smart TVs, many consumers continue to use conventional TVs with supplementary set-top boxes (STBs) because of the high cost of smart TVs. However, because the processing power of an STB is quite low, the smart TV functionalities that can be implemented in an STB are very limited. Consequently, little research has been conducted on face recognition for conventional TVs with supplementary STBs, even though many such studies have been conducted for smart TVs. In terms of camera sensors, previous face recognition systems have used high-resolution cameras, cameras with high-magnification zoom lenses, or camera systems with panning and tilting devices that allow face recognition from various positions. However, these cameras and devices cannot be used in intelligent TV environments because of size and cost limitations; only small, low-cost web-cameras are feasible. The resulting face recognition performance is degraded by the limited resolution and quality of the images. Therefore, we propose a new face recognition system for intelligent TVs that overcomes the limitations of a low-resource STB and a low-cost web-camera. We implement the face recognition system as a software algorithm that does not require special devices or cameras.
Our research has the following four novelties: first, candidate regions of a viewer's face are detected in an image captured by a camera connected to the STB using computationally light background subtraction and face-color filtering; second, the detected candidate face regions are transmitted to a server with high processing power, which detects the face regions accurately; third, in-plane rotations of the face regions are compensated based on the similarity between the left and right half sub-regions of each face region; fourth, various poses of the viewer's face are identified using five templates obtained during the initial user registration stage and multi-level local binary pattern matching. Experimental results indicate that the recall, precision, and genuine acceptance rate were approximately 95.7%, 96.2%, and 90.2%, respectively.
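The candidate-detection stage run on the STB can be sketched as follows. This is a minimal illustration only: the difference threshold and the RGB skin-color rule below are common stand-ins that we assume here, not the paper's actual filters.

```python
import numpy as np

def detect_candidate_regions(frame, background, diff_thresh=30):
    """Sketch of the low-cost candidate stage: background subtraction
    followed by a simple skin-color filter (assumed thresholds)."""
    # Background subtraction: keep pixels that differ from the background model.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    foreground = diff.max(axis=2) > diff_thresh

    # Crude RGB skin-color rule (an assumed stand-in for the paper's
    # face-color filtering): R > G > B with a minimum red level.
    r = frame[..., 0].astype(int)
    g = frame[..., 1].astype(int)
    b = frame[..., 2].astype(int)
    skin = (r > 95) & (r > g) & (g > b) & ((r - b) > 15)

    return foreground & skin  # binary mask of candidate face pixels
```

Connected components of the returned mask would then be cropped and transmitted to the server for accurate face detection.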
Recently, it has become necessary to evaluate the performance of display devices in terms of human factors. To meet this requirement, several studies have measured the eyestrain of users watching display devices. However, these studies were limited in that they did not consider precise human visual information. Therefore, a new eyestrain measurement method is proposed that considers a user's gaze direction and visual field of view on a liquid crystal display (LCD). Our study differs in the following four ways. First, a user's gaze position is estimated using an eyeglass-type eye-image capturing device. Second, we propose a new eye foveation model based on a wavelet transform that considers the gaze position and the gaze detection error of a user. Third, three video adjustment factors, namely variance of hue (VH), edge, and motion information, are extracted from the displayed images to which the eye foveation model is applied. Fourth, the relationship between eyestrain and the three video adjustment factors is investigated. Experimental results show that a decrease in the VH value of a display induces a decrease in eyestrain. In addition, increased edge and motion components induce a reduction in eyestrain.
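Two of the three video adjustment factors lend themselves to a short sketch. The definitions below (frame-wide hue variance, and mean gradient magnitude as the edge component) are plausible stand-ins that we assume, not the paper's exact formulas, and the foveation weighting is omitted.

```python
import numpy as np

def video_adjustment_factors(hue, gray):
    """Sketch of two of the three factors: variance of hue (VH) over
    the frame and an edge component from gradient magnitude. `hue` is
    a 2-D array of hue values and `gray` a 2-D luminance array."""
    vh = float(np.var(hue))  # variance of hue across the frame

    # Edge component: mean gradient magnitude (central differences).
    gy, gx = np.gradient(gray.astype(float))
    edge = float(np.mean(np.hypot(gx, gy)))
    return vh, edge
```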
We propose a new method for measuring the degree of eyestrain on 3D stereoscopic displays using a glasses-type eye tracking device. Our study is novel in the following four ways. First, the circular area in which a user's gaze position lies is defined based on the calculated gaze position and the gaze estimation error. Within this circular area, the position where the edge strength is maximized is detected, and we take this position as the gaze position with the highest probability of being correct. Based on this gaze point, the eye foveation model is defined. Second, we quantitatively evaluate the correlation between the degree of eyestrain and causal factors of visual fatigue, such as the degree of change of stereoscopic disparity (CSD), stereoscopic disparity (SD), frame cancellation effect (FCE), and edge component (EC) of the 3D stereoscopic display, using the eye foveation model. Third, by comparing the eyestrain in conventional 3D video and experimental 3D sample video, we analyze the characteristics of eyestrain according to various factors and types of 3D video. Fourth, by comparing the eyestrain with and without compensation for saccadic eye movements in 3D video, we analyze the characteristics of eyestrain according to the types of eye movements in 3D video. Experimental results show that the degree of CSD causes more eyestrain than the other factors.
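The edge-based gaze refinement described in the first point can be sketched as follows. The function and parameter names are our assumptions, and ties in edge strength are broken by `argmax` order.

```python
import numpy as np

def refine_gaze_point(edge_map, gaze_xy, radius):
    """Sketch: within the circle given by the gaze estimation error,
    pick the pixel of maximum edge strength as the refined gaze point."""
    h, w = edge_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    inside = (xs - gaze_xy[0]) ** 2 + (ys - gaze_xy[1]) ** 2 <= radius ** 2

    masked = np.where(inside, edge_map, -np.inf)  # exclude pixels outside
    y, x = np.unravel_index(np.argmax(masked), masked.shape)
    return x, y
```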
We propose a new remote gaze tracking system as an intelligent TV interface. Our research is novel in the following three ways. First, because a user can sit at various positions in front of a large display, the capture volume of the gaze tracking system must be large; the proposed system therefore includes two cameras that are moved simultaneously by panning and tilting mechanisms: a wide view camera (WVC) for detecting the eye position and an auto-focusing narrow view camera (NVC) for capturing enlarged eye images. Second, in order to avoid complicated calibration between the WVC and NVC and to enhance the capture speed of the NVC, the two cameras are combined in a parallel structure. Third, auto-focusing of the NVC is achieved on the basis of both the user's facial width in the WVC image and a focus score calculated on the eye image of the NVC. Experimental results showed that the proposed system operates with a gaze tracking accuracy of ±0.737°∼±0.775° and a speed of 5∼10 frames/s.
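A focus score of the kind mentioned in the third point can be sketched with a standard sharpness measure. The variance-of-Laplacian measure below is a common stand-in that we assume here, not necessarily the score used in this system.

```python
import numpy as np

def focus_score(eye_image):
    """Sketch of a focus score for the NVC eye image: variance of a
    discrete Laplacian. Sharp images have strong high-frequency
    content, so the score rises as the lens approaches focus."""
    img = eye_image.astype(float)
    # 4-neighbour discrete Laplacian computed via array shifts.
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(np.var(lap))
```

In use, the focus lens would be stepped until this score stops increasing.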
Gaze tracking determines what a user is looking at; the key challenge is to obtain well-focused eye images. This is not easy because the human eye is very small, whereas the resolution of the captured image should be large enough for accurate detection of the pupil center. In addition, capturing a user's eye image with a remote gaze tracking system within a large working volume at a long Z distance requires a panning/tilting mechanism with a zoom lens, which makes it more difficult to acquire focused eye images. To solve this problem, a new auto-focusing method for remote gaze tracking is proposed. The proposed approach is novel in the following four ways. First, it is the first study of an auto-focusing method for a remote gaze tracking system. Second, by using user-dependent calibration at the initial stage, we overcome a weakness of previous methods that estimate the Z distance between the user and the camera from the facial width in the captured image, namely that facial width varies from person to person. Third, the parameters of the modeled formula for estimating the Z distance are adaptively updated using the least squares regression method, so the focus becomes more accurate over time. Fourth, the relationship between the parameters and the facial width is fitted locally according to the Z distance instead of globally, which enhances the accuracy of the Z distance estimation. The results of an experiment with 10,000 images of 10 persons showed that the mean absolute error between the ground-truth Z distance measured by a Polhemus Patriot device and that estimated by the proposed method was 4.84 cm. A total of 95.61% of the images obtained by the proposed method were focused and could be used for gaze detection.
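The adaptively updated Z-distance model can be sketched as follows, assuming a simple pinhole-style relation Z ≈ a/w + b between facial width w (in pixels) and Z distance; the paper's actual model form, local fitting scheme, and units are not reproduced here.

```python
import numpy as np

class ZDistanceEstimator:
    """Sketch of an adaptive Z-distance model: the parameters of
    z = a * (1/w) + b are refit by least squares regression as
    calibration samples accumulate, so accuracy improves over time."""

    def __init__(self):
        self.widths, self.zs = [], []
        self.a, self.b = 0.0, 0.0

    def add_sample(self, width, z):
        self.widths.append(width)
        self.zs.append(z)
        if len(self.widths) >= 2:
            # Least squares fit of z = a * (1/w) + b.
            x = 1.0 / np.asarray(self.widths)
            A = np.column_stack([x, np.ones_like(x)])
            (self.a, self.b), *_ = np.linalg.lstsq(
                A, np.asarray(self.zs), rcond=None)

    def estimate(self, width):
        return self.a / width + self.b
```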
Conventional gaze tracking systems are limited when the user is wearing glasses, because the glasses usually produce noise due to reflections caused by the gaze tracker's lights. This makes it difficult to locate the pupil and the specular reflections (SRs) from the cornea of the user's eye. These difficulties increase the likelihood of gaze detection errors, because the gaze position is estimated from the location of the pupil center and the positions of the corneal SRs. In order to overcome these problems, we propose a new gaze tracking method that can be used by subjects who are wearing glasses. Our research is novel in the following four ways. First, we construct a new control device for the illuminator, which includes four illuminators positioned at the four corners of a monitor. Second, our system automatically determines whether a user is wearing glasses in the initial stage by counting the number of white pixels in an image captured with a low exposure setting on the camera. Third, if it is determined that the user is wearing glasses, the four illuminators are turned on and off sequentially in order to obtain an image with a minimal amount of noise from reflections off the glasses. As a result, it is possible to avoid the reflections and accurately locate the pupil center and the positions of the four corneal SRs. Fourth, by turning off one of the four illuminators, only three corneal SRs exist in the captured image. Since the proposed gaze detection method requires four corneal SRs for calculating the gaze position, the unseen SR position is estimated from the parallelogram shape defined by the three detected SR positions, and the gaze position is then calculated. Experimental results showed that the average gaze detection error for 20 persons was approximately 0.70°, and the processing time was 63.72 ms per frame.
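The parallelogram-based recovery of the unseen SR in the fourth point reduces to one vector identity. Assuming the three detected SRs are consecutive corners p1, p2, p3 of the parallelogram (the corner ordering is our assumption):

```python
import numpy as np

def estimate_missing_sr(p1, p2, p3):
    """Sketch: given three consecutive corners of a parallelogram,
    the fourth corner is p1 + p3 - p2, because the diagonals of a
    parallelogram bisect each other."""
    p1, p2, p3 = map(np.asarray, (p1, p2, p3))
    return p1 + p3 - p2
```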
Robust and accurate pupil detection is a prerequisite for gaze detection. Hence, we propose a new eye/pupil detection method for gaze detection on a large display. The novelty of our research can be summarized in the following four points. First, in order to overcome the performance limitations of conventional eye detection methods, such as the adaptive boosting (Adaboost) and continuously adaptive mean shift (CAMShift) algorithms, we propose adaptive selection between the Adaboost and CAMShift methods. Second, this adaptive selection is based on two parameters: pixel differences in successive images and matching values determined by CAMShift. Third, a support vector machine (SVM)-based classifier is used with these two parameters as the input, which improves the eye detection performance. Fourth, the center of the pupil within the detected eye region is accurately located by means of circular edge detection, binarization, and calculation of the geometric center. The experimental results show that the proposed method can detect the center of the pupil at a speed of approximately 19.4 frames/s with an RMS error of approximately 5.75 pixels, which is superior to the performance of conventional detection methods.
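The binarization-and-centroid step in the fourth point can be sketched as follows; the threshold value is an illustrative assumption, and the circular edge detection stage is omitted.

```python
import numpy as np

def pupil_center(eye_gray, thresh=60):
    """Sketch of the final localization step: binarize the eye image
    (the pupil is the darkest region) and take the geometric center
    (centroid) of the dark pixels."""
    dark = eye_gray < thresh  # pupil pixels are darker than iris/sclera
    ys, xs = np.nonzero(dark)
    if len(xs) == 0:
        return None  # no pupil candidate found
    return float(xs.mean()), float(ys.mean())
```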