Although not commonly used, correlation filters can track complex objects through rotations, occlusions and other distractions at over 20 times the rate of current stateof-the-art techniques. The oldest and simplest correlation filters use simple templates and generally fail when applied to tracking. More modern approaches such as ASEF and UMACE perform better, but their training needs are poorly suited to tracking. Visual tracking requires robust filters to be trained from a single frame and dynamically adapted as the appearance of the target object changes. This paper presents a new type of correlation filter, a Minimum Output Sum of Squared Error (MOSSE) filter, which produces stable correlation filters when initialized using a single frame. A tracker based upon MOSSE filters is robust to variations in lighting, scale, pose, and nonrigid deformations while operating at 669 frames per second. Occlusion is detected based upon the peak-to-sidelobe ratio, which enables the tracker to pause and resume where it left off when the object reappears.
Inexpensive "point-and-shoot" camera technology has combined with social network technology to give the general population a motivation to use face recognition technology. Users expect a lot; they want to snap pictures, shoot videos, upload, and have their friends, family and acquaintances more-or-less automatically recognized. Despite the apparent simplicity of the problem, face recognition in this context is hard. Roughly speaking, in these scenarios algorithms fail to correctly recognize people as often or even more often than they succeed. In contrast, existing algorithms have become very reliable for well controlled imagery with recognition error rates down in the 1 in 1,000 range. To spur advancement in face and person recognition, this paper introduces the Point and Shoot Face Recognition Challenge (PaSC). The challenge includes 9,376 still images of 293 people balanced with respect to distance to the camera, alternative sensors, frontal versus not-frontal views, and varying location. There are also 2,802 videos for 265 people: a subset of the 293. Verification results are presented for public baseline algorithms and a commercial algorithm for three cases: comparing still images to still images, videos to videos, and still images to videos.
The Good, the Bad, & the Ugly Face Challenge Problem was created to encourage the development of algorithms that are robust to recognition across changes that occur in still frontal faces. The Good, the Bad, & the Ugly consists of three partitions. The Good partition contains pairs of images that are considered easy to recognize. On the Good partition, the base verification rate (VR) is 0.98 at a false accept rate (FAR) of 0.001. The Bad partition contains pairs of images of average difficulty to recognize. For the Bad partition, the VR is 0.80 at a FAR of 0.001. The Ugly partition contains pairs of images considered difficult to recognize, with a VR of 0.15 at a FAR of 0.001. The base performance is from fusing the output of three of the top performers in the FRVT 2006. The design of the Good, the Bad, & the Ugly controls for pose variation, subject aging, and subject "recognizability." Subject recognizability is controlled by having the same number of images of each subject in every partition. This implies that the differences in performance among the partitions are result of how a face is presented in each image.
Abstract. The CSU Face Identification Evaluation System provides standard face recognition algorithms and standard statistical methods for comparing face recognition algorithms. The system includes standardized image pre-processing software, three distinct face recognition algorithms, analysis software to study algorithm performance, and Unix shell scripts to run standard experiments. All code is written in ANSI C. The preprocessing code replicates feature of preprocessing used in the FERET evaluations. The three algorithms provided are Principle Components Analysis (PCA), a.k.a Eigenfaces, a combined Principle Components Analysis and Linear Discriminant Analysis algorithm (PCA+LDA), and a Bayesian Intrapersonal/Extrapersonal Classifier (BIC). The PCA+LDA and BIC algorithms are based upon algorithms used in the FERET study contributed by the University of Maryland and MIT respectively. There are two analysis. The first takes as input a set of probe images, a set of gallery images, and similarity matrix produced by one of the three algorithms. It generates a Cumulative Match Curve of recognition rate versus recognition rank. The second analysis tool generates a sample probability distribution for recognition rate at recognition rank 1, 2, etc. It takes as input multiple images per subject, and uses Monte Carlo sampling in the space of possible probe and gallery choices. This procedure will, among other things, add standard error bars to a Cumulative Match Curve. The System is available through our website and we hope it will be used by others to rigorously compare novel face identification algorithms to standard algorithms using a common implementation and known comparison techniques.
This paper introduces a class of correlation filters called Average of Synthetic Exact Filters (ASEF). For ASEF, the correlation output is completely specified for each training image. This is in marked contrast to prior methods such as Synthetic Discriminant Functions (SDFs) which only specify a single output value per training image. Advantages of ASEF training include: insenitivity to over-fitting, greater flexibility with regard to training images, and more robust behavior in the presence of structured backgrounds. The theory and design of ASEF filters is presented using eye localization on the FERET database as an example task. ASEF is compared to other popular correlation filters including SDF, MACE, OTF, and UMACE, and with other eye localization methods including Gabor Jets and the OpenCV Cascade Classifier. ASEF is shown to outperform all these methods, locating the eye to within the radius of the iris approximately 98.5% of the time.
The goal of the Multiple Biometrics Grand Challenge (MBGC) is to improve the performance of face and iris recognition technology from biometric samples acquired under unconstrained conditions. The MBGC is organized into three challenge problems. Each challenge problem relaxes the acquisition constraints in different directions. In the Portal Challenge Problem, the goal is to recognize people from nearinfrared (NIR) and high definition (HD) video as they walk through a portal. Iris recognition can be performed from the NIR video and face recognition from the HD video. The availability of NIR and HD modalities allows for the development of fusion algorithms. The Still Face Challenge Problem has two primary goals. The first is to improve recognition performance from frontal and off angle still face images taken under uncontrolled indoor and outdoor lighting. The second is to improve recognition performance on still frontal face images that have been resized and compressed, as is required for electronic passports. In the Video Challenge Problem, the goal is to recognize people from video in unconstrained environments. The video is unconstrained in pose, illumination, and camera angle. All three challenge problems include a large data set, experiment descriptions, ground truth, and scoring code.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.