Kinship verification from facial appearance is a difficult problem. This paper explores the possibility of employing facial expression dynamics for this task. Using features that describe facial dynamics and spatio-temporal appearance over smile expressions, we show that it is possible to improve on the state of the art, and we verify that kinship can indeed be recognized from the resemblance of facial expressions. The proposed method is tested on different kin relationships. On average, 72.89% verification accuracy is achieved on spontaneous smiles.
Estimating the age of a human from captured images of his or her face is a challenging problem. In general, existing approaches to this problem use appearance features only. In this paper, we show that in addition to appearance information, facial dynamics can be leveraged for age estimation. We propose a method to extract and use dynamic features for age estimation from a person's smile. Our approach is tested on a large, gender-balanced database of 400 subjects with an age range between 8 and 76. In addition, we introduce a new database of posed disgust expressions with 324 subjects in the same age range, and evaluate the reliability of the proposed approach when used with another expression. State-of-the-art appearance-based age estimation methods from the literature are implemented as baselines. We demonstrate that for each of these methods, the addition of the proposed dynamic features results in a statistically significant improvement. We further propose a novel hierarchical age estimation architecture based on adaptive age grouping. We test our approach extensively, including an exploration of spontaneous versus posed smile dynamics, and gender-specific age estimation. We show that using spontaneity information reduces the mean absolute error by up to 21%, advancing the state of the art in facial age estimation.
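The hierarchical idea above (a coarse age-grouping stage followed by a group-specific estimator) can be sketched as follows. This is a minimal illustrative sketch only: the group boundaries, the toy per-group estimators, and the assumption that a single feature value tracks age are hypothetical stand-ins, not the paper's actual models.

```python
# Hypothetical sketch of two-stage (hierarchical) age estimation.
# Stage 1 assigns a sample to a coarse age group; stage 2 applies a
# group-specific estimator. All numbers here are toy stand-ins.

def assign_group(features, boundaries=(20, 45)):
    """Toy grouping rule: pretend the first feature correlates with age."""
    coarse = features[0]
    if coarse < boundaries[0]:
        return "young"
    if coarse < boundaries[1]:
        return "adult"
    return "senior"

# Toy per-group estimators: each refines the coarse value differently,
# standing in for regressors trained separately on each age group.
GROUP_ESTIMATORS = {
    "young": lambda f: 0.9 * f[0] + 2.0,
    "adult": lambda f: 1.0 * f[0] + 0.5,
    "senior": lambda f: 1.1 * f[0] - 3.0,
}

def estimate_age(features):
    """Two-stage estimate: pick the group, then apply its estimator."""
    group = assign_group(features)
    return GROUP_ESTIMATORS[group](features)

def mean_absolute_error(labeled_samples):
    """MAE over (features, true_age) pairs, the metric used in the abstract."""
    return sum(abs(estimate_age(f) - age) for f, age in labeled_samples) / len(labeled_samples)
```

In a real system, the grouping stage would be a trained classifier over appearance and dynamic features, and each group would have its own trained regressor; the structure of the two-stage call is what the sketch is meant to show.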
Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are especially important in tasks related to human behavior analysis, such as health care applications. Despite their importance, researchers have only recently begun to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision, with an emphasis on looking-at-people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, and proposed solutions, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.

Keywords: Explainable computer vision · First impressions · Personality analysis · Multimodal information · Algorithmic accountability

1 Introduction

Looking at People (LaP), the field of research focused on the visual analysis of human behavior, has been a very active research field within computer vision in the last decade [28,29,62]. Initially, LaP focused on tasks associated with basic human behaviors that were obviously visual (e.g., basic gesture recognition [71,70] or face recognition in restricted scenarios [10,83]). Research progress in LaP has now led to models that can solve those initial tasks relatively easily [66,82]. Attention has therefore turned to problems that are not visually evident to model or recognize [84,48,72]. For instance, consider the task of assessing personality traits from visual information [72].
Although there are methods that can estimate apparent personality traits with relatively acceptable performance, model recommendations by themselves are of little use if the end user is not confident in the model's reasoning, as the primary use for such estimation is to understand bias in human assessors.

Explainability and interpretability are thus critical features of decision support systems in some LaP tasks [26]. The former focuses on mechanisms that can explain the rationale behind the decision or recommendation made by the model.
This is an author-produced version of a paper published in: IEEE Transactions on Information Forensics and Security 4(4), 2009.

Abstract: Automatically verifying the identity of a person by means of biometrics (e.g., face and fingerprint) is an important application in day-to-day activities such as accessing banking services and security control at airports. To increase system reliability, several biometric devices are often used. Such a combined system is known as a multimodal biometric system. This paper reports a benchmarking study carried out within the framework of the BioSecure DS2 (Access Control) evaluation campaign organized by the University of Surrey, involving face, fingerprint, and iris biometrics for person authentication, targeting the application of physical access control in a medium-size establishment with some 500 persons. While multimodal biometrics is a well-investigated subject in the literature, there exists no benchmark for fusion algorithm comparison. Working towards this goal, we designed two sets of experiments: quality-dependent and cost-sensitive evaluation. The quality-dependent evaluation aims at assessing how well fusion algorithms can perform under changing quality of raw biometric images, principally due to a change of devices. The cost-sensitive evaluation, on the other hand, investigates how well a fusion algorithm can perform given restricted computation and in the presence of software and hardware failures, resulting in errors such as failure-to-acquire and failure-to-match. Since multiple capturing devices are available, a fusion algorithm should be able to handle this non-ideal but nevertheless realistic scenario. In both evaluations, each fusion algorithm is provided with scores from each biometric comparison subsystem as well as the quality measures of both the template and the query data.
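One simple way a fusion algorithm might combine subsystem match scores with their quality measures is a quality-weighted average. This is a minimal illustrative sketch under that assumption, not one of the fusion systems submitted to the campaign:

```python
def quality_weighted_fusion(scores, qualities):
    """Fuse per-subsystem match scores, weighting each score by the
    quality measure of the corresponding biometric sample.

    Falls back to a plain mean when all quality measures are zero,
    so a degenerate input still yields a usable fused score.
    """
    total_quality = sum(qualities)
    if total_quality == 0:
        return sum(scores) / len(scores)
    return sum(s * q for s, q in zip(scores, qualities)) / total_quality
```

The point of the sketch is the interface: the combiner receives both scores and quality measures, so a low-quality capture (e.g., from a cheaper device) contributes less to the fused decision.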
The response to the call of the evaluation campaign proved very encouraging, with the submission of 22 fusion systems. To the best of our knowledge, this campaign is the first attempt to benchmark quality-based multimodal fusion algorithms. In the presence of changing image quality, which may be due to a change of acquisition devices and/or device capturing configurations, we observe that the top-performing fusion algorithms are those that exploit automatically derived quality measures. Our evaluation also suggests that while using all the available biometric sensors can definitely increase fusion performance, this comes at the expense of increased cost in terms of acquisition time, computation time, the physical cost of hardware, and its maintenance. As demonstrated in our experiments, a promising solution that minimizes this composite cost is sequential fusion, where a fusion algorithm sequentially uses match scores until a desired confidence is reached, or until all the match scores are exhausted, before outputting the final combined score.

Index Terms: multimodal biometric authentication, biometric database, quality-...
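The sequential fusion strategy described above can be sketched as follows. The running-mean combiner, the symmetric confidence threshold, and the use of `None` to signal failure-to-acquire or failure-to-match are simplifying assumptions for illustration, not the evaluated algorithms:

```python
def sequential_fusion(score_sources, threshold=0.9):
    """Consult biometric subsystems one at a time, stopping early once the
    running fused score is confident enough in either direction.

    score_sources: callables that each return a match score in [0, 1],
                   or None on a failure-to-acquire / failure-to-match.
    Returns (fused_score, subsystems_used); fused_score is None if every
    source failed.
    """
    used = []
    for get_score in score_sources:
        score = get_score()
        if score is None:            # acquisition or matching failure: skip
            continue
        used.append(score)
        fused = sum(used) / len(used)
        # Stop as soon as we are confident of an accept (high fused score)
        # or a reject (low fused score), saving the cost of later sensors.
        if fused >= threshold or fused <= 1.0 - threshold:
            return fused, len(used)
    return (sum(used) / len(used) if used else None), len(used)
```

A confident first subsystem lets the algorithm skip the remaining, costlier acquisitions; only ambiguous cases pay for the full sensor suite, which is what makes the composite cost low.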