Forensic facial identification examiners are required to match the identity of faces in images that vary substantially, owing to changes in viewing conditions and in a person's appearance. These identifications affect the course and outcome of criminal investigations and convictions. Despite calls for research on sources of human error in forensic examination, existing scientific knowledge of face matching accuracy is based, almost exclusively, on people without formal training. Here, we administered three challenging face matching tests to a group of forensic examiners with many years' experience of comparing face images for law enforcement and government agencies. Examiners outperformed untrained participants and computer algorithms, thereby providing the first evidence that these examiners are experts at this task. Notably, computationally fusing responses of multiple experts produced near-perfect performance. Results also revealed qualitative differences between expert and non-expert performance. First, examiners' superiority was greatest at longer exposure durations, suggestive of more entailed comparison in forensic examiners. Second, experts were less impaired by image inversion than non-expert students, contrasting with face memory studies that show larger face inversion effects in high performers. We conclude that expertise in matching identity across unfamiliar face images is supported by processes that differ qualitatively from those supporting memory for individual faces.
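The "computationally fusing responses" mentioned above can be illustrated with a minimal sketch. The function and all numbers below are illustrative assumptions, not data or code from the study: each simulated examiner rates a face-image pair on a certainty scale (negative = "different identities", positive = "same identity"), and fusion simply averages the ratings item by item, which tends to cancel individual errors.

```python
# Hypothetical sketch of response fusion across examiners.
# Each examiner rates each face pair on a scale from -3 ("sure
# different") to +3 ("sure same"). Fusing averages the ratings
# per item; illustrative numbers only, not data from the study.

def fuse_ratings(ratings_per_examiner):
    """Average per-item ratings across several examiners.

    ratings_per_examiner: list of lists, one inner list per
    examiner, aligned by item. Returns one fused rating per item.
    """
    n = len(ratings_per_examiner)
    return [sum(item) / n for item in zip(*ratings_per_examiner)]

# Three examiners judging four pairs (first two are same-identity).
examiner_ratings = [
    [2, 3, -1, -3],
    [3, 1, -2, -2],
    [1, 2, -3, -3],
]
fused = fuse_ratings(examiner_ratings)
# The fused ratings separate same- from different-identity pairs
# more cleanly than any single examiner's ratings.
```

A same/different decision can then be read off the sign of each fused rating.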
Significance: This study measures face identification accuracy for an international group of professional forensic facial examiners working under circumstances that apply in real-world casework. Examiners and other human face “specialists,” including forensically trained facial reviewers and untrained super-recognizers, were more accurate than the control groups on a challenging test of face identification. Therefore, specialists are the best available human solution to the problem of face identification. We present data comparing state-of-the-art face recognition technology with the best human face identifiers. The best machine performed in the range of the best humans: professional facial examiners. However, optimal face identification was achieved only when humans and machines worked in collaboration.
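One simple way humans and machines can "work in collaboration" is score fusion. The sketch below is an assumed illustration, not the study's actual procedure: a human examiner's certainty ratings and an algorithm's similarity scores are put on a common scale by z-scoring each, then averaged per item.

```python
# Illustrative human-machine fusion by z-scoring and averaging.
# All names and numbers are assumptions for illustration only.
from statistics import mean, stdev

def zscore(xs):
    """Standardize a list of scores to zero mean, unit (sample) SD."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

def fuse_human_machine(human_ratings, machine_scores):
    """Average z-scored human ratings and machine similarity scores."""
    return [(h, a) and (h + a) / 2
            for h, a in zip(zscore(human_ratings), zscore(machine_scores))]

human = [3, 2, -1, -3]               # examiner certainty ratings
machine = [0.91, 0.78, 0.40, 0.12]   # algorithm similarity scores
fused = fuse_human_machine(human, machine)
# Items where both sources agree end up with the most extreme
# fused scores, which is where the collaboration gain comes from.
```

Z-scoring matters because the two sources use incommensurate scales; averaging raw values would let whichever scale has the larger range dominate.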
We introduce four principles for explainable artificial intelligence (AI) that comprise fundamental properties for explainable AI systems. We propose that explainable AI systems deliver accompanying evidence or reasons for outcomes and processes; provide explanations that are understandable to individual users; provide explanations that correctly reflect the system's process for generating the output; and operate only under conditions for which they were designed and when they reach sufficient confidence in their output. We term these four principles explanation, meaningful, explanation accuracy, and knowledge limits, respectively. Through significant stakeholder engagement, these four principles were developed to encompass the multidisciplinary nature of explainable AI, including the fields of computer science, engineering, and psychology. Because one-size-fits-all explanations do not exist, different users will require different types of explanations. We present five categories of explanation and summarize theories of explainable AI. We give an overview of the algorithms in the field that cover the major classes of explainable algorithms. As a baseline comparison, we assess how well explanations provided by people follow our four principles. This assessment provides insights into the challenges of designing explainable AI systems.
Person recognition often unfolds over time and distance as a person approaches, with the quality of identity information from faces, bodies, and motion in constant flux. Participants were familiarized with identities using close-up and distant videos. Recognition was tested with videos of people approaching from a distance. We varied the timing of prompted responses in the test videos, the amount of video seen, and whether the face, body, or whole person was visible. A free response condition was also included to allow participants to respond when they felt 'confident'. The pattern of accuracy across conditions indicated that recognition judgments were based on the most recently available information, with no contribution from qualitatively diverse and statistically useful person cues available earlier in the video. Body recognition was stable across viewing distance, whereas face recognition improved with proximity. The body made an independent contribution to recognition only at the farthest distance tested. Free response latencies indicated meta-knowledge of the optimal proximity for recognition from faces versus bodies. Notably, response bias varied strongly as a function of participants' expectation about whether closer proximity video was forthcoming. These findings lay the groundwork for developing person recognition theories that generalize to natural viewing environments.
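The abstract above distinguishes accuracy from response bias. The standard way to separate the two in recognition data is signal detection theory; the sketch below is an assumed illustration (not the paper's analysis code) computing sensitivity d' and criterion c from hit and false-alarm rates.

```python
# Minimal signal-detection sketch: d' measures how well a
# participant separates known from unknown people; criterion c
# measures response bias (c > 0 = conservative, fewer "known"
# responses). Illustrative only, not the paper's analysis code.
from statistics import NormalDist

def d_prime_and_criterion(hit_rate, fa_rate):
    """Return (d', c) from hit and false-alarm rates (0 < rate < 1)."""
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

dp, c = d_prime_and_criterion(0.85, 0.20)
# A shift in c with unchanged d' would indicate a change in
# participants' willingness to respond, not in their sensitivity.
```

This decomposition is what lets one say that expectations about upcoming close-up video shifted bias rather than sensitivity.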
Face identification is more accurate when people collaborate in social dyads than when they work alone (Dowsett & Burton, 2015, Br. J. Psychol., 106, 433). Identification accuracy is also increased when the responses of two people are averaged for each item to create a 'non-social' dyad (White, Burton, Kemp, & Jenkins, 2013, Appl. Cogn. Psychol., 27, 769; White et al., 2015, Proc. R. Soc. B Biol. Sci., 282, 20151292). Does social collaboration add to the benefits of response averaging for face identification? We compared individuals, social dyads, and non-social dyads on an unfamiliar face identity-matching test. We also simulated non-social collaborations for larger groups of people. Individuals and social dyads judged whether face image pairs depicted the same or different identities, responding on a 5-point certainty scale. Non-social dyads were constructed by averaging the responses of paired individuals. Both social and non-social dyads were more accurate than individuals. There was no advantage for social over non-social dyads. For larger non-social groups, performance peaked at near perfection with a crowd size of eight participants. We tested three computational models of social collaboration and found that social dyad performance was predicted by the decision of the more accurate partner. We conclude that social interaction does not bolster accuracy for unfamiliar face identity matching in dyads beyond what can be achieved by averaging judgements.
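The non-social crowd simulation described above can be sketched as a toy model. Everything below is an assumption for illustration (noise model, parameters, and scale), not the study's code: each simulated participant rates a pair on a 5-point certainty scale, ratings are averaged across a crowd of size k, and the sign of the average gives the same/different decision.

```python
# Toy simulation of 'non-social' crowds of size k. Each simulated
# participant rates a face pair on a 5-point scale (-2 = "sure
# different" ... +2 = "sure same") with Gaussian noise around the
# true answer. Assumed noise model and numbers, illustration only.
import random

def simulate_accuracy(crowd_size, n_items=2000, noise=2.0, seed=0):
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_items):
        truth = rng.choice([-1, 1])  # different vs same identity
        ratings = [
            max(-2, min(2, round(truth + rng.gauss(0, noise))))
            for _ in range(crowd_size)
        ]
        fused = sum(ratings) / crowd_size
        if fused * truth > 0:        # sign of the average is the decision
            correct += 1
    return correct / n_items

acc_1 = simulate_accuracy(1)
acc_8 = simulate_accuracy(8)
# Accuracy rises with crowd size as independent errors average out
# (exact values depend on the toy noise model).
```

The qualitative pattern, accuracy climbing steeply with the first few added judges and then flattening, is the statistical mechanism behind the crowd-of-eight result, though the real data come from human ratings rather than a Gaussian noise model.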