Face-Focused Cross-Stream Network for Deception Detection in Videos

Ding, Mingyu; Zhao, An; Lu, Zhiwu; Xiang, Tao; Wen, Ji-Rong

doi:10.1109/cvpr.2019.00799

Cited by 40 publications

(28 citation statements)

References 53 publications

(129 reference statements)


“…Our unsupervised visual SA model performed comparably to existing fully-supervised, automated approaches that used the same dataset (ACC 75% [11], 77% [13], 79% [12]; AUC 70% [10]). While some prior fully-supervised, automated approaches [21,15,20] outperformed our unsupervised SA, our findings support the potential for introducing unsupervised SA to address the data scarcity problem of modeling high-stakes deception. Our results support our hypothesis that audio-visual representations of low-stakes deception in lab-controlled situations can be leveraged by SA to detect high-stakes deception in real-world situations.…”

Section: Results (supporting)

confidence: 60%

Unsupervised Audio-Visual Subspace Alignment for High-Stakes Deception Detection

Mathur

Matarić

2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


Automated systems that detect deception in high-stakes situations can enhance societal well-being across medical, social work, and legal domains. Existing models for detecting high-stakes deception in videos have been supervised, but labeled datasets to train models can rarely be collected for most real-world applications. To address this problem, we propose the first multimodal unsupervised transfer learning approach that detects real-world, high-stakes deception in videos without using high-stakes labels. Our subspace-alignment (SA) approach adapts audio-visual representations of deception in lab-controlled low-stakes scenarios to detect deception in real-world, high-stakes situations. Our best unsupervised SA models outperform models without SA, outperform human ability, and perform comparably to a number of existing supervised models. Our research demonstrates the potential for introducing subspace-based transfer learning to model high-stakes deception and other social behaviors in real-world contexts with a scarcity of labeled behavioral data.
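The subspace-alignment idea described in this abstract can be sketched as follows. This is a minimal illustration that assumes the standard PCA-based formulation of subspace alignment (each domain is projected onto its own top-d principal directions, and the source basis is then mapped onto the target basis). The function names, the dimension `d`, and the random toy data are hypothetical; the paper's actual audio-visual features and classifier are not specified here.

```python
import numpy as np


def pca_basis(X, d):
    """Return the top-d principal directions of X as columns (n_features x d)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T


def subspace_align(source, target, d):
    """Project source and target into an aligned d-dimensional space.

    Assumed formulation: M = Ps^T Pt maps the source basis onto the target
    basis, so source samples become Xs Ps M and target samples become Xt Pt.
    """
    Ps = pca_basis(source, d)
    Pt = pca_basis(target, d)
    M = Ps.T @ Pt  # d x d alignment matrix
    Zs = (source - source.mean(axis=0)) @ Ps @ M
    Zt = (target - target.mean(axis=0)) @ Pt
    return Zs, Zt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_stakes = rng.normal(size=(200, 10))         # toy labeled source features
    high_stakes = rng.normal(size=(50, 10)) + 0.5   # toy unlabeled target features
    Zs, Zt = subspace_align(low_stakes, high_stakes, d=3)
    print(Zs.shape, Zt.shape)
```

Under these assumptions, a classifier fitted on `Zs` with the low-stakes labels could then be applied to `Zt` without using any high-stakes labels.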



“…In addition, Soldner et al. [32] introduced dialogue features, consisting of interaction cues. Other multi-modal approaches combined the previously mentioned verbal and non-verbal features together with micro-expressions [3][4][5], thermal imaging [33], or spatio-temporal features extracted from 3D CNNs [34,35].…”

Section: Related Work (mentioning)

confidence: 99%

“…In the last decade, there has been a growing interest in the use of facial images to perform lie detection, often based on micro-expressions [3][4][5]13,15] or facial action units [14], achieving the current state-of-the-art accuracy. Table 1 below shows an overview of the major related works outlined in this section.…”

Section: Related Work (mentioning)

confidence: 99%

“…It is considered hard for humans to detect when someone is lying. Ekman [1] highlights five reasons to explain why it is so difficult for us: (1) during most of human history, there were smaller societies in which liars would have had more chances of being caught with worse consequences than nowadays; (2) children are not taught how to detect lies since even their parents want to hide some things from them; (3) people prefer to trust in what they are told; (4) people prefer not to know the real truth; and (5) people are taught to be polite and not steal information that is not given. However, it has been argued that it is possible for someone to learn how to detect lies in another person given sufficient feedback (e.g., that 50% of the time, that person is lying) and focusing on micro-expressions [1,2].…”

Section: Introduction (mentioning)

confidence: 99%

“…Building from the above, the detection of deceptive behavior using facial analysis has been proved feasible using macro- and, especially, micro-expressions [3][4][5]. However, micro-expressions are difficult to capture at standard frame rates and, given that humans can learn how to spot them to perform lie detection, the same training might be used by liars to learn how to hide them.…”

Section: Introduction (mentioning)

confidence: 99%


Machine Learning-Based Lie Detector Applied to a Novel Annotated Game Dataset

et al. 2021


Lie detection is considered a concern for everyone in their day-to-day life, given its impact on human interactions. Thus, people normally pay attention both to what their interlocutors are saying and to their visual appearance, including the face, to find any signs that indicate whether or not the person is telling the truth. While automatic lie detection may help us to understand these lying characteristics, current systems are still fairly limited, partly due to the lack of adequate datasets to evaluate their performance in realistic scenarios. In this work, we collect an annotated dataset of facial images, comprising both 2D and 3D information of several participants during a card game that encourages players to lie. Using our collected dataset, we evaluate several types of machine learning-based lie detectors in terms of their generalization, in person-specific and cross-application experiments. We first extract both handcrafted and deep learning-based features as relevant visual inputs, then pass them into multiple types of classifier to predict the respective lie/non-lie labels. Subsequently, we use several metrics to judge the models' accuracy based on the models' predictions and the ground truth. In our experiments, we show that models based on deep learning achieve the highest accuracy, reaching up to 57% for the generalization task and 63% when applied to detect the lies of a single participant. We further highlight the limitations of the deep learning-based lie detector when dealing with cross-application lie detection tasks. Finally, this analysis, along with the proposed dataset, could be useful not only from the perspective of computational systems (e.g., improving current automatic lie prediction accuracy), but also for other relevant application fields, such as health practitioners in general medical counseling, education in academic settings, or finance in the banking sector, where close inspection and understanding of the actual intentions of individuals can be very important.
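The distinction this abstract draws between a generalization task and a person-specific task can be illustrated with a small evaluation sketch. It assumes that generalization is measured by holding out every sample of one participant at a time; the abstract does not state the exact protocol, and the helper names `leave_one_participant_out` and `accuracy` are hypothetical.

```python
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx), one fold per participant.

    All samples of the held-out participant go to the test set, so a model
    is always scored on a person it has never seen (assumed protocol).
    """
    by_participant = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)
    for held_out in sorted(by_participant):
        test_idx = by_participant[held_out]
        train_idx = [
            i
            for pid, indices in by_participant.items()
            if pid != held_out
            for i in indices
        ]
        yield held_out, train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of lie/non-lie predictions that match the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    ids = ["p1", "p1", "p2", "p3", "p3"]  # toy participant labels
    for held_out, train_idx, test_idx in leave_one_participant_out(ids):
        print(held_out, train_idx, test_idx)
```

A person-specific experiment would instead split the samples of a single participant into training and test portions, which is one plausible reason the reported person-specific accuracy is higher than the generalization accuracy.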
