2022
DOI: 10.1109/tcsvt.2021.3057457

Appearance Matters, So Does Audio: Revealing the Hidden Face via Cross-Modality Transfer

citations
Cited by 25 publications
(4 citation statements)
references
References 65 publications
“…Plus, Kong et al (2021) suggested a cross‐modality approach that uses visual and auditory information to discover the hidden face behind the deepfake material. Every recovered audio segment is matched with all false faces retrieved from the related deepfake video, and each fake face is coupled with one matching ground truth face.…”
Section: Deep Fake Detection Mechanisms
Type: mentioning
confidence: 99%
“…Dai et al [63] propose attentional local contrastive learning to capture local forgery information. Recent works [64]-[66] propose to learn consistency across different modalities, which boosts detectors to capture abundant forgery clues. However, fine-grained features for general face forgery detection have been largely underexplored.…”
Section: B Face Forgery Detection Via Representation Learning
Type: mentioning
confidence: 99%
“…This question has recently aroused much research interest [2,17,20,26,27,36,38,39,41,45], which we refer to as voice-face association learning (VFAL). This emerging research field has several promising applications, such as audio-visual speaker recognition [35], criminal profiling, speaker tracking [26], and deepfake video detection [22].…”
Section: Introduction
Type: mentioning
confidence: 99%