2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.01477

Exploring Temporal Coherence for More General Video Face Forgery Detection

Cited by 131 publications (48 citation statements)
References 39 publications
“…Table 1 shows results obtained by RealForensics on each manipulation type in the FF++ dataset after training on the remaining types. Our detector works on par with the state-of-the-art without (1) using auxiliary labelled supervision [52], (2) heavily constraining the network by freezing large parts [52] or removing spatial convolutions [112], nor (3) using audio at test-time [115]. We also outperform the baseline of training a CSN [98] network on the forgery data (with the same augmentations as RealForensics), indicating the effectiveness of leveraging real data using our approach.…”
Section: Cross-manipulation Generalisation
Citation type: mentioning
confidence: 88%
“…Unlike our method, it requires a large-scale labelled dataset and focuses exclusively on the mouth region. Very recently, [112] report high generalisation by reducing the spatial kernel sizes of convolutional layers to 1, thus learning temporal inconsistencies while ignoring spatial ones. By contrast, we target spatiotemporal irregularities that may be more consistent with human perception of forgery cues.…”
Section: Face Forgery Detection
Citation type: mentioning
confidence: 99%