Recent advances in visual media technology have led to new tools for processing and, above all, generating multimedia content. In particular, modern AI-based technologies have provided easy-to-use tools for creating extremely realistic manipulated videos. Such synthetic videos, known as Deep Fakes, pose a serious threat: they can be used to attack the reputation of public figures or to sway public opinion about a given event. Being able to detect this kind of fake information is therefore fundamental. In this work, a new forensic technique able to distinguish between fake and original video sequences is presented; unlike other state-of-the-art methods, which rely on single video frames, we propose the adoption of optical flow fields to exploit possible inter-frame dissimilarities. This clue is then used as a feature to be learned by CNN classifiers. Preliminary results obtained on the FaceForensics++ dataset are very promising.
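To make the notion of an inter-frame optical flow cue concrete, the following is a minimal sketch of how a displacement between two consecutive frames can be estimated. The abstract does not state which optical flow estimator is used, so this sketch assumes the classical Lucas-Kanade least-squares formulation on a single window; the function names `make_frame` and `lucas_kanade` and the synthetic test pattern are hypothetical and serve only as illustration.

```python
import math


def make_frame(size, dx=0.0, dy=0.0):
    """Smooth synthetic frame (hypothetical test pattern).

    The pattern is shifted by (dx, dy) pixels along x and y.
    """
    return [[math.sin(0.3 * (x - dx)) + math.cos(0.2 * (y - dy))
             for x in range(size)]
            for y in range(size)]


def lucas_kanade(prev, curr):
    """Estimate a single (u, v) displacement for the whole window.

    Solves the brightness-constancy equation Ix*u + Iy*v + It = 0
    in the least-squares sense over all interior pixels.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, len(prev) - 1):
        for x in range(1, len(prev[0]) - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # spatial gradient in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return 0.0, 0.0  # flat or one-dimensional texture: flow is undetermined
    # Solve the 2x2 normal equations [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


if __name__ == "__main__":
    frame_a = make_frame(20)
    frame_b = make_frame(20, dx=0.2, dy=-0.1)
    print(lucas_kanade(frame_a, frame_b))
```

Repeating such an estimate over many small windows would give a dense flow field, which could then be stored as a two-channel image and passed to a CNN classifier; that last step is an assumption about the pipeline and is not shown here.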
Abstract: Determining information about the origin of a digital image is a basic issue in multimedia forensics. In particular, it can be useful to identify the specific camera (brand and/or model) that took a given photo; doing so requires additional knowledge about the camera, such as its fingerprint, usually computed by extracting the PRNU (Photo Response Non-Uniformity) noise from a group of images taken by that camera. In many application scenarios, however, the available information is very limited; this is the case when, given a set of N images, we want to establish whether they come from M different cameras, where M is less than or equal to N, without any knowledge of the source cameras. In this paper, a new technique for blindly clustering a given set of N digital images is presented. The technique builds on an existing one [1] and improves on it in terms of both error probability and computational efficiency. The system is able to group photos quickly and without supervision, with no initial information about their membership. The sensor pattern noise is extracted from each image as a reference, and the subsequent classification is performed by means of a hierarchical clustering procedure. Experiments have been carried out to verify theoretical expectations and to demonstrate the improvements over the earlier technique. Tests have also been run under different operating conditions (e.g. asymmetric distribution of the images across clusters), with satisfactory results.
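The blind grouping step can be illustrated with a minimal sketch of hierarchical clustering over noise residuals. Several things here are assumptions rather than details from the abstract: the residuals are taken as already extracted and flattened, similarity is measured by normalized correlation, clusters are merged with average linkage, and merging stops at a fixed threshold. The names `ncc`, `cluster_residuals` and `synthetic_residuals` are hypothetical, and the data are synthetic.

```python
import math
import random


def ncc(a, b):
    """Normalized correlation between two equal-length residual vectors."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) *
                    sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0


def cluster_residuals(residuals, threshold):
    """Agglomerative clustering with average linkage (assumed choice).

    Starts from one cluster per image and repeatedly merges the two most
    similar clusters until the best similarity falls below `threshold`.
    """
    n = len(residuals)
    sim = [[ncc(residuals[i], residuals[j]) for j in range(n)]
           for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best, pair = -2.0, None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[p] for j in clusters[q])
                avg = total / (len(clusters[p]) * len(clusters[q]))
                if avg > best:
                    best, pair = avg, (p, q)
        if best < threshold:
            break  # remaining clusters are treated as different cameras
        p, q = pair
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


def synthetic_residuals(counts, length=1000, seed=0):
    """Hypothetical test data: one random fingerprint per camera plus noise."""
    rng = random.Random(seed)
    residuals = []
    for count in counts:
        fingerprint = [rng.gauss(0, 1) for _ in range(length)]
        for _ in range(count):
            residuals.append([f + rng.gauss(0, 1) for f in fingerprint])
    return residuals


if __name__ == "__main__":
    # Asymmetric case: 5, 3 and 2 images from three simulated cameras.
    data = synthetic_residuals([5, 3, 2])
    print(cluster_residuals(data, threshold=0.25))
```

The number of cameras M is not given to the procedure; it emerges from where the merging stops, so the threshold is the main parameter to tune. In this synthetic setup, residuals from the same camera correlate at roughly 0.5 and residuals from different cameras at roughly 0, which is why 0.25 is used; real sensor pattern noise would call for a different value.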