In this paper we propose a method for end-to-end speech-driven video editing using a denoising diffusion model. Given a video of a person speaking, we aim to re-synchronise the lip and jaw motion of the person to a separate speech recording, without relying on intermediate structural representations such as facial landmarks or a 3D face model. We show this is possible by conditioning a denoising diffusion model on audio spectral features to generate synchronised facial motion. We obtain convincing results on the task of unstructured single-speaker video editing, achieving a word error rate of 45% using an off-the-shelf lip-reading model. We further demonstrate how our approach can be extended to the multi-speaker domain. To our knowledge, this is the first work to explore the feasibility of applying denoising diffusion models to the task of audio-driven video editing.
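To make the word error rate figure concrete, the following is a minimal sketch of how the metric is commonly computed: the word-level edit distance between a reference transcript and the lip-reading model's output, divided by the number of reference words. The function name `word_error_rate` and the lower-casing and whitespace tokenisation are assumptions for illustration; the abstract does not state the exact evaluation protocol used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: lower-casing and whitespace tokenisation are
    assumptions, not the evaluation protocol of the paper.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("the" -> "a") over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a word error rate of 45% means that roughly 45 word-level substitutions, deletions, or insertions are needed per 100 reference words to turn the lip-read transcript into the reference.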
"Smart Cities" are a viable response to various issues caused by accelerated urban growth. To make smart cities a reality, smart citizens need to be connected to the "Smart City" through a digital ID. A digital ID allows citizens to make easy and effective use of amenities provided by the smart city, such as healthcare, transport, finance, and energy. In this paper, we propose a proof of concept of a facial-authentication-based end-to-end digital ID system for a smart city. Facial authentication systems are prone to various biometric template attacks and cyber security attacks. Our proposed system is designed to detect the first type of attack, especially deepfake and presentation attacks. Users are authenticated each time before using facilities in the smart city. Authentication is performed on edge devices, making the process more secure as no data leaves the device during authentication. Facial data is stored in the cloud in a lookup-table format with an unidentifiable username. Our proposed solution achieved 97% authentication accuracy, with a False Rejection Ratio of 2% and a False Acceptance Ratio of 3%.
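For readers unfamiliar with the two error measures quoted above, the following is a minimal sketch of how false acceptance and false rejection ratios are typically computed from match scores at a fixed decision threshold. The function name `far_frr`, the score values, and the threshold are hypothetical; the abstract does not describe the scoring or thresholding used by the proposed system.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (false acceptance ratio, false rejection ratio).

    Illustrative assumption: a higher score means a better match, and an
    attempt is accepted when its score is >= threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    # Genuine users wrongly rejected.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    # Impostors (including spoofed attempts) wrongly accepted.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)

    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


if __name__ == "__main__":
    # Hypothetical scores, not data from the paper.
    genuine = [0.9, 0.8, 0.4, 0.95]
    impostor = [0.1, 0.6, 0.3, 0.2]
    print(far_frr(genuine, impostor, threshold=0.5))
```

Raising the threshold lowers the false acceptance ratio at the cost of a higher false rejection ratio, so the two figures are usually reported together at a single operating point.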