2023
DOI: 10.3389/fdata.2022.1001063

Audio deepfakes: A survey

Abstract: A deepfake is content or material that is synthetically generated or manipulated using artificial intelligence (AI) methods, to be passed off as real and can include audio, video, image, and text synthesis. The key difference between manual editing and deepfakes is that deepfakes are AI generated or AI manipulated and closely resemble authentic artifacts. In some cases, deepfakes can be fabricated using AI-generated content in its entirety. Deepfakes have started to have a major impact on society with more gen…

Cited by 17 publications
(4 citation statements)
References 117 publications
“…Challenges and threats posed by audio deepfakes. While voice cloning and audio deepfake technologies hold significant promise for a range of professional applications, from automated news reporting to personalized content creation, they are beset with challenges that span ethical considerations, computational demands, linguistic diversity, and the intricacies of human speech (Khanjani et al, 2023).…”
Section: Continuation Of the Table (mentioning)
confidence: 99%
“…TTS systems often struggle with homographs (words spelled the same but with different meanings), leading to incorrect pronunciations in context. Furthermore, recognizing periods, special characters, and the nuances of human speech elements like breathing, laughter, and pauses remains challenging, deviating from synthesized speech's human-like quality (Khanjani et al, 2023).…”
Section: Continuation Of the Table (mentioning)
confidence: 99%
“…Deepfake refers to synthetic information or materials that have been developed or altered using artificial intelligence (AI) technologies, and are intended to be considered authentic. These may include audio, video, picture, and text synthesis [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…Firstly, there's Faceswap, widely popularized by Snapchat, allowing users to modify facial features in photographs for playful transformations [1]. Secondly, Synthesis techniques, powered by generative adversarial networks (GANs), have revolutionized image creation, with models like NVIDIA 112 generating countless variations of images [26].…”
Section: Introduction (mentioning)
confidence: 99%