The Speaker and Language Recognition Workshop (Odyssey 2018)
DOI: 10.21437/odyssey.2018-34
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data

Abstract: Thanks to the growing availability of spoofing databases and rapid advances in using them, systems for detecting voice spoofing attacks are becoming more and more capable, and error rates close to zero are being reached for the ASVspoof2015 database. However, speech synthesis and voice conversion paradigms that are not considered in the ASVspoof2015 database are appearing. Such examples include direct waveform modelling and generative adversarial networks. We also need to investigate the feasibility of training…
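The near-zero error rates mentioned for ASVspoof2015 are conventionally reported as equal error rates (EER) over countermeasure scores. As a point of reference only, here is a minimal sketch of an EER computation, assuming higher scores mean "more likely genuine"; the compute_eer helper and the toy score distributions are illustrative, not anything from the paper.

import numpy as np

def compute_eer(genuine_scores, spoof_scores):
    """EER: the operating point where false-acceptance and false-rejection rates meet."""
    scores = np.concatenate([genuine_scores, spoof_scores])
    labels = np.concatenate([np.ones_like(genuine_scores), np.zeros_like(spoof_scores)])
    order = np.argsort(scores)            # sweep the threshold from low to high
    labels = labels[order]
    frr = np.cumsum(labels) / labels.sum()                   # genuine trials rejected at/below threshold
    far = 1.0 - np.cumsum(1 - labels) / (1 - labels).sum()   # spoof trials accepted above it
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genuine = rng.normal(2.0, 1.0, 1000)   # toy scores: higher = more human-like
    spoof = rng.normal(-2.0, 1.0, 1000)
    print(f"EER = {100 * compute_eer(genuine, spoof):.2f}%")

A well-separated pair of score distributions like the toy one above is what "error rates close to zero" means in practice.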

Cited by 59 publications (37 citation statements)
References 21 publications (29 reference statements)
“…Other risks could include fabricating a 'digital clone' of someone using machine learning; recent warning examples are provided by the so-called deepfakes [26,27,28], realistic-appearing but fabricated or tampered videos portraying a targeted person created with the aid of deep learning (the interested reader is pointed to [29] for a detailed review of the potential societal, ethical and legal implications of deepfakes). In the specific context of speaker verification, [30] addressed voice cloning of a well-known celebrity (the former US president Barack Obama). Even though the result was essentially negative (the cloned voice samples were detectable as artificial using a spoofing countermeasure), machine learning, including voice cloning techniques, does not stand still.…”
Section: Attacks on Speaker Verification Systems with Found Data (mentioning)
confidence: 99%
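The "spoofing countermeasure" referred to in the statement above is typically a two-class classifier that scores genuine against artificial speech. The sketch below is not the detector used in [30]; it is a generic GMM log-likelihood-ratio countermeasure in the spirit of the ASVspoof baselines, with MFCCs standing in for the CQCC/LFCC features those baselines actually use, and with hypothetical file lists.

import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def features(path):
    # 20-dimensional MFCCs as a stand-in front end, shape (frames, 20)
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).T

def train_cm(genuine_files, spoof_files, n_comp=64):
    # One GMM per class, trained on frames pooled over all training files
    gmm_gen = GaussianMixture(n_comp, covariance_type="diag").fit(
        np.vstack([features(f) for f in genuine_files]))
    gmm_spf = GaussianMixture(n_comp, covariance_type="diag").fit(
        np.vstack([features(f) for f in spoof_files]))
    return gmm_gen, gmm_spf

def cm_score(path, gmm_gen, gmm_spf):
    # Average per-frame log-likelihood ratio; > 0 leans genuine, < 0 leans spoofed
    x = features(path)
    return gmm_gen.score(x) - gmm_spf.score(x)

A cloned utterance that is "detectable as artificial" is one whose cm_score falls clearly on the spoofed side of the decision threshold.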
“…Recent research on end-to-end text-to-speech (TTS) [1,2,3,4,5,6] has achieved success in generating human-like, high-quality speech. Moreover, end-to-end TTS systems also demonstrate a powerful capability for cloning prosodic style or speaker characteristics [7,8,9,10,11]. However, training end-to-end TTS systems requires large quantities of text-audio paired data.…”
Section: Introduction (mentioning)
confidence: 99%
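The "large quantities of text-audio paired data" point is concrete: every training example couples a transcript with its waveform. Below is a minimal PyTorch sketch of such a paired dataset, assuming a hypothetical LJSpeech-style layout with a two-column metadata.csv of "utt_id|transcript" rows next to a wavs/ directory; none of these names come from the cited papers.

import csv
from pathlib import Path
import torch
import torchaudio
from torch.utils.data import Dataset

class PairedTTSDataset(Dataset):
    """Yields (character-ID tensor, waveform, sample rate) pairs for TTS training."""
    def __init__(self, root):
        root = Path(root)
        with open(root / "metadata.csv", newline="") as f:
            self.rows = [(root / "wavs" / f"{utt}.wav", text)
                         for utt, text in csv.reader(f, delimiter="|")]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, i):
        wav_path, text = self.rows[i]
        wav, sr = torchaudio.load(str(wav_path))
        # Character IDs as a toy text encoding; real systems use graphemes or phonemes
        ids = torch.tensor([ord(c) for c in text.lower()], dtype=torch.long)
        return ids, wav.squeeze(0), sr

The scarcity of such paired corpora for a target voice is precisely why this paper's "found data" setting matters: cloning must work from uncurated audio without clean transcripts.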
“…In line with the EU's recent General Data Protection Regulation (GDPR), which is intended to protect the privacy of its citizens, it is important to assess the risks associated with multimedia data in the public domain. A recent study [17] attempted voice cloning of a pre-defined celebrity target speaker based on found data. The cloned voice samples were, however, detectable as spoofed speech.…”
Section: Introduction (mentioning)
confidence: 99%