2021
DOI: 10.1007/978-3-030-71711-7_15

Multilingual Voice Impersonation Dataset and Evaluation

Abstract: Well-known vulnerabilities of voice-based biometrics are impersonation, replay attacks, artificial signals/speech synthesis, and voice conversion. Among these, voice impersonation is the obvious and simplest way of attack that can be performed. Though voice impersonation by amateurs is considered not a severe threat to ASV systems, studies show that professional impersonators can successfully influence the performance of the voice-based biometrics system. In this work, we have created a novel voice impersonati…

Cited by 4 publications (6 citation statements)
References 14 publications
“…Alternatively, one may exploit the public datasets, like voxceleb [61] that contain the original voices of celebrities; one may still vie for human mimicked version voices of these celebrities on YouTube. A related dataset is [62].…”
Section: The Dataset (mentioning)
confidence: 99%
“…In [48], an entirely automated pipeline that leveraged computer vision techniques was employed to create voxceleb1 data from open-source media. Investigating voice impersonation attacks and their implications for automatic speaker verification (ASV) systems was the central focus of the paper in [49]. Furthermore, the research paper in [50] thoroughly examined the encoding layers and loss functions utilized in end-to-end speaker and language recognition systems.…”
Section: Related Work (mentioning)
confidence: 99%
“…Specifically, for the TIMIT dataset, we followed the data split used by [46] for 630 speakers and [42,43] for 120 speakers. Similarly, for the VoxCeleb1 dataset, we employed the same data split as described in [47][48][49][50] to ensure consistency and fairness in our comparisons. This approach allowed us to conduct meaningful evaluations while maintaining parity with existing studies.…”
mentioning
confidence: 99%
“…There are new kinds of attacks being generated that pose a huge threat to biometric systems in both audio [130] and face [108]. For example, voice impersonation has shown to be causing a considerable vulnerability to automatic speaker recognition [83]. However, the databases or protocols to create such attacks are not publicly available.…”
Section: Presentation Attack Database for AV Biometrics (mentioning)
confidence: 99%