2021
DOI: 10.1007/978-3-030-87802-3_10

Evaluating X-Vector-Based Speaker Anonymization Under White-Box Assessment

Abstract: In the scenario of the Voice Privacy challenge, anonymization is achieved by converting all utterances from a source speaker to match the same target identity, which is selected at random. In this context, an attacker with maximum knowledge about the anonymization system cannot infer the target identity. This article proposes constraining the target selection to a specific identity, i.e., removing the random selection of identity, to evaluate the extreme threat under a white-box assessment (the attac…

Cited by 2 publications (4 citation statements)
References 14 publications

“…The complexity is compounded by how speech synthesis is typically evaluated, often relying on human judgements which naturally vary from listener to listener and are difficult to replicate over decades of research [3,17,18]. Voice anonymization can also be framed as a special type of voice conversion task [6], one where the original speaker shall not be revealed. Still, voice privacy is evaluated along different metrics or based on different assumptions than voice conversion.…”
Section: Background and Related Work
Citation type: mentioning
confidence: 99%
“…The system D [27] is derived from baseline B2. Systems based on x-vectors include: A [28], M [29], O [30], and S [31]. While system I 2 uses modifications to formants, system K 3 combines x-vectors, speaker similarity models, and a voice indistinguishability metric.…”
Section: Data
Citation type: mentioning
confidence: 99%
“…For the majority of anonymisation systems proposed to date, this requirement is fulfilled by maximising some measure of the distance between the chosen pseudo-speaker embedding and the original speaker embedding. The most popular anonymisation function to date uses a pool of external x-vectors [2,3,4,5,6,7]. The pseudo-speaker embedding is obtained by averaging a random subset of the furthest x-vectors in the pool from the x-vector of the original speaker.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
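
The pool-based anonymisation function described in the last citation statement can be sketched roughly as follows. This is a minimal illustration and not the implementation of any cited system. Cosine distance is an assumed distance measure, and the function name `pseudo_speaker_xvector` and the parameters `n_furthest` and `subset_size` (with their default values) are hypothetical names chosen for this example.

```python
import numpy as np


def pseudo_speaker_xvector(source, pool, n_furthest=200, subset_size=100, rng=None):
    """Average a random subset of the pool x-vectors furthest from `source`.

    Assumptions (not taken from the cited papers): cosine distance is the
    distance measure, and the default sizes are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    source = np.asarray(source, dtype=float)
    pool = np.asarray(pool, dtype=float)

    # Cosine distance between the source x-vector and every pool x-vector.
    source_unit = source / np.linalg.norm(source)
    pool_unit = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    distances = 1.0 - pool_unit @ source_unit

    # Clamp the sizes so that small pools still work.
    n_furthest = min(n_furthest, len(pool))
    subset_size = min(subset_size, n_furthest)

    # Indices of the n_furthest pool entries (largest distances).
    furthest = np.argsort(distances)[-n_furthest:]

    # Random subset of those candidates, drawn without replacement.
    chosen = rng.choice(furthest, size=subset_size, replace=False)

    # The pseudo-speaker embedding is the mean of the chosen x-vectors.
    return pool[chosen].mean(axis=0)
```

Because the subset is drawn at random, two calls with the same source speaker generally yield different pseudo-speaker embeddings unless a seeded `rng` is passed in. The abstract above describes removing that random selection so the target identity is fixed.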