10th ISCA Workshop on Speech Synthesis (SSW 10) 2019
DOI: 10.21437/ssw.2019-28

Speaker Anonymization Using X-vector and Neural Waveform Models

Abstract: The social media revolution has produced a plethora of web services to which users can easily upload and share multimedia documents. Despite the popularity and convenience of such services, the sharing of such inherently personal data, including speech data, raises obvious security and privacy concerns. In particular, a user's speech data may be acquired and used with speech synthesis systems to produce high-quality speech utterances which reflect the same user's speaker identity. These utterances may then be …

Cited by 73 publications (74 citation statements)
References 20 publications (27 reference statements)

“…Speech data contain much sensitive information that can be used to identify the speaker. In this subsection, we introduce our initial work on speaker anonymization [51]. The speaker anonymization technique we have developed can conceal a speaker's identity while keeping other factors such as linguistic information, naturalness, and quality unchanged.…”
Section: Neural Speaker Anonymization [51] (mentioning)
confidence: 99%
“…Finally, the primary baseline of the VoicePrivacy Challenge 2020 (VPC) [25] uses a neural synthesizer [9,26] to synthesize speech given the target x-vector and fundamental frequency and bottleneck features extracted from the source.…”
Section: Anonymization Techniques and Target Selection (mentioning)
confidence: 99%
“…Their purpose is to transform speech signals in order to preserve all content except features related with the speaker identity. These techniques include noise addition [4], speech transformation [5], voice conversion [6,7,8], speech synthesis [9], or adversarial learning [10]. As a privacy preservation mechanism, they must achieve a suitable privacy/utility trade-off.…”
Section: Introduction (mentioning)
confidence: 99%
“…Advanced voice-based speaker anonymization aims to suppress the speaker identity and is mostly based on voice transformation techniques, changing source, or filter characteristics of the speech (Pobar & Ipšić, 2014). Recent research suggested to use x-vector speaker representations to suppress the timbre of a speaker, and thus hindering the speaker identification (Fang et al, 2019). Although these techniques can preserve the textual context and may sound natural, they cannot sustain the emotional expression of a speaker, as they have never aimed that.…”
Section: Anonymization and De-anonymization Techniques (mentioning)
confidence: 99%