Adjustable deterministic pseudonymization of speech

Dubagunta, S. Pavankumar; Son, R.J.J.H. van; Magimai.-Doss, Mathew

doi:10.1016/j.csl.2021.101284

Cited by 5 publications

(4 citation statements)

References 35 publications


“…on the contrary, if a greater amount of information is available, the risk decreases and the range of protection is greater (Dubagunta et al., 2022; Mahanan et al., 2021).…”

Section: Calculating the Risk (unclassified)

Técnicas de anonimización y pseudonimización en la protección de datos personales

Córdova-Real, López-Sevilla 2024

MQRInvestigar


Anonymization and pseudonymization techniques offer users protection of their personal data, preventing it from being disseminated and used for purposes other than those for which it was collected. This article aims to compare the various personal data protection techniques, such as anonymization and pseudonymization. To that end, it examines in detail the theoretical foundation and inherent characteristics of each technique, the application procedures, and the benefits and limitations that come with applying these methodologies. To investigate the topic, the PRISMA methodology is used, which makes it possible to search, select, and analyze the scientific literature; gathering documents with search tools yielded a total of 32 scientific articles written between 2018 and 2023 that contain relevant information contributing to the development of this article. The results indicate that anonymization focuses on presenting non-identifiable data through the addition of noise, permutations, or differential privacy, which help maintain data privacy while preserving data utility. In parallel, pseudonymization aims to replace the original identifiable information with pseudonyms that keep a person's identity protected. As the study's conclusion, the personal information protection techniques are defined. These strategies are fundamental for identifying data that users consider confidential, applying privacy methods that reduce the risks inherent in sharing information with third parties, thereby achieving a balance between utility and information confidentiality.


“…Widening of formant peaks [15] further distorts the spectral envelope. Data-driven formant modification can also be applied by using the formant statistics of desired speakers [16] or time-scale algorithms [18]. Phonetically controllable anonymization [17] modifies a speaker's vocal tract and voice source features, with a focus on F0 trajectories.…”

Section: B. Existing Speaker Anonymization Approaches (mentioning)

confidence: 99%

“…Several approaches to protect speaker privacy are based on digital signal processing (DSP) methods [11], [12], [14], [15], [16], [17], [18], which modify instantaneous speech characteristics such as the pitch, spectral envelope, and time scaling. State-of-the-art anonymization approaches have borrowed ideas from neural speech conversion and synthesis, mainly focusing on disentangled latent representation learning [10], [19], [20], [21], [22], [23], [24], [25] via two hypotheses.…”

Section: Introduction (mentioning)

confidence: 99%

Speaker Anonymization Using Orthogonal Householder Neural Network

Miao, Wang, Cooper et al. 2023

IEEE/ACM Trans. Audio Speech Lang. Process.


Speaker anonymization aims to conceal a speaker's identity while preserving content information in speech. Current mainstream neural-network speaker anonymization systems disentangle speech into prosody-related, content, and speaker representations. The speaker representation is then anonymized by a selection-based speaker anonymizer that uses a mean vector over a set of randomly selected speaker vectors from an external pool of English speakers. However, the resulting anonymized vectors are subject to severe privacy leakage against powerful attackers, reduction in speaker diversity, and language mismatch problems for unseen-language speaker anonymization. To generate diverse, language-neutral speaker vectors, this paper proposes an anonymizer based on an orthogonal Householder neural network (OHNN). Specifically, the OHNN acts like a rotation to transform the original speaker vectors into anonymized speaker vectors, which are constrained to follow the distribution over the original speaker vector space. A basic classification loss is introduced to ensure that anonymized speaker vectors from different speakers have unique speaker identities. To further protect speaker identities, an improved classification loss and similarity loss are used to push original-anonymized sample pairs away from each other. Experiments on VoicePrivacy Challenge datasets in English and the AISHELL-3 dataset in Mandarin demonstrate the proposed anonymizer's effectiveness.
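The abstract above describes the OHNN as acting like a rotation on speaker vectors. As a rough illustration of that idea only, and not the authors' network, the sketch below builds an orthogonal matrix from a product of Householder reflections and applies it to a vector. The dimension, the number of reflections, and the random reflection vectors are all assumptions made for this example; in the cited work the transform is learned with classification and similarity losses, which are not reproduced here.

```python
import numpy as np


def householder_matrix(v):
    """Return the reflection H = I - 2 v v^T / (v^T v) for a nonzero vector v."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / np.dot(v, v)


def orthogonal_from_reflections(vectors):
    """Multiply Householder reflections together; the product is orthogonal."""
    dim = len(vectors[0])
    q = np.eye(dim)
    for v in vectors:
        q = q @ householder_matrix(v)
    return q


# Hypothetical sizes: an 8-dimensional "speaker vector" and 4 reflections.
# An even number of reflections gives determinant +1, i.e. a proper rotation.
rng = np.random.default_rng(0)
dim = 8
reflection_vectors = rng.standard_normal((4, dim))  # random stand-ins, not learned
Q = orthogonal_from_reflections(reflection_vectors)

speaker_vector = rng.standard_normal(dim)  # stand-in for a real speaker embedding
anonymized_vector = Q @ speaker_vector     # rotated vector with the same norm
```

Because the matrix is orthogonal, the transformed vector keeps its length and pairwise angles between vectors are preserved, which is one reason such a map keeps outputs within the geometry of the original vector space.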


“…Generally, the majority of previous anonymization systems can be broadly classified into two classes: signal processing based systems and x-vector based systems. The signal processing based methods don't require any training data and directly modify formant, fundamental frequency, or other signal-related attributes of the speech signal to achieve anonymization [5][6][7]. These systems provide higher naturalness and are more distinguishable but less effective at protecting the speaker identity.…”

Section: Introduction (mentioning)

confidence: 99%

Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling

Yao, Wang, Long et al. 2022

Preprint


Speech data on the Internet are proliferating exponentially because of the emergence of social media, and the sharing of such personal data raises obvious security and privacy concerns. One solution to mitigate these concerns involves concealing speaker identities before sharing speech data, also referred to as speaker anonymization. In our previous work, we have developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although the framework ranked first place in VoicePrivacy 2022 challenge, the anonymization was imperfect, since the speaker distinguishability of the anonymized speech was deteriorated. To address this issue, in this paper, we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity and anonymize the source speech by the uniformly scaling formant and F0. By directly scaling the formant and F0, the speaker distinguishability degradation of the anonymized speech caused by the introduction of other speakers is prevented. The experimental results demonstrate that our proposed framework can improve the speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method also can trade off the privacy-utility by using different scaling factors.
