Evaluation of speaker de-identification based on voice gender and age conversion

Přibil, Jiří; Přibilová, Anna; Matoušek, Jindřich

doi:10.2478/jee-2018-0017

Cited by 5 publications

(2 citation statements)

References 22 publications

“…Furthermore, it will be useful to also investigate the influence of adverse acoustic conditions, especially non-stationary noises [29] due to their indirect influence on speech production. Contrary to speaker recognition, it may be interesting to investigate the applicability of long-term spectra for spectral normalization in the field of speaker de-identification [16]. Recently, there has been a growing need for de-identification of multimedia data [18], in order to ensure their anonymity with respect to privacy protection in the European Union.…”

Section: Discussion (mentioning)

confidence: 99%

Speaker Discrimination Using Long-Term Spectrum of Speech

Sigmund

2019

ITC

In this article, we investigate a specific long-term speech spectrum with respect to its use for speaker recognition. The long-term effect was satisfied by averaging short-term autocorrelation coefficients over the whole utterance. The long-term spectrum was calculated by means of second-order linear prediction using the average autocorrelation coefficients. First, speaker discriminability of 32 individual parameters was evaluated by combining spectral energy and spectral slope in eight different frequency bands covering the range 0−4 kHz (seven narrow nonoverlapping subbands and one band spanning over the full range). Then, four subbands with the most discriminative capability were selected for speaker recognition. These subbands involve the frequencies of 0−1.2 kHz in total. In the main experiments, text-independent speaker recognition based on relative Euclidean distance was performed in each single subband as well as in multiple 2 to 4 subbands applying two types of speech data, complete continuous speech and voiced part of the same speech. The voiced speech seems to be generally more effective for speaker recognition using the long-term speech spectrum. The best recognition rates, i.e. 91.7% on complete speech and 100% on voiced speech, were achieved in optimal paired subbands. The long-term speech spectrum can complement the traditional voice features.
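The processing chain described in this abstract (average the short-term autocorrelation coefficients over the utterance, then fit a second-order linear predictor to the averages) can be sketched roughly as follows. This is an illustrative sketch only, not the author's implementation: the function names, the 8 kHz sampling rate (chosen to match the 0−4 kHz range), the 50 ms Hamming-windowed frames, and the 50% overlap are all assumptions that the abstract does not specify.

```python
import numpy as np

def average_autocorrelation(signal, frame_len=400, hop=200, order=2):
    """Average short-term autocorrelation lags 0..order over all frames.

    frame_len, hop, and the Hamming window are assumed values, not from the abstract.
    """
    window = np.hamming(frame_len)
    acc = np.zeros(order + 1)
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        for lag in range(order + 1):
            acc[lag] += np.dot(frame[:frame_len - lag], frame[lag:])
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    return acc / n_frames

def lpc_from_autocorrelation(r):
    """Solve the normal equations R a = r[1:] for the predictor coefficients."""
    order = len(r) - 1
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    gain = r[0] - np.dot(a, r[1:])  # residual (prediction error) power
    return a, gain

def long_term_spectrum(signal, fs=8000, n_points=256):
    """Second-order all-pole spectrum (in dB) from averaged autocorrelation."""
    r = average_autocorrelation(signal, order=2)
    a, gain = lpc_from_autocorrelation(r)
    freqs = np.linspace(0.0, fs / 2.0, n_points)
    omega = 2.0 * np.pi * freqs / fs
    k = np.arange(1, len(a) + 1)
    # A(e^{jw}) = 1 - sum_k a_k e^{-jwk}
    A = 1.0 - np.exp(-1j * np.outer(omega, k)) @ a
    return freqs, 10.0 * np.log10(gain / np.abs(A) ** 2)
```

Spectral energy and slope per subband, the parameters the abstract evaluates, would then be read off the returned curve; how the article defines them exactly is not stated here, so that step is left out.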

“…They formulate this as an optimization problem and measure the distance between two speakers with a confusion factor, for which they evaluate entropy and Gini index as metrics. Pribil et al [164] propose a speaker de-identification method that relies on modifying several features of the source speaker. In the first step, the prosodic and spectral features are extracted from the source speaker.…”

Section: Anonymization Techniques (mentioning)

confidence: 99%

Privacy-Protecting Techniques for Behavioral Biometric Data: A Survey

Hanisch, Arias-Cabarcos, Parra-Arnau, et al.

2021

Preprint

Our behavior (the way we talk, walk, or think) is unique and can be used as a biometric trait. It also correlates with sensitive attributes like emotions. Hence, techniques to protect individuals' privacy against unwanted inferences are required. To consolidate knowledge in this area, we systematically reviewed applicable anonymization techniques. We taxonomize and compare existing solutions regarding privacy goals, conceptual operation, advantages, and limitations. Our analysis shows that some behavioral traits (e.g., voice) have received much attention, while others (e.g., eye-gaze, brainwaves) are mostly neglected. We also find that the evaluation methodology of behavioral anonymization techniques can be further improved. CCS Concepts: • Security and privacy → Pseudonymity, anonymity and untraceability.
