Interspeech 2021
DOI: 10.21437/interspeech.2021-1783

Configurable Privacy-Preserving Automatic Speech Recognition

Abstract: Voice assistive technologies have given rise to far-reaching privacy and security concerns. In this paper we investigate whether modular automatic speech recognition (ASR) can improve privacy in voice assistive systems by combining independently trained separation, recognition, and discretization modules to design configurable privacy-preserving ASR systems. We evaluate privacy concerns and the effects of applying various state-of-the-art techniques at each stage of the system, and report results using task-spe…

Cited by 5 publications (8 citation statements)
References 18 publications
“…As an alternative solution, we utilized the wav2vec2 model to produce the log probabilities for all possible characters and then used a convolutional recurrent deep neural network (CRDNN) to transform them into categorical labels (W2V2 logp). These models combine the advantages of convolutional and recurrent neurons and became a popular choice in paralinguistics and ASR [21]. To showcase the benefits of using the wav2vec2 model, we also trained CRDNNs that classify the examples based on their MFCC features (MFCC+CRDNN).…”
Section: Speech Rating System (mentioning)
confidence: 99%
“…It is very likely that the biometric data obtained in interaction with online services will be utilized to extract a fair amount of personal characteristics, and information about the circumstances and environment of users from their raw data. For example, an overlearning problem might be caused by deep neural networks used for ASR and speaker verification, revealing additional/sensitive information about the users, which threatens to compromise their privacy significantly [16,92].…”
Section: Threats To User Privacy (mentioning)
confidence: 99%
“…Using our pipeline, we recommend validating the data source at the edge before sharing it, see Figure 1. We plan to integrate new applications [14,16,122] with a configurable privacy engine and evaluate the effectiveness of source validation in mitigating spoofing attempts while retaining its usefulness in minimizing privacy intrusions.…”
Section: Future Work and Open Directions 6.1 Beyond Authentication (mentioning)
confidence: 99%
“…Voice-controlled IoT devices and smart home assistants are gaining popularity in people's lives, and spoken language understanding (SLU) has become one of the main enabling technologies to achieve human-machine interaction between users and devices [Coucke et al., 2018]. For example, SLU can be found in the in-car voice system of Tesla electric vehicles (EVs), the voice assistant of Apple phones and Xiaomi's smart home ecosystem.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are certain privacy risks associated with these IoT devices [Atlam and Wills, 2020]. Due to their limited storage space, some vendors choose to deploy only the encoder of the model at the client side and transmit the output of the encoder to the server side for performing tasks such as prediction and classification, which gives opportunities for malicious attackers to take advantage of.…”
Section: Introduction (mentioning)
confidence: 99%