2020
DOI: 10.2478/popets-2021-0012
The Audio Auditor: User-Level Membership Inference in Internet of Things Voice Services

Abstract: With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services, by designing an audio auditor to verify whether a specific user had unwillingly contributed audio used to train an automatic speech recognition (ASR) model under strict black-box access. With user representation of the input audio data and their corres…
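The mechanism the abstract outlines — aggregating per-utterance transcription quality into a user-level representation and deciding membership from it under black-box access — can be illustrated with a minimal sketch. The feature choice (word overlap between the ASR output and the reference transcript) and the midpoint-threshold auditor below are assumptions for illustration, not the paper's actual pipeline.

```python
from statistics import mean

def utterance_feature(predicted, truth):
    # word-overlap similarity between the ASR output and the reference
    # transcript; the working assumption is that a model transcribes
    # its own training speakers more accurately
    pred, ref = predicted.split(), truth.split()
    matched = sum(p == r for p, r in zip(pred, ref))
    return matched / max(len(ref), 1)

def user_representation(pairs):
    # aggregate utterance-level similarity into one user-level score
    return mean(utterance_feature(p, t) for p, t in pairs)

def train_threshold(member_scores, nonmember_scores):
    # toy auditor: the midpoint between the class means of shadow
    # users with known member / non-member labels
    return (mean(member_scores) + mean(nonmember_scores)) / 2

def audit(user_pairs, threshold):
    # label the user a "member" when their aggregate score exceeds it
    return user_representation(user_pairs) > threshold
```

In practice the auditor would be a learned binary classifier over richer features, but the shadow-labeled training data and the user-level aggregation step are the same.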

Cited by 15 publications (9 citation statements)
References 34 publications
“…In the literature, there are only a few studies on user-level MI attacks. In Miao et al. (2021), the authors investigate MI attacks on the speech recognition task to infer whether any user's data (voice samples) have been used during training. In Song & Shmatikov (2019), the authors propose a user-level MI attack on text-generative models.…”
Section: Membership Inference
confidence: 99%
“…Baselines: To the best of our knowledge, there is no user-level MI attack on metric embedding learning. The two user-level MI attacks in the literature (Song & Shmatikov, 2019; Miao et al., 2021) require generative models where the victim model's output is a word. Hence, there is no trivial way to adapt them to the metric embedding scenario.…”
Section: Experimental Settings
confidence: 99%
“…Specifically, a service provider that trains an ML model with user data without adequate consent may violate data protection regulations. In this case, MI can be used to assert whether or not a data sample was used during training, protecting both users and service providers [8,9,10].…”
Section: Introduction
confidence: 99%
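The sample-level check this excerpt describes — asserting whether a specific data sample was used during training — is commonly realized as a loss-threshold test: models typically assign lower loss to samples they were trained on. The sketch below is an illustrative assumption in that style, not the cited papers' exact method.

```python
import math

def true_label_nll(probs, true_label):
    # negative log-likelihood the model assigns to the true label
    return -math.log(max(probs[true_label], 1e-12))

def infer_membership(probs, true_label, tau=0.5):
    # predict "member" when the loss falls below tau: training
    # samples usually incur lower loss than unseen ones
    return true_label_nll(probs, true_label) < tau
```

The threshold tau would normally be calibrated on shadow models or held-out data rather than fixed by hand.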
“…MI is, thus, an important aspect of trustworthy machine learning that should be studied in all its facets and for all types of data. However, while MI has been extensively studied in the realms of image and text data [11], the focus on speech data, particularly with regard to ASR models, remains limited [12,13,9,14,15,10].…”
Section: Introduction
confidence: 99%