2019
DOI: 10.48550/arxiv.1912.03817
Preprint

Machine Unlearning

Abstract: Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning (ML) exacerbates this problem because any model trained with said data may have memorized it, putting users at risk of a successful privacy attack exposing their information. Yet, having models unlearn is notoriously difficult. After a data point is removed from a training set, one often resorts to entirely retraining downstream models from scratch. We introduce SISA…
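The abstract's core idea, training an ensemble of isolated per-shard models so that unlearning a point only requires retraining the shard that contained it, can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the per-shard "model" (a mean of labels), the hash-based shard assignment, and all names are assumptions here, and the slicing/checkpointing part of SISA is omitted.

```python
# Toy sketch of shard-level retraining in the spirit of SISA
# (Sharded, Isolated, Sliced, Aggregated training).
from statistics import mean

class ShardedEnsemble:
    def __init__(self, data, num_shards):
        # Deterministically assign each (x, y) point to one shard.
        self.num_shards = num_shards
        self.shards = [[] for _ in range(num_shards)]
        for x, y in data:
            self.shards[hash(x) % num_shards].append((x, y))
        # One isolated constituent model per shard.
        self.models = [self._train(s) for s in self.shards]

    def _train(self, shard):
        # Stand-in "constituent model": the mean label of the shard.
        return mean(y for _, y in shard) if shard else 0.0

    def unlearn(self, point):
        # Only the shard that held the point is retrained;
        # every other constituent model is untouched.
        idx = hash(point[0]) % self.num_shards
        self.shards[idx].remove(point)
        self.models[idx] = self._train(self.shards[idx])

    def predict(self):
        # Aggregate the constituent models (here: simple averaging).
        return mean(self.models)
```

After `unlearn`, the ensemble is identical to one retrained from scratch on the reduced dataset, which is the correctness property that makes shard-level retraining an unlearning mechanism rather than an approximation.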


Cited by 16 publications (37 citation statements) | References 30 publications
“…Important to our framework is the observation that the PAC-Bayesian bound (6), and hence also (7), holds uniformly over all choices of the learning algorithm P_{W|D}. As such, one can optimize the right-hand side of (7) over the learning algorithm P_{W|D} by considering the problem min_{P_{W|D}} F_IRM.…”
Section: Lemma 3.1 (Let Q_{W|D} denote a data-dependent prior for any (M…)
confidence: 99%
“…By minimizing an upper bound on the population loss, the learning criterion (7) facilitates generalization. This approach is known as Information Risk Minimization (IRM) [3], and it amounts to the minimization of a free energy criterion [10].…”
Section: Lemma 3.1 (Let Q_{W|D} denote a data-dependent prior for any (M…)
confidence: 99%
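The free-energy criterion the quotations refer to can be written out in its standard form. The notation below is a sketch assumed here, not taken from the cited work: L_D(w) denotes the empirical loss on dataset D, Q_W a prior, and β an inverse-temperature parameter trading off fit against complexity.

```latex
% Information Risk Minimization: minimize a free-energy upper bound
% on the population loss over the learning algorithm P_{W|D}.
\min_{P_{W|D}} F_{\mathrm{IRM}}
  = \min_{P_{W|D}} \;
    \mathbb{E}_{P_{W|D}}\!\left[ L_D(W) \right]
    + \frac{1}{\beta}\,\mathrm{KL}\!\left( P_{W|D} \,\|\, Q_W \right)
```

In this standard setup the minimizer is the Gibbs posterior, P_{W|D}(w) ∝ Q_W(w) exp(−β L_D(w)), which is why the criterion is described as a free energy.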
“…However, such implicit knowledge is hard to update, i.e., to remove certain information (Bourtoule et al. 2019) or to change or add new data and labels. Additionally, parametric knowledge may perform worse for less frequent facts, which do not appear often in the training set, and may "hallucinate" responses.…”
Section: Introduction
confidence: 99%
“…Some investigate how training data can be memorized in model parameters or outputs [20,3] so as to show the importance of data removal. Others study data removal methods from trained models, especially those that do not require retraining the model [4,2]. However, independent of how data is removed, in order to meet the compliance of data privacy regulations, it is important, especially for healthcare applications such as medical imaging analysis, to have a robust data auditing process to verify whether certain data were used in a trained model.…”
Section: Introduction
confidence: 99%
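A common baseline for the data auditing the last quotation describes is a loss-threshold membership-inference test: records the model fits unusually well are flagged as likely training members. The sketch below is illustrative only; the threshold, the loss, and all names are assumptions, not the method of any cited work.

```python
# Minimal loss-threshold membership-inference baseline for data auditing.
import math

def nll(prob_true_class):
    """Negative log-likelihood the model assigns to the record's true class."""
    return -math.log(max(prob_true_class, 1e-12))

def audit_membership(confidences, threshold=0.5):
    """Flag a record as 'likely in the training set' when the model's loss
    on it falls below the threshold (members are typically fit better).
    `confidences` holds the model's probability for each record's true class."""
    return [nll(p) < threshold for p in confidences]
```

In practice the threshold would be calibrated on held-out data, and stronger audits compare against shadow models rather than a fixed cutoff; this sketch only shows the shape of the test.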