2021
DOI: 10.48550/arxiv.2110.06500
Preprint

Differentially Private Fine-tuning of Language Models

Abstract: We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important dimensions: utili…
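
The abstract describes a parameter-efficient approach to private fine-tuning. As a rough sketch only (not the authors' implementation; the adapter shape, function names, and hyperparameters below are assumptions), the general pattern is to freeze the pre-trained weights, train a small low-rank adapter, and update it with DP-SGD, i.e. per-example gradient clipping plus Gaussian noise:

# Illustrative sketch of parameter-efficient DP fine-tuning (DP-SGD applied
# only to a small trainable adapter). Names and hyperparameters are
# hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """Small trainable module meant to sit on top of a frozen pre-trained layer."""
    def __init__(self, d_in, d_out, rank=8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_out))

    def forward(self, x):
        return x @ self.A @ self.B

def dp_sgd_step(model, loss_fn, xs, ys, lr=1e-3, clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD step over the trainable parameters of `model`:
    clip each example's gradient, add Gaussian noise, apply the noisy mean."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(xs, ys):  # per-example gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)  # accumulate clipped per-example gradient

    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.sub_(lr * (s + noise) / len(xs))  # noisy mean gradient step

Because only the adapter is trained, the clipped gradients (and therefore the added noise) live in a much lower-dimensional space than in full-model DP fine-tuning, which is broadly consistent with the utility and compute/memory gains the abstract reports.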

Cited by 15 publications (27 citation statements) | References 40 publications

“…Training data privacy can be protected using the differential privacy (DP) framework (Dwork et al, 2006), which ensures that the effect of any single training example on the trained model is not too large. Yu et al (2021); Li et al (2022) demonstrate the practicality of training differentially private LMs. However, the privacy guarantees achieved by differentially private training are weaker than reported when datasets contain duplicated records.…”
Section: Discussion
confidence: 99%
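
For reference, this statement appeals to the standard (ε, δ)-differential-privacy definition of Dwork et al. (2006); written out (this formulation is standard background, not quoted from the citing paper), a randomized mechanism M is (ε, δ)-DP if for all datasets D, D' differing in a single record and every set S of outputs:

% (epsilon, delta)-differential privacy (Dwork et al., 2006)
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta

Small ε and δ formalize "the effect of any single training example on the trained model is not too large"; duplicated records weaken the reported guarantee because one underlying example can correspond to several dataset records.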
“…Most work on Differential Privacy [6,10,22,24,32,44,47,53,53,57] uses public data either for: generic pre-training unrelated to the task [49], or to tune parameters [25,58,60], or as additional unlabeled data [37,38]. Instead, we use a small amount of labeled public data related to the task to improve the accuracy of a private model under a given privacy parameter ε.…”
Section: Related Work
confidence: 99%
“…DP for Language models. [34,57] show that a large language model pre-trained on generic public data can be finetuned on task-specific private data with only modest loss in accuracy. In contrast, our focus is on using small amounts of public data for fine-tuning.…”
Section: Related Work
confidence: 99%
“…To address these privacy concerns, there is a growing body of literature that aims to create privacy-preserving language models [64,2,56,98,84,40,79]. While humans navigate the complexities of language and privacy by identifying appropriate contexts for sharing information, LMs are not currently designed to do this [14,72,66,49,66,50,41].…”
Section: Introduction
confidence: 99%