Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.254

Prompt-free and Efficient Few-shot Learning with Language Models

Abstract: Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score. In this work, we propose PERFECT, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, which is highly effective given as few as 32 data points. PERFECT makes two key design choices: First, we show that manually engineered task prompts can be …
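
To make the handcrafting concrete: prompt-based few-shot methods turn each example into a cloze query by pairing a task-specific pattern with a verbalizer that maps labels to vocabulary words, and the masked LM is scored on the verbalizer tokens. The sketch below shows this for a hypothetical sentiment task; the template and label words are illustrative assumptions, not taken from the paper. PERFECT's contribution is removing the need to engineer either piece by hand.

# Illustrative sketch (hypothetical template and verbalizer, not from the paper)
# of the handcrafted components that prompt-based few-shot fine-tuning relies on.
template = "{text} It was [MASK]."                          # task prompt (pattern)
verbalizer = {"positive": "great", "negative": "terrible"}  # label -> vocabulary word

def to_cloze(text: str) -> str:
    # Convert a raw example into the cloze format the masked LM scores.
    return template.format(text=text)

print(to_cloze("The movie was a delight from start to finish."))
# -> "The movie was a delight from start to finish. It was [MASK]."
# The label is read off from whether the model prefers "great" or "terrible" at [MASK].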

Cited by 30 publications (27 citation statements)
References 16 publications

“…For the language tasks, we use PERFECT, a recent adaptation method from Mahabadi et al. [48], which inserts an adapter layer after the feed-forward block of each transformer layer of a RoBERTa-Large model, an early FM consisting of 355M parameters. This results in 3.3M trainable parameters.…”
Section: Methods
confidence: 99%
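
The adapter pattern described here places a small trainable bottleneck after each transformer layer's feed-forward block while the 355M-parameter backbone stays frozen, so only a few million parameters are updated. The PyTorch sketch below is a minimal illustration of that pattern; the module names, bottleneck width, and wrapping strategy are assumptions for exposition, not PERFECT's exact implementation, and the reported 3.3M trainable parameters depend on the actual adapter configuration.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    # Bottleneck adapter: down-project, non-linearity, up-project, residual add.
    def __init__(self, hidden_size: int = 1024, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class FeedForwardWithAdapter(nn.Module):
    # Wraps a (frozen) feed-forward block and runs the adapter on its output,
    # i.e. the adapter sits after the feed-forward block of a transformer layer.
    def __init__(self, feed_forward: nn.Module, hidden_size: int = 1024):
        super().__init__()
        self.feed_forward = feed_forward
        self.adapter = Adapter(hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.feed_forward(x))

# Usage sketch: freeze every backbone parameter so that only the adapters train,
# which is how a 355M-parameter model ends up with a few million tunable weights.
# for p in backbone.parameters():
#     p.requires_grad = False
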
“…Foundation Models: We use popular existing FMs and FM adaptation techniques in our evaluations. In particular, we use simple zero-shot [58], in-context learning [10], and lightweight tuning [29,48] methods (defined in Section 3). We are inspired by work on versatile FM systems with natural language interfaces [37,77], though our focus is not the ML methods themselves but the consequences of these FM capabilities for privacy.…”
Section: Background and Related Work
confidence: 99%
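
Of the adaptation styles listed in this statement, zero-shot and in-context learning leave the model's weights untouched and express the task purely in the input text, whereas lightweight tuning (such as adapters) trains a small number of added parameters. The snippet below sketches the first two with a hypothetical sentiment example; the prompt wording and demonstrations are illustrative assumptions, not drawn from the cited evaluations.

# Illustrative sketch (hypothetical prompts) of zero-shot vs. in-context learning:
# neither updates model weights, unlike lightweight tuning.
def zero_shot_prompt(text: str) -> str:
    # Zero-shot: the task is described in instructions alone.
    return f"Classify the sentiment as positive or negative.\nReview: {text}\nSentiment:"

def in_context_prompt(demonstrations, text: str) -> str:
    # In-context learning: labeled demonstrations are prepended to the query.
    shots = "\n".join(f"Review: {t}\nSentiment: {y}" for t, y in demonstrations)
    return f"{shots}\nReview: {text}\nSentiment:"

demos = [("Great acting and a sharp script.", "positive"),
         ("I walked out halfway through.", "negative")]
print(in_context_prompt(demos, "The plot never comes together."))
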
“…Mahabadi et al. [28] found that PEFT can outperform standard fine-tuning in the low-resource setting. In concurrent work, Mahabadi et al. [76] compare PEFT to the use of discrete prompts (e.g., PET [70]) during few-shot fine-tuning and find that PEFT compares favorably.…”
Section: Related Work
confidence: 99%