Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.243

The Power of Scale for Parameter-Efficient Prompt Tuning

Abstract: In this work, we explore "prompt tuning," a simple yet effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks. Unlike the discrete text prompts used by GPT-3, soft prompts are learned through backpropagation and can be tuned to incorporate signals from any number of labeled examples. Our end-to-end learned approach outperforms GPT-3's few-shot learning by a large margin. More remarkably, through ablations on model size using T5, we show that pro…
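The mechanism described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' released code: it assumes a generic transformer backbone that accepts input embeddings, freezes all of its pretrained weights, and prepends a small matrix of trainable soft-prompt embeddings that is learned by ordinary backpropagation.

```python
import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """Minimal sketch of prompt tuning: a frozen backbone conditioned on
    learned soft-prompt embeddings prepended to the input embeddings."""

    def __init__(self, backbone: nn.Module, num_prompt_tokens: int = 20, embed_dim: int = 768):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():        # freeze every pretrained weight
            p.requires_grad = False
        # The only trainable parameters: one vector per soft-prompt token.
        self.soft_prompt = nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) embeddings of the task input
        batch_size = input_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # Prepend the soft prompt and run the frozen backbone on the result.
        return self.backbone(torch.cat([prompt, input_embeds], dim=1))
```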

Cited by 950 publications (783 citation statements)
References 24 publications
“…However, the selection of discrete prompts is still an independent process and cannot be optimized together with the downstream tasks in an end-to-end manner. To solve this issue, (Lester et al., 2021; Li and Liang, 2021) propose to use soft prompts, which are sets of trainable vectors, in the frozen pretrained language models. Unlike the hard prompts, these vectors do not correspond to any real words.…”
Section: Related Work (mentioning)
Confidence: 99%
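The hard/soft distinction drawn in this statement can be made concrete. The sketch below is illustrative only (the embedding table and token IDs are placeholders): a hard prompt is a sequence of vocabulary IDs whose embeddings are rows of the pretrained embedding table, while a soft prompt is a free continuous matrix that need not coincide with any vocabulary row.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 32_000, 768
embedding_table = nn.Embedding(vocab_size, embed_dim)   # stand-in for pretrained token embeddings

# Hard (discrete) prompt: token IDs of a textual prompt; its embeddings are
# looked up from the vocabulary table, so each vector corresponds to a real word.
hard_prompt_ids = torch.tensor([[87, 1542, 29, 6013]])  # placeholder IDs
hard_prompt_embeds = embedding_table(hard_prompt_ids)

# Soft (continuous) prompt: freely trainable vectors optimized end-to-end with
# the downstream loss; they need not match any row of the embedding table.
soft_prompt = nn.Parameter(torch.randn(1, 4, embed_dim) * 0.02)
```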
“…However, finding suitable discrete task introductions cannot be easily optimized in an end-to-end fashion and requires extra human effort. In this paper, inspired by the recent work (Lester et al., 2021; Li and Liang, 2021), we replace the task introductions with Soft Prompt (i.e., a sequence of continuous and trainable vectors). During training, we only update the parameters of this Soft Prompt and fix all PLM parameters.…”
Section: Prompt-based Learning (mentioning)
Confidence: 99%
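The recipe in this statement (update only the soft prompt, keep every pretrained LM parameter fixed) amounts to handing the optimizer just the prompt tensor. A hedged sketch follows, reusing the hypothetical PromptTunedModel from the earlier snippet together with a hypothetical task dataloader.

```python
import torch
import torch.nn.functional as F

model = PromptTunedModel(backbone, num_prompt_tokens=20, embed_dim=768)

# The optimizer only ever sees the soft-prompt parameters; the frozen backbone
# receives no updates.
optimizer = torch.optim.AdamW([model.soft_prompt], lr=0.3)

for input_embeds, labels in dataloader:          # hypothetical dataloader of embedded inputs
    logits = model(input_embeds)
    # (alignment of labels with the prepended prompt positions is glossed over here)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    loss.backward()                              # gradients flow only into the soft prompt
    optimizer.step()
    optimizer.zero_grad()
```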
“…Specifically, we follow the definition of toxicity from Perspective API as well as the inspiration by the recent work from Prompt Engineering (Liu et al., 2021b; Shin et al., 2020; Li & Liang, 2021; Lester et al., 2021; Zhao et al., 2021; Schick & Schütze, 2020a) that repeating the prompts and prompting LMs in the format of Question Answering, and design the prompts below to study the generation and understanding power of the LMs, 1. Negative Prompt (for once).…”
Section: A41 Unconditional Generation (mentioning)
Confidence: 99%