2021
DOI: 10.48550/arxiv.2104.08691
Preprint

The Power of Scale for Parameter-Efficient Prompt Tuning

Abstract: In this work, we explore "prompt tuning", a simple yet effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks. Unlike the discrete text prompts used by GPT-3, soft prompts are learned through backpropagation and can be tuned to incorporate signal from any number of labeled examples. Our end-to-end learned approach outperforms GPT-3's "few-shot" learning by a large margin. More remarkably, through ablations on model size using T5, we show that prompt tuning becomes more competitive with scale: as models exceed billions of parameters, our method "closes the gap" and matches the strong performance of model tuning (where all model weights are tuned).
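As a rough illustration of the mechanism rather than the authors' T5 implementation, the sketch below (class names, the toy backbone, and all hyperparameters are hypothetical) prepends a small matrix of trainable soft-prompt vectors to the token embeddings of a frozen model and backpropagates into the prompt alone.

```python
# Minimal sketch of soft prompt tuning, assuming a toy PyTorch setup
# (not the paper's T5 models): trainable prompt vectors are prepended to
# the token embeddings of a frozen backbone, and only they get gradients.
import torch
import torch.nn as nn

class ToyFrozenLM(nn.Module):
    """Stand-in for a pretrained model that consumes embedded inputs."""
    def __init__(self, d_model: int, n_classes: int):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                  nn.Linear(d_model, n_classes))

    def forward(self, inputs_embeds: torch.Tensor) -> torch.Tensor:
        # inputs_embeds: (batch, seq_len, d_model) -> mean-pool -> logits
        return self.body(inputs_embeds.mean(dim=1))

class PromptTuned(nn.Module):
    def __init__(self, frozen_model, embed: nn.Embedding, prompt_len: int = 20):
        super().__init__()
        self.model, self.embed = frozen_model, embed
        for p in list(frozen_model.parameters()) + list(embed.parameters()):
            p.requires_grad = False                       # freeze the backbone
        # The soft prompt: prompt_len free vectors in embedding space.
        self.prompt = nn.Parameter(
            0.02 * torch.randn(prompt_len, embed.embedding_dim))

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                       # (B, T, d)
        prm = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        return self.model(torch.cat([prm, tok], dim=1))   # prompt is prepended

vocab, d, n_classes = 1000, 64, 2
tuned = PromptTuned(ToyFrozenLM(d, n_classes), nn.Embedding(vocab, d))
opt = torch.optim.Adam([tuned.prompt], lr=0.3)            # prompt-only updates

input_ids = torch.randint(0, vocab, (8, 16))
labels = torch.randint(0, n_classes, (8,))
loss = nn.functional.cross_entropy(tuned(input_ids), labels)
loss.backward()
opt.step()
```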

Cited by 197 publications (401 citation statements)
References 27 publications
“…While manually crafting prompts [Brown et al., 2020; Radford et al., 2021] is intuitive, creating and experimenting with these prompts takes time and experience, and even experienced prompt designers may fail to manually discover optimal prompts. To automate prompt engineering, [Lester et al., 2021; Zhou et al., 2021] parameterized the prompts by treating them as virtual tokens and performed prompting directly in the embedding space.…”
Section: Prompt Tuning Methods in NLP (mentioning)
confidence: 99%
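The "virtual tokens" above live in the model's embedding space rather than its vocabulary. A minimal sketch, assuming a toy nn.Embedding in place of a real tokenizer and embedding table: a hand-written prompt only seeds the vectors, which gradient descent is then free to move away from any real token.

```python
# Hedged sketch: a "virtual token" prompt seeded from a discrete prompt's
# embeddings (toy vocabulary and ids; a real setup would use a tokenizer).
import torch
import torch.nn as nn

embed = nn.Embedding(1000, 64)                        # frozen embedding table
discrete_prompt_ids = torch.tensor([17, 4, 256, 9])   # stand-in for a hand-written prompt
with torch.no_grad():
    init = embed(discrete_prompt_ids).clone()         # start from real-token vectors

soft_prompt = nn.Parameter(init)                      # now free continuous vectors
# After a few gradient steps, soft_prompt drifts off the vocabulary:
# nearest-neighbour decoding back to tokens is lossy, which is exactly
# why these are "virtual" tokens rather than text.
```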
“…Following [Lester et al., 2021], we initialize the class-specific prompts $p_c$ to maximize the likelihood $P(y_{\mathrm{pred}} = y \mid p_c)$. However, as shown in part (e) of Figure 1, there are significant differences among the content of different affective images even though they are in the same class.…”
Section: Diversified Prompts Composition (mentioning)
confidence: 99%
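A hedged sketch of the initialization step described above, with a toy linear head standing in for the citing paper's affective-image model: each class-specific prompt p_c is pre-optimized so the frozen scorer assigns high probability to its class.

```python
# Hedged sketch of class-specific prompt initialization: for each class c,
# prompt p_c is optimized so a frozen scorer assigns high P(y_pred = c | p_c).
# The frozen scorer is a toy linear head, not the cited model.
import torch
import torch.nn as nn

d, n_classes, prompt_len = 64, 4, 8
frozen_head = nn.Linear(d, n_classes)
for p in frozen_head.parameters():
    p.requires_grad = False

class_prompts = []
for c in range(n_classes):
    p_c = nn.Parameter(0.02 * torch.randn(prompt_len, d))
    opt = torch.optim.Adam([p_c], lr=0.1)
    target = torch.tensor([c])
    for _ in range(100):                      # maximize log P(y_pred = c | p_c)
        logits = frozen_head(p_c.mean(dim=0, keepdim=True))
        loss = nn.functional.cross_entropy(logits, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    class_prompts.append(p_c.detach())        # one initialized prompt per class
```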
“…Different from the traditional approaches that encode the sentence into a set of vectors and then classify its sentiment through a fully connected layer, the prompt-based method constructs a set of templates, for example: ("I am always happy to see you. The sentence's sentiment is [MASK]"), and then asks the model to predict the [MASK] token according to the original training task of the PLM. This approach has gone through various stages, from manual template construction [Jiang et al., 2020], to automated search for discrete tokens [Shin et al., 2020], to continuous virtual token representations [Lester et al., 2021; Li and Liang, 2021]. It has achieved great success in few-shot scenarios.…”
Section: Fine-tuning (mentioning)
confidence: 99%
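A rough sketch of the template idea in this snippet, using the Hugging Face fill-mask pipeline; the model choice (bert-base-uncased) and the verbalizer words are illustrative assumptions, not the cited works' setup.

```python
# Rough sketch of template-based prompting with a masked LM: classification
# is just the PLM's own [MASK] prediction, restricted to verbalizer words.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
template = "I am always happy to see you. The sentence's sentiment is [MASK]."

# Restrict predictions to the verbalizer tokens for the two classes.
preds = fill(template, targets=["positive", "negative"])
for p in preds:
    print(p["token_str"], round(p["score"], 4))
```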
“…The approach achieves impressive results on some generative tasks such as data-to-text. An extension of the model, namely P-tuning [Lester et al., 2021], serves a similar purpose. Different from prefix-tuning [Li and Liang, 2021], P-tuning does not place the prompt as a "prefix" in the input, but constructs a suitable template to prompt the PLM, and the template is composed of continuous virtual tokens obtained through gradient descent.…”
Section: Fine-tuning (mentioning)
confidence: 99%
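To make the prefix-versus-template distinction concrete, a toy sketch with raw tensors (the names v and x are hypothetical): prefix-style placement puts all trainable vectors at the front of the sequence, while a template-style layout interleaves them with the input; in either case only those vectors receive gradient updates.

```python
# Hedged sketch of two layouts for continuous virtual tokens
# (toy embeddings; v holds the trainable vectors, x the embedded input).
import torch
import torch.nn as nn

d = 64
embed = nn.Embedding(1000, d)
x = embed(torch.randint(0, 1000, (1, 10)))       # embedded input sentence
v = nn.Parameter(0.02 * torch.randn(3, d))       # continuous virtual tokens

# Prefix-style: all virtual tokens come first.
prefix_input = torch.cat([v.unsqueeze(0), x], dim=1)                # [v0 v1 v2 | x]

# Template-style: virtual tokens wrap the input inside a template.
template_input = torch.cat(
    [v[:2].unsqueeze(0), x, v[2:].unsqueeze(0)], dim=1)             # [v0 v1 | x | v2]

# Either tensor is fed to the frozen PLM; gradient descent updates only v.
```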