2019
DOI: 10.48550/arxiv.1908.06938
Preprint

Encoder-Agnostic Adaptation for Conditional Language Generation

Abstract: Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretra…

Cited by 5 publications (14 citation statements)
References 19 publications
“…Feeding latent code to the decoder: with a single latent code representation z ∈ R^d and a "GPT2" decoder, we investigate three mainstream ways of latent code injection inspired by previous literature (Cheng et al. 2019; Ziegler et al. 2019; Wang and Wan 2019).…”
Section: Architecture Design
confidence: 99%
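The excerpt above does not enumerate the three injection mechanisms it refers to. Purely as an illustration of one common option, the sketch below adds a projected copy of the latent code z to every decoder input embedding before it enters a GPT-2-style decoder; the class and parameter names (LatentToEmbedding, d_latent, d_model) are placeholders rather than identifiers from the cited works.

```python
# Illustrative sketch (assumed, not from the cited papers): injecting a latent
# code z into a decoder by adding its projection to every token embedding.
import torch
import torch.nn as nn

class LatentToEmbedding(nn.Module):
    def __init__(self, d_latent: int, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_latent, d_model)

    def forward(self, token_embeddings: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model); z: (batch, d_latent)
        # Broadcast the projected latent code over the sequence dimension.
        return token_embeddings + self.proj(z).unsqueeze(1)
```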
“…As presented in (Ziegler et al. 2019), the augmented key and value matrices in R^((1+l)×d) are formed by filling the first row with the projected latent codes z_K, z_V obtained from z and concatenating the original key and value rows below. Here, we abbreviate the per-layer code z_l to z for notational simplicity.…”
Section: Architecture Design
confidence: 99%
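Below is a minimal sketch of the pseudo self-attention mechanism described in the excerpt (Ziegler et al. 2019): the latent code is projected into key and value space and prepended as an extra row, so every decoder position can attend to it alongside the usual causally masked token positions. Module and parameter names (PseudoSelfAttention, d_model, n_heads) are illustrative assumptions, not taken from a released implementation.

```python
# Sketch of latent-code injection via pseudo self-attention (Ziegler et al. 2019).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PseudoSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Standard query/key/value projections for the decoder states.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Projections mapping the latent code z into key/value space,
        # i.e. the z_K, z_V rows that fill the first row of the augmented matrices.
        self.zk_proj = nn.Linear(d_model, d_model)
        self.zv_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, y: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # y: (batch, l, d_model) decoder hidden states; z: (batch, d_model) latent code.
        b, l, d = y.shape
        q = self.q_proj(y)
        # Augmented keys/values: latent row prepended to the token rows,
        # giving the (1 + l) x d shape described in the citation statement.
        k = torch.cat([self.zk_proj(z).unsqueeze(1), self.k_proj(y)], dim=1)
        v = torch.cat([self.zv_proj(z).unsqueeze(1), self.v_proj(y)], dim=1)

        def split(t):  # (b, n, d) -> (b, heads, n, d_head)
            return t.view(b, t.size(1), self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5   # (b, h, l, 1 + l)
        # Causal mask over token positions; every query may attend to z (column 0).
        causal = torch.tril(torch.ones(l, l, dtype=torch.bool, device=y.device))
        mask = torch.cat(
            [torch.ones(l, 1, dtype=torch.bool, device=y.device), causal], dim=1
        )
        scores = scores.masked_fill(~mask, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, l, d)
        return self.out_proj(out)
```

The appeal of this injection point is that the token-side projections can keep their pretrained initialization, so only the small z_K/z_V projections are genuinely new parameters when adapting the decoder to conditional input.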