2019
DOI: 10.48550/arxiv.1908.06938
Preprint

Encoder-Agnostic Adaptation for Conditional Language Generation

Abstract: Large pretrained language models have changed the way researchers approach discriminative natural language understanding tasks, leading to the dominance of approaches that adapt a pretrained model for arbitrary downstream tasks. However, it is an open question how to use similar techniques for language generation. Early results in the encoder-agnostic setting have been mostly negative. In this work we explore methods for adapting a pretrained language model to arbitrary conditional input. We observe that pretra…

Cited by 5 publications (14 citation statements)
References 19 publications
“…Feeding latent code to the decoder: with a single latent code representation z ∈ R^d and a "GPT2" decoder, we investigate three mainstream ways of latent code injection inspired by previous literature (Cheng et al. 2019; Ziegler et al. 2019; Wang and Wan 2019).…”
Section: Architecture Design
confidence: 99%
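The excerpt above does not enumerate the three injection mechanisms it refers to. Purely as an illustration of one common option, the sketch below adds a projected copy of the latent code z to every decoder input embedding before it enters a GPT-2-style decoder; the class and parameter names (LatentToEmbedding, d_latent, d_model) are placeholders rather than identifiers from the cited works.

```python
# Illustrative sketch (assumed, not from the cited papers): injecting a latent
# code z into a decoder by adding its projection to every token embedding.
import torch
import torch.nn as nn

class LatentToEmbedding(nn.Module):
    def __init__(self, d_latent: int, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_latent, d_model)

    def forward(self, token_embeddings: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model); z: (batch, d_latent)
        # Broadcast the projected latent code over the sequence dimension.
        return token_embeddings + self.proj(z).unsqueeze(1)
```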
“…As presented in (Ziegler et al. 2019), the augmented key and value matrices in R^((1+l)×d) are formed by filling the first row with the projected latent codes z_K, z_V obtained from z and concatenating the original key and value rows below. Here, we abbreviate the per-layer code z_l to z for notational simplicity.…”
Section: Architecture Design
confidence: 99%
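Below is a minimal sketch of the pseudo self-attention mechanism described in the excerpt (Ziegler et al. 2019): the latent code is projected into key and value space and prepended as an extra row, so every decoder position can attend to it alongside the usual causally masked token positions. Module and parameter names (PseudoSelfAttention, d_model, n_heads) are illustrative assumptions, not taken from a released implementation.

```python
# Sketch of latent-code injection via pseudo self-attention (Ziegler et al. 2019).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PseudoSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Standard query/key/value projections for the decoder states.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Projections mapping the latent code z into key/value space,
        # i.e. the z_K, z_V rows that fill the first row of the augmented matrices.
        self.zk_proj = nn.Linear(d_model, d_model)
        self.zv_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, y: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # y: (batch, l, d_model) decoder hidden states; z: (batch, d_model) latent code.
        b, l, d = y.shape
        q = self.q_proj(y)
        # Augmented keys/values: latent row prepended to the token rows,
        # giving the (1 + l) x d shape described in the citation statement.
        k = torch.cat([self.zk_proj(z).unsqueeze(1), self.k_proj(y)], dim=1)
        v = torch.cat([self.zv_proj(z).unsqueeze(1), self.v_proj(y)], dim=1)

        def split(t):  # (b, n, d) -> (b, heads, n, d_head)
            return t.view(b, t.size(1), self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5   # (b, h, l, 1 + l)
        # Causal mask over token positions; every query may attend to z (column 0).
        causal = torch.tril(torch.ones(l, l, dtype=torch.bool, device=y.device))
        mask = torch.cat(
            [torch.ones(l, 1, dtype=torch.bool, device=y.device), causal], dim=1
        )
        scores = scores.masked_fill(~mask, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, l, d)
        return self.out_proj(out)
```

The appeal of this injection point is that the token-side projections can keep their pretrained initialization, so only the small z_K/z_V projections are genuinely new parameters when adapting the decoder to conditional input.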