Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1082

Hierarchical Neural Story Generation

Abstract: We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale …
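A rough sketch of the two-stage setup the abstract describes: a premise is generated first, then expanded into a full passage conditioned on that premise. The functions below are hypothetical placeholders (the paper's actual models are convolutional seq2seq networks with fusion and gated multi-scale attention), so this only illustrates the shape of the hierarchical pipeline, not the architecture itself.

def premise_model(topic: str) -> str:
    # Hypothetical stand-in: a trained model would sample a writing prompt here.
    return f"[WP] A story about {topic}."

def story_model(premise: str) -> str:
    # Hypothetical stand-in: a trained seq2seq model would sample a long passage
    # conditioned on the premise here.
    return f"Once upon a time... (a passage expanding on: {premise})"

def generate_story(topic: str) -> str:
    """Hierarchical generation: premise first, then the story conditioned on it."""
    premise = premise_model(topic)   # stage 1: generate the premise/prompt
    story = story_model(premise)     # stage 2: expand it into a passage
    return story

print(generate_story("a lighthouse keeper who finds a message in a bottle"))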

Cited by 759 publications (874 citation statements). References 21 publications.

Citation statements, ordered by relevance:
“…In this work, we perform an in-depth study of the properties of text generated by GPT2-117 (the smallest version of GPT2) in the context of story generation. By comparing to a state-of-the-art, specialized-architecture neural story generation model (Fan et al., 2018), we ask the following questions. In what ways does a large amount of open-domain pretraining data change the characteristics of generated text?…”
Section: Introduction (mentioning)
confidence: 99%
“…To enable readers to browse the generated text, conduct their own evaluations, or run our evaluations on their own text, we publicly release our generated stories and evaluation code. WritingPrompts (Fan et al., 2018) is a story generation dataset containing 303,358 human-written (prompt, story) pairs collected from the /r/WritingPrompts subreddit, a forum where Reddit users compose short stories inspired by other users' prompts. An example can be seen at the top of Table 2.…”
Section: Introduction (mentioning)
confidence: 99%
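The dataset described in the statement above pairs each prompt with a story. As a small illustration, the sketch below iterates over such pairs, assuming the data is distributed as parallel, line-aligned prompt and story files; the file names train.wp_source and train.wp_target are an assumption about the release format, so adjust them to match your copy of the data.

def load_pairs(prompt_path: str, story_path: str):
    # Yields (prompt, story) pairs from two parallel, line-aligned text files.
    with open(prompt_path, encoding="utf-8") as prompts, \
         open(story_path, encoding="utf-8") as stories:
        for prompt, story in zip(prompts, stories):
            yield prompt.strip(), story.strip()

# Example usage (file names assumed as described above):
pairs = load_pairs("train.wp_source", "train.wp_target")
first_prompt, first_story = next(pairs)
print(first_prompt)
print(first_story[:200])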
“…Automatic story generation has a long history, with early work based primarily on hand-written rules (Klein et al., 1973; Meehan, 1977; Dehn, 1981; Turner, 1993). Subsequent methods were based on planning from artificial intelligence (Theune et al., 2003; Oinonen et al., 2006; Riedl and Young, 2010) and, more recently, data-driven methods have been developed (McIntyre and Lapata, 2010; Elson, 2012; Daza et al., 2016; Roemmele and Gordon, 2015; Clark et al., 2018a; Martin et al., 2018; Fan et al., 2018b; Yao et al., 2019; Fan et al., 2019). In concurrent work, Gupta et al. (2019) also propose methods to generate more diverse and interesting story endings, albeit without control variables.…”
Section: Related Work (mentioning)
confidence: 99%
“…During decoding for generation we try three decoding schemes: (i) Greedy: which selects the most probable word at each step, (ii) Top-k (Fan et al., 2018): which at each step samples from the K most probable words, and (iii) Nucleus Sampling (NS) (Holtzman et al., 2019): which at each step samples from a flexible subset of the most probable words chosen based on their cumulative mass (set by a threshold p, where p = 1 means sampling from the full distribution). While similar to Top-k, the benefit of the NS scheme is that the vocabulary size at each time step of decoding varies, a property that encourages diversity and avoids the degenerate text patterns of greedy or beam search decoding (Holtzman et al., 2019).…”
Section: Text Generation (mentioning)
confidence: 99%
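To make the contrast between the three schemes in the statement above concrete, the sketch below implements greedy decoding, top-k sampling, and nucleus sampling over a vector of next-token logits. It is a minimal PyTorch illustration of the ideas rather than code from any of the cited papers; the function names and the default values of k and p are our own choices.

import torch
import torch.nn.functional as F

def greedy(logits: torch.Tensor) -> int:
    # Greedy: always pick the single most probable token.
    return int(torch.argmax(logits, dim=-1))

def top_k_sample(logits: torch.Tensor, k: int = 10) -> int:
    # Top-k sampling: restrict to the k most probable tokens, renormalize, sample.
    values, indices = torch.topk(logits, k)
    probs = F.softmax(values, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return int(indices[choice])

def nucleus_sample(logits: torch.Tensor, p: float = 0.9) -> int:
    # Nucleus sampling: keep the smallest set of tokens whose cumulative
    # probability mass exceeds p, renormalize, and sample from that set.
    probs = F.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    cutoff = int((cumulative < p).sum()) + 1  # always keep at least one token
    nucleus = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    choice = torch.multinomial(nucleus, num_samples=1)
    return int(sorted_idx[choice])

# Example with random logits standing in for a language model's output.
logits = torch.randn(50000)
print(greedy(logits), top_k_sample(logits, k=10), nucleus_sample(logits, p=0.9))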