2020
DOI: 10.1162/tacl_a_00302

A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

Abstract: Story generation, namely, generating a reasonable story from a leading context, is an important but challenging task. In spite of the success in modeling fluency and local coherence, existing neural language generation models (e.g., GPT-2) still suffer from repetition, logic conflicts, and lack of long-range coherence in generated stories. We conjecture that this is because of the difficulty of associating relevant commonsense knowledge, understanding the causal relationships, and planning entities and events …

Cited by 169 publications (150 citation statements)
References 27 publications
“…One future path lies in combining language models with knowledge bases: curated databases of declarative facts. In work presented at last year's Association for Computational Linguistics meeting 9 , researchers fine-tuned GPT-2 on sentences explicitly stating facts and inferences from a compendium of common sense (for instance, if someone cooks spaghetti, that person wants to eat). As a result, it wrote short stories that were more logical.…”
Section: Seeking Common Sense (mentioning)
confidence: 99%
“…To address this defect, incorporating external commonsense knowledge to enhance models' reasoning ability has been widely explored (Lin et al., 2019; Ye et al., 2019; Lv et al., 2019). In language generation, previous work (Bhagavatula et al., 2020; Guan et al., 2020) transfers commonsense knowledge into pre-trained language models by utilizing triple information in commonsense knowledge bases such as ConceptNet (Speer and Havasi, 2012) and ATOMIC.…”
Section: ConceptNet ROC Story (mentioning)
confidence: 99%
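The citing work above describes verbalizing knowledge-base triples into plain sentences and post-training GPT-2 on them. A minimal sketch of that idea follows; the relation templates, example triple, and training loop are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch: verbalize commonsense triples (e.g., from ConceptNet or
# ATOMIC) into sentences and post-train GPT-2 on them with the usual LM loss.
# Templates and the example triple are assumptions for illustration only.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

TEMPLATES = {
    "UsedFor": "{head} is used for {tail}.",
    "Causes": "{head} causes {tail}.",
    "xWant": "if someone {head}, that person wants {tail}.",
}

def triple_to_sentence(head: str, relation: str, tail: str) -> str:
    """Turn a (head, relation, tail) triple into a natural-language sentence."""
    return TEMPLATES[relation].format(head=head, tail=tail)

triples = [("cooks spaghetti", "xWant", "to eat")]
sentences = [triple_to_sentence(*t) for t in triples]

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for text in sentences:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    # Standard causal language-modeling objective on the verbalized triple.
    loss = model(input_ids, labels=input_ids).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

After such a post-training step, the model would still be fine-tuned on the downstream story-generation data, which is the two-stage setup the next citation statement criticizes.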
“…First, recovering knowledge triples at the post-training stage (Guan et al., 2020) hardly enables the model to utilize the encoded knowledge in fine-tuning on generation tasks, which require reasoning over underlying commonsense knowledge. Second, it ignores the abundant structural relational relevance of the concepts in the knowledge graph (Guan et al., 2020; Bhagavatula et al., 2020) that may provide multiple pieces of plausible evidence for complex reasoning.…”
Section: ConceptNet ROC Story (mentioning)
confidence: 99%