Proceedings of the 24th Conference on Computational Natural Language Learning 2020
DOI: 10.18653/v1/2020.conll-1.34

Diverse and Relevant Visual Storytelling with Scene Graph Embeddings

Abstract: A problem in automatically generated stories for image sequences is that they use overly generic vocabulary and phrase structure and fail to match the distributional characteristics of human-generated text. We address this problem by introducing explicit representations for objects and their relations by extracting scene graphs from the images. Utilizing an embedding of this scene graph enables our model to more explicitly reason over objects and their relations during story generation, compared to the global …

Cited by 12 publications (4 citation statements)
References 36 publications
“…Other forms of language where phonemic and phonetic information are essential have also been generated, including rap (Xue et al 2021; Manjavacas, Kestemont, and Karsdorp 2019; Potash, Romanov, and Rumshisky 2018) and song lyrics more generally (Tian et al 2023; Chang et al 2023; Zhang et al 2022). Computational research on creative works is not restricted to such domains, however, with extensive work also existing in the area of narrative generation (Hong et al 2023; Tang et al 2022; Chen et al 2021), humor generation (Loakman, Maladry, and Lin 2023; Sun et al 2022; Tian, Sheth, and Peng 2022; He, Peng, and Liang 2019), metaphor processing (Wang et al 2023; Li et al 2023a; Li, Guerin, and Lin 2022), and music generation (Li et al 2024; Yu et al 2023).…”
Section: Creative Language Generation (mentioning)
confidence: 99%
“…More recent visual storytelling approaches (Xu et al 2021; Chen et al 2021; Hsu et al 2020; Yang et al 2019) introduce external knowledge to give models the necessary commonsense to reason. Sometimes, scene graphs are used to model the relations between objects (Lu et al 2016; Hong et al 2020; Wang et al 2020). However, none of the existing approaches explicitly consider character information - characters are just treated like other objects.…”
Section: Related Work (mentioning)
confidence: 99%
“…Motivation for the present work is also provided by recent research exploring the visual correlates of inferences, temporal and causal relationships (e.g., Park et al, 2020), which also have implications for generation. In visual storytelling, for instance, a model has to understand actions and interactions among the visually depicted entities (Huang et al, 2016; Lukin et al, 2018; Hong et al, 2023). Identifying actions is a prerequisite for predicting their motivations or rationales as well as explaining automatically generated descriptions of images (Hendricks et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%