Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015
DOI: 10.3115/v1/n15-1053
Déjà Image-Captions: A Corpus of Expressive Descriptions in Repetition

Abstract: We present a new approach to harvesting a large-scale, high-quality image-caption corpus that makes better use of already existing web data with no additional human effort. The key idea is to focus on Déjà Image-Captions: naturally existing image descriptions that are repeated almost verbatim, by more than one individual for different images. The resulting corpus provides an association structure between 4 million images and 180K unique captions, capturing a rich spectrum of everyday narratives including figu…

Cited by 26 publications (26 citation statements)
References 27 publications
“…An extensive overview of the datasets available for image captioning is provided by [3]. The three biggest datasets are MS COCO [17], SBU1M Captions [20], and Déjà Image-Captions [4]. Work done by [14] and [29] has achieved state-of-the-art results in image captioning.…”
Section: Description of Images-in-Isolation (mentioning)
confidence: 99%
“…• Déjà Images Dataset (Chen et al., 2015) consists of 180K unique user-generated captions associated with 4M Flickr images, where one caption is aligned with multiple images. This dataset was collected by querying Flickr for 693 high-frequency nouns, then further filtered to captions that contain at least one verb and were judged as "good" captions by workers on Amazon's Mechanical Turk (Turkers).…”
Section: User-Generated Captions (mentioning)
confidence: 99%
“…Moreover, there are studies collecting paraphrases from captions to videos (Chen and Dolan, 2011) and images (Chen et al., 2015). One advantage of leveraging crowdsourcing is that annotation is done inexpensively, but it requires careful task design to gather valid data from non-expert annotators.…”
Section: Related Work (mentioning)
confidence: 99%