DOI: 10.33540/1709

Image Captioning with External Knowledge

Abstract: This dissertation is dedicated to image captioning, the task of automatically generating a natural language description of a given image. Most modern automatic caption generators are trained to produce a straightforward visual description of what can be directly seen in the image. By contrast, a human-written caption may also include information that cannot be inferred from the image alone: references to image-external world knowledge. Exploring ways to enrich automatic image captioning with contextually relev…

citations
Cited by 1 publication
(1 citation statement)
references
References 172 publications
“…This is achieved by incorporating commonsense descriptions to BART, a powerful language generation model [110]. Geographical information guiding factual knowledge retrieval to assist IC was first explored in [111], where visual features together with the extracted facts are inserted in a Transformer encoder-decoder structure, ultimately generating the caption c.…”
Section: Image Captioning (IC), mentioning
confidence: 99%