2019
DOI: 10.48550/arxiv.1909.03396
Preprint

Quality Estimation for Image Captions Based on Large-scale Human Evaluations

Abstract: Automatic image captioning has improved significantly in the last few years, but the problem is far from being solved. Furthermore, while the standard automatic metrics, such as CIDEr and SPICE (Vedantam et al., 2015; Anderson et al., 2016), can be used for model selection, they cannot be used at inference time given a previously unseen image since they require ground-truth references. In this paper, we focus on the related problem called Quality Estimation (QE) of image-captions. In contrast to automatic metric…

Cited by 2 publications (3 citation statements)
References 20 publications
“…ActivityNet Captions [16], MS COCO [26], MSR-VTT [45], Flickr30k Denotations [47], SBU [31], A2D [44], Visual Genome [17], Conceptual Captions [34], Charades [36], Charades-Ego [35], OID [21], TGIF [24], ActivityNet-Entities [49]…”
Section: A Appendix: Dataset Construction A1 Datasets Used For Long S... (mentioning)
confidence: 99%
“…Finally, we perform an experiment to understand the extent to which the quality of the Stabilizer outputs is correlated with the quality of the target-language Captions, so that a QE model (Levinboim et al, 2019) on the Stabilizer outputs). To that end, we perform human evaluations of stand-alone captions.…”
Section: Stabilizers Used For Quality Estimation (mentioning)
confidence: 99%
“…There is a final additional advantage to having PLuGS models as a solution: in real-world applications of image captioning, quality estimation of the resulting captions is an important component that has recently received attention (Levinboim et al, 2019). Again, labeled data for quality-estimation (QE) is only available for English, and generating it separately for other languages of interest is expensive, time-consuming, and scales poorly.…”
Section: Introduction (mentioning)
confidence: 99%