2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52729.2023.00670
Cross-Domain Image Captioning with Discriminative Finetuning

Roberto Dessì,
Michele Bevilacqua,
Eleonora Gualdoni
et al.

Abstract: Neural captioners are typically trained to mimic human-generated references without optimizing for any specific communication goal, leading to problems such as the generation of vague captions. In this paper, we show that fine-tuning an out-of-the-box neural captioner with a self-supervised discriminative communication objective helps to recover a plain, visually descriptive language that is more informative about image contents. Given a target image, the system must learn to produce a description that enables a…
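The abstract describes fine-tuning a captioner so that a frozen text-conditioned image retriever can identify the target image among a set of candidates. As a rough illustration of that idea (a minimal sketch, not the paper's released code), the snippet below assumes a hypothetical `captioner` that can sample captions together with their log-probabilities and a hypothetical `retriever` (e.g. a CLIP-style dual encoder) that scores caption-image pairs; the reward is simply whether the target image wins the in-batch retrieval, optimized with a REINFORCE-style policy gradient and a mean baseline.

```python
# Hypothetical sketch of discriminative fine-tuning with a retrieval reward.
# `captioner` and `retriever` are stand-ins, not the paper's exact models:
# any autoregressive captioner that can sample captions and return summed
# per-token log-probabilities, and any frozen retriever that scores
# (caption, image) pairs.
import torch

def discriminative_finetune_step(captioner, retriever, images, optimizer):
    """One REINFORCE-style update: reward a sampled caption if the frozen
    retriever ranks the target image first among the batch candidates."""
    # Sample one caption per image; log_probs has shape (B,).
    captions, log_probs = captioner.sample(images)

    with torch.no_grad():
        # Score every caption against every image in the batch: (B, B).
        scores = retriever.score(captions, images)
        # Reward 1 if the caption retrieves its own (target) image.
        reward = (scores.argmax(dim=1) == torch.arange(len(images))).float()
        # Batch-mean baseline to reduce gradient variance.
        advantage = reward - reward.mean()

    # Policy-gradient loss: increase the log-probability of captions
    # that succeed at the discriminative retrieval game.
    loss = -(advantage * log_probs).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.mean().item()
```

The retriever stays frozen throughout, so the only learning signal for the captioner is retrieval success, which is what pushes it toward plainer, more visually discriminative descriptions.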

Cited by 3 publications (1 citation statement). References 35 publications (65 reference statements).
“…One possible future direction for research is to explore how Mac-EmCom can be integrated with more traditional approaches to NLP, such as rule-based or statistical methods. For example, researchers could investigate how Mac-EmCom could be used as a pretraining regime for large language models, allowing them to learn faster and with less data, similar to what has been done by [115], [164], [165].…”
Section: Open Challenges (confidence: 99%)