2022
DOI: 10.48550/arxiv.2205.00423
Preprint

UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog

Abstract: Visual Dialog aims to answer multi-round, interactive questions based on the dialog history and image content. Existing methods either treat answer ranking and answer generation individually or capture the relation between the two tasks only weakly and implicitly, using two separate models. A universal framework that jointly learns to rank and generate answers in a single model has seldom been explored. In this paper, we propose a contrastive learning-based framework, UTC, to unify and facilitate both discriminative…

Cited by 0 publications
References 21 publications (40 reference statements)