2021
DOI: 10.48550/arxiv.2105.14445
Preprint
Modeling Text-visual Mutual Dependency for Multi-modal Dialog Generation

Abstract: Multi-modal dialog modeling is of growing interest. In this work, we propose frameworks to resolve a specific case of multi-modal dialog generation that better mimics multi-modal dialog generation in the real world, where each dialog turn is associated with the visual context in which it takes place. Specifically, we propose to model the mutual dependency between text-visual features, where the model not only needs to learn the probability of generating the next dialog utterance given preceding dialog utteranc…

Cited by 3 publications (1 citation statement)
References 72 publications
“…For two sentences of the same meaning, the probability of generating contexts given the two sentences should also be the same, which corresponds to the backward probability from sentences to contexts. This is akin to the bi-directional mutual-information based generation strategy (Fang et al., 2015; Li et al., 2016a; Li and Jurafsky, 2016; Wang et al., 2021). The backward probability can be modeled by predicting preceding contexts given subsequent contexts, p(c_{<i} | c_i, c_{>i}), and by predicting subsequent contexts given preceding contexts, p(c_{>i} | c_{<i}, c_i).…”
Section: Training Context-LM
Confidence: 99%
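The bi-directional mutual-information strategy referenced in the citation statement is commonly realized as a reranking objective that interpolates a forward model p(response | context) with a backward model p(context | response). The sketch below is a minimal, hypothetical illustration of that scoring rule, not the paper's implementation; the candidate strings, probabilities, and the weight `lam` are all made up for demonstration.

```python
import math

def mmi_score(log_p_forward: float, log_p_backward: float, lam: float = 0.5) -> float:
    """MMI-style interpolation of forward and backward log-probabilities.

    log_p_forward  ~ log p(response | context)
    log_p_backward ~ log p(context | response)
    lam            ~ interpolation weight (hypothetical value)
    """
    return (1 - lam) * log_p_forward + lam * log_p_backward

# Rerank candidate responses: a generic reply may have high forward
# probability but a low backward probability (it explains the context
# poorly), so the MMI score demotes it.
candidates = [
    # (response, log p(r|c), log p(c|r)) -- toy numbers
    ("sure, the museum opens at nine", math.log(0.30), math.log(0.20)),
    ("i don't know",                   math.log(0.40), math.log(0.05)),
]
best = max(candidates, key=lambda c: mmi_score(c[1], c[2]))
print(best[0])  # → sure, the museum opens at nine
```

Here the bland reply wins under the forward model alone but loses once the backward term is added, which is exactly the effect the bi-directional objective is meant to have.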