Proceedings of the 30th ACM International Conference on Multimedia 2022
DOI: 10.1145/3503161.3547776

Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog

Cited by 6 publications (1 citation statement)
References 39 publications
“…This dual focus has enriched the understanding of image-text dynamics. Furthermore, the alignment-based approach model (Chen et al 2022) has shown promise in explicitly aligning visual concepts with textual semantics via unsupervised and pseudo-supervised vision-language alignment. Another intriguing approach (Chen et al 2021; Guo et al 2020; Zhang et al 2022b; Zheng et al 2019) is the graph-based representation suitable for the composite scenario of dialog history and image, which offers a structured way to understand relationships within an image.…”
Section: Visual Dialog
Citation type: mentioning
confidence: 99%