Proceedings of the 30th ACM International Conference on Multimedia 2022
DOI: 10.1145/3503161.3547798

On Leveraging Variational Graph Embeddings for Open World Compositional Zero-Shot Learning

Cited by 3 publications (2 citation statements)
References 23 publications
“…For example, Nagarajan et al. [26] build a composition space by simulating all the visual changes of attributes performed on objects. Anwaar et al. [1] improve composition learning by building a composition graph. Recent approaches [19,27,38], rooted in Vision-Language Models (VLM), also adopt either of the two strategies, utilizing pre-trained VLM encoders to better encode and align images and texts.…”
Section: Related Work (mentioning)
confidence: 99%
“…To compose the seen primitives into unseen compositions, two challenges must be considered. Firstly, there are semantic entanglements between objects and attributes (Atzmon et al. 2021; Anwaar, Pan, and Kleinsteuber 2022). For an image labeled as ancient-building, it is hard to tell which visual features can be captured as a building, and which, as ancient.…”
Section: Introduction (mentioning)
confidence: 99%