2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv48630.2021.00118

Compositional Learning of Image-Text Query for Image Retrieval

Cited by 58 publications (24 citation statements)
References 14 publications
“…The proposed methodology shares common bases with the studies in [1], [25], and [47]: the query inputs, the use of neural networks, and evaluation on the Fashion 200K dataset.…”
Section: Methods (mentioning)
confidence: 99%
“…In [47], an autoencoder called ComposeAE composes the multi-modal query features. Image features were extracted using a ResNet-17 CNN.…”
Section: A Comparative Study of Compositional Methods (mentioning)
confidence: 99%
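To make the ComposeAE mention above concrete, here is a minimal sketch of composing an image-text query into a single retrieval embedding. It is a simplification, not the paper's implementation: the actual ComposeAE maps the image-text pair into a complex space and trains with autoencoder-style reconstruction objectives. The `ComposeSketch` module, its dimensions, and the plain MLP composition below are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class ComposeSketch(nn.Module):
    """Hypothetical, simplified ComposeAE-style composition module:
    encode the reference image, project the text embedding, and fuse
    both into one query embedding for nearest-neighbor retrieval."""

    def __init__(self, text_dim=768, embed_dim=512):
        super().__init__()
        backbone = models.resnet18(weights=None)  # ResNet-family image encoder
        backbone.fc = nn.Identity()               # drop the classifier head
        self.image_encoder = backbone             # outputs 512-d features
        self.text_proj = nn.Linear(text_dim, embed_dim)
        # Stand-in for the learned composition (the real model uses a
        # complex-space mapping with reconstruction losses).
        self.compose = nn.Sequential(
            nn.Linear(512 + embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, image, text_emb):
        img_feat = self.image_encoder(image)           # (B, 512)
        txt_feat = self.text_proj(text_emb)            # (B, embed_dim)
        query = self.compose(torch.cat([img_feat, txt_feat], dim=1))
        return nn.functional.normalize(query, dim=1)   # unit norm for retrieval

# Toy usage: compose two image-text queries from random inputs.
model = ComposeSketch()
q = model(torch.randn(2, 3, 224, 224), torch.randn(2, 768))
print(q.shape)  # torch.Size([2, 512])
```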
“…PerVL arises in various scenarios. In image retrieval, a user may tag a few of their images and wish to retrieve other photos of that concept in a specific visual context (Chen et al., 2020; Anwaar et al., 2021); in human-robot interaction, a worker may show a specific tool to a robotic arm and instruct how to use it (Wang et al., 2022; Lynch & Sermanet, 2020); in video security applications, an operator may search for one specific known item in the context of other items or people described using language.…”
Section: Introduction (mentioning)
confidence: 99%
“…MIT-States [9] contains 63,440 images and 245 object types; each object type is described by an average of 9 adjectives. These adjectives emphasize the state of the object and the state transformations between images of similar objects, such as "old" and "new". According to the citations, most of the works exploiting this dataset are image retrieval models [28,29,30], which use it as a benchmark to compare against other models on R@k.…”
Section: Introduction (mentioning)
confidence: 99%
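The R@k (Recall at k) metric referenced in that excerpt is straightforward to state in code: the fraction of queries whose ground-truth target appears among the k nearest gallery items. Below is a minimal sketch assuming cosine similarity on unit-normalized embeddings; the function name `recall_at_k` and the tensor shapes are illustrative assumptions, not an API from any of the cited works.

```python
import torch

def recall_at_k(query_emb, gallery_emb, target_idx, k=10):
    """Fraction of queries whose correct gallery item ranks in the top k.

    query_emb:   (Q, D) composed query embeddings
    gallery_emb: (G, D) candidate image embeddings
    target_idx:  (Q,)   index of each query's correct gallery item
    """
    q = torch.nn.functional.normalize(query_emb, dim=1)
    g = torch.nn.functional.normalize(gallery_emb, dim=1)
    sims = q @ g.T                              # cosine similarities, (Q, G)
    topk = sims.topk(k, dim=1).indices          # k best candidates per query
    hits = (topk == target_idx.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()

# Toy usage with random embeddings: 5 queries against a 100-item gallery.
r10 = recall_at_k(torch.randn(5, 512), torch.randn(100, 512),
                  torch.randint(0, 100, (5,)), k=10)
print(f"R@10 = {r10:.2f}")
```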