2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.01507

Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

Cited by 22 publications (10 citation statements)
References: 42 publications
“…Pseudo-query generation is critical in zero-shot localization methods, although limited work has been done in this direction. Nam et al (2021) introduce pseudo-query generation for video localization, and subsequently, Jiang et al (2022) for language grounding in images. Nam et al (2021) consider a pseudo-query to be an unordered list of nouns and verbs, obtained from an off-the-shelf object detector and a fine-tuned language model (LM) that predicts the most probable verbs conditioned on the nouns.…”
Section: Weakly Supervised and Zero-shot NLVL Methods (mentioning)
confidence: 99%
“…These fully supervised REC, however, depends on large annotated datasets. Weakly supervised methods (Liu et al, 2019; Sun et al, 2021) don't require manually annotated bounding boxes and unsupervised methods (Jiang et al, 2022) that require neither manually annotated bounding boxes nor queries have also been studied. Pseudo-Q (Jiang et al, 2022) proposed a method for generating pseudo queries with objects, attributes, and spatial relationships as key components, outperforming the weakly supervised methods.…”
Section: Referring Expression Comprehension (mentioning)
confidence: 99%
“…The task of Referring Expression Comprehension (ReC) plays a crucial role in applications such as robot navigation and visual question answering. ReC methods can be roughly classified into three types: fully supervised (Deng et al, 2021), weakly supervised (Gupta et al, 2020; Liu et al, 2019a; Sun et al, 2021), and unsupervised (Jiang et al, 2022; Subramanian et al, 2022; Wang and Specia, 2019; Yeh et al, 2018).…”
Section: Referring Expression Comprehension (mentioning)
confidence: 99%
“…To address the annotation challenges, several works (Yeh et al, 2018; Wang and Specia, 2019; Jiang et al, 2022) have explored unsupervised approaches that do not rely on paired annotations. Nonetheless, these approaches either employ statistical hypothesis testing, make simple assumptions, or only investigate the shallow relation between objects, leading to poor performance in complex scenes.…”
Section: Introduction (mentioning)
confidence: 99%