2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.766

ViP-CNN: Visual Phrase Guided Convolutional Neural Network

Abstract: As the intermediate level task connecting image captioning and object detection, visual relationship detection started to catch researchers' attention because of its descriptive power and clear structure. It detects the objects and captures their pair-wise interactions with a subject-predicate-object triplet, e.g. person-ride-horse. In this paper, each visual relationship is considered as a phrase with three components. We formulate the visual relationship detection as three inter-connected recognition problem…

Citations: cited by 227 publications (183 citation statements)
References: 58 publications (115 reference statements)
“…The Visual Genome (VG) dataset is one of the largest relationship detection datasets. We note that there are multiple versions of VG datasets [20,33,34,37]. In this paper, we use the pruned version of the VG dataset provided by [37].…”
Section: Datasets Evaluation Tasks and Metrics (mentioning)
confidence: 99%
“…In the object-pairs proposing stage, [16] proposes a triplet proposal with NMS, based on the product of objectiveness scores, to remove redundant object-pairs. However, there exists a gap between higher objectiveness scores and more meaningful object-pairs obviously.…”
Section: Related Work (mentioning)
confidence: 99%
“…where P is the objectiveness score from object detection module. Inspired by greedy NMS [11] and triplet NMS [16], shown in Algorithm 1, object-pairs proposing scheme is based on rating scores and improved NMS (i-NMS).…”
Section: Rating Scores and i-NMS Based Object-pair Proposing (mentioning)
confidence: 99%
“…The task of Visual Relationship Detection has been the main focus of several recent works (Lu et al, 2016; Li et al, 2017a; Zhang et al, 2017a; Dai et al, 2017; Hu et al, 2017; Liang et al, 2017; Yin et al, 2018). The goal is to detect a generic <subject, predicate, object> triplet present in an image.…”
Section: Related Work (mentioning)
confidence: 99%