CVPR 2011
DOI: 10.1109/cvpr.2011.5995711

Recognition using visual phrases

Cited by 378 publications (371 citation statements: 5 supporting, 366 mentioning, 0 contrasting)
References 11 publications
“…In the second experiment, we learn object and attribute classifiers jointly and predict object-attribute pairs (e.g. predicting that an apple is red), as in Sadeghi and Farhadi (2011).…”
Section: Methods (mentioning; confidence: 99%)

“…Related Work Comparison It is also worth mentioning in this section some prior work on relationships. The concept of visual relationships has already been explored in Visual Phrases (Sadeghi and Farhadi 2011), who introduced a dataset of 17 such relationships such as next_to (person, bike) and riding (person, horse). However, their dataset is limited to just these 17 relationships.…”
Section: Top Relationship Distributions (mentioning; confidence: 99%)

“…Each subcategory has reduced appearance diversity (via improved alignment), leading to a simpler learning problem. The recent success of the discriminatively-trained mixture model framework of Felzenszwalb et al, [8] has led to the wide popularity of such models for object detection [14,17,18,20,23]. Applying such model to the four images in Figure 1(a) would likely result in each being assigned to a separate subcategory and trained with others of its kind.…”
Section: Introduction (mentioning; confidence: 99%)

“…Gupta et al [3] use the AND-OR graph formalism to represent spatiotemporal relations among objects and actions in videos. Sadeghi and Farhadi [4] examine the scale of unit at which to categorize objects, and develop a notion of visual phrases for jointly recognizing co-occurring objects. Farhadi et al [5] develop image models that indicate the presence of object, action, scene triplets.…”
Section: Introduction (mentioning; confidence: 99%)