Recognizing jumbled images: The role of local and global information in image classification

Parikh, Devi

doi:10.1109/iccv.2011.6126283

Cited by 32 publications

(17 citation statements)

References 26 publications


“…This suggests that one of the weaknesses of both the BoB and BoN representations is their lack of explicit encoding of the geometric relationship between different descriptor words. Similar findings have been reported in the context of local descriptor based representations of textured objects [9]. We also investigated the possibility of a simple decision level combination of the two representations.…”

Section: Discussion (supporting)

confidence: 57%

Object Matching Using Boundary Descriptors

Arandjelović

2012

Proceedings of the British Machine Vision Conference 2012


The problem of object recognition is of immense practical importance and potential, and the last decade has witnessed a number of breakthroughs in the state of the art. Most of the past object recognition work focuses on textured objects and local appearance descriptors extracted around salient points in an image. These methods fail in the matching of smooth, untextured objects for which salient point detection does not produce robust results. The recently proposed bag of boundaries (BoB) method is the first to directly address this problem. Since the texture of smooth objects is largely uninformative, BoB focuses on describing and matching objects based on their post-segmentation boundaries. Herein we address three major weaknesses of this work. The first of these is the uniform treatment of all boundary segments. Instead, we describe a method for detecting the locations and scales of salient boundary segments. Secondly, while the BoB method uses an image based elementary descriptor (HoGs + occupancy matrix), we propose a more compact descriptor based on the local profile of boundary normals' directions. Lastly, we conduct a far more systematic evaluation, both of the bag of boundaries method and the method proposed here. Using a large public database, we demonstrate that our method exhibits greater robustness while at the same time achieving a major computational saving: object representation is extracted from an image in only 6% of the time needed to extract a bag of boundaries, and the storage requirement is similarly reduced to less than 8%.



“…1, 4th row [3]. Indeed, Parikh [23] showed that a majority-vote accumulation over human classification of the individual blocks is a good predictor of human responses of the entire jumbled images. This dataset contains human performances on 3 image sets: 1) OSR, 384 outdoor scenes from the 8 categories of 8-CAT, 2) ISR, 300 indoor scenes [5] from bathroom, bedroom, dining room, gym, kitchen, living room, theater and staircase categories, and 3) CAL: Caltech objects (50 images from each of 6 categories aeroplane, car-rear, face, ketch, motorbike, and watch).…”

Section: Human Studies (mentioning)

confidence: 99%

“…Human performance on jumbled images depends on the level of image blocking [23] (here 65%). Model accuracies (trained and tested on jumbled images) are shown in Fig.…”

Section: Test 4: Local or Global Information: Recognition Of Jumbled (mentioning)

confidence: 99%

Human vs. Computer in Scene and Object Recognition

Borji, Itti

2014

2014 IEEE Conference on Computer Vision and Pattern Recognition


Several decades of research in computer and primate vision have resulted in many models (some specialized for one problem, others more general) and invaluable experimental data. Here, to help focus research efforts onto the hardest unsolved problems, and bridge computer and human vision, we define a battery of 5 tests that measure the gap between human and machine performances in several dimensions (generalization across scene categories, generalization from images to edge maps and line drawings, invariance to rotation and scaling, local/global information with jumbled images, and object recognition performance). We measure model accuracy and the correlation between model and human error patterns. Experimenting over 7 datasets, where human data is available, and gauging 14 well-established models, we find that none fully resembles humans in all aspects, and we learn from each test which models and features are more promising in approaching humans in the tested dimension. Across all tests, we find that models based on local edge histograms consistently resemble humans more, while several scene statistics or "gist" models do perform well with both scenes and objects. While computer vision has long been inspired by human vision, we believe systematic efforts, such as this, will help better identify shortcomings of models and find new paths forward.
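
The jumbled-image setup and the majority-vote accumulation mentioned in the citation statements for this entry can be illustrated with a minimal Python sketch. It is not code from either paper: the function names `jumble` and `majority_vote`, the nested-list image representation, and the `grid` and `seed` parameters are all assumptions made for illustration, and the per-block labels are assumed to come from some classifier or human study that is not shown here.

```python
import random
from collections import Counter


def jumble(image, grid, seed=None):
    """Cut `image` into grid x grid blocks and shuffle their positions.

    `image` is a list of equal-length rows (hypothetical representation);
    its height and width must both be divisible by `grid`.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image size must be divisible by grid")
    bh, bw = height // grid, width // grid

    # Extract the blocks in row-major order.
    blocks = [
        [row[c * bw:(c + 1) * bw] for row in image[r * bh:(r + 1) * bh]]
        for r in range(grid)
        for c in range(grid)
    ]

    # Shuffle block positions; pixels inside each block stay untouched.
    random.Random(seed).shuffle(blocks)

    # Stitch the shuffled blocks back into a full image.
    jumbled = []
    for r in range(grid):
        row_blocks = blocks[r * grid:(r + 1) * grid]
        for y in range(bh):
            jumbled.append([px for block in row_blocks for px in block[y]])
    return jumbled


def majority_vote(block_labels):
    """Return the most frequent label among per-block labels.

    Ties go to the label that appears first in `block_labels`.
    """
    if not block_labels:
        raise ValueError("block_labels must not be empty")
    return Counter(block_labels).most_common(1)[0][0]
```

For example, `majority_vote(["kitchen", "bedroom", "kitchen"])` returns `"kitchen"`. Because jumbling only permutes block positions, a vote over independent per-block labels gives the same result for the intact and the jumbled image.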


“…In the past years, many techniques have been proposed to address the problem of image classification [1][2][3][4][5][6][7][8]. There are two key assumptions in these algorithmic techniques: the first assumption is that images in the database are usually distributed in the Euclidean space, and the second one is that the dissimilarity-based matching is based on the pairwise measure.…”

Section: Introduction (mentioning)

confidence: 99%

Improving Image Classification Quality Via Dissimilarity Measure In Non-Euclidean Spaces

Zhu

2015

Proceedings of the 2015 International Symposium on Computers & Informatics


This paper proposes an image classification scheme by learning the dissimilarity measure in non-Euclidean spaces. Specifically, the dissimilarity representations of samples from a pseudo-Euclidean space are first constructed; then, the dissimilarity increment distribution information of each category is achieved with respect to the high-order statistics of triplet-neighbor points for each image; finally, a maximum a posteriori algorithm fused with the Gaussian Mixture Model and triplet-dissimilarity increments distribution is utilized to estimate the relevance of each image category with each input image. Experimental results conducted on a general image database demonstrate the effectiveness of the proposed scheme.
