Proceedings of the British Machine Vision Conference 2013
DOI: 10.5244/c.27.25

Exploring SVM for Image Annotation in Presence of Confusing Labels

Abstract: We address the problem of automatic image annotation in large vocabulary datasets. In such datasets, for a given label, there could be several other labels that act as its confusing labels. Three possible factors for this are (i) incomplete-labeling ("cars" vs. "vehicle"), (ii) label-ambiguity ("flowers" vs. "blooms"), and (iii) structural-overlap ("lion" vs. "tiger"). While previous studies in this domain have mostly focused on nearest-neighbour based models, we show that even the conventional one-vs-rest SVM…
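As a rough illustration of the conventional one-vs-rest SVM setup the abstract refers to (a minimal sketch assuming precomputed feature vectors and a binary image-by-label tag matrix; scikit-learn's LinearSVC and all names below are illustrative choices, not the paper's implementation): one binary SVM is trained per label, a test image is scored by every model, and the top-5 labels form the annotation. Under incomplete labeling, an image tagged only "vehicle" lands in the negative set of "cars", which is exactly the kind of confusing-label effect the paper studies.

import numpy as np
from sklearn.svm import LinearSVC

# Illustrative sketch of a conventional one-vs-rest SVM annotator, not the paper's code.
def train_one_vs_rest(features, tag_matrix):
    """features: (n_images, d) array; tag_matrix: (n_images, n_labels) 0/1 array."""
    models = []
    for j in range(tag_matrix.shape[1]):
        y = tag_matrix[:, j]          # positives: images carrying label j
        clf = LinearSVC(C=1.0)        # negatives: every other image, including
        clf.fit(features, y)          # incompletely labeled ones ("vehicle" but not "cars")
        models.append(clf)
    return models

def annotate(models, x, n=5):
    """Score one test image against every per-label SVM and keep the n best labels."""
    scores = np.array([m.decision_function(x.reshape(1, -1))[0] for m in models])
    return np.argsort(-scores)[:n]    # indices of the n highest-scoring labels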

Cited by 56 publications (40 citation statements); References 21 publications.
“…For a method to be practical for such databases, it has to rely on minimal training, as the addition of new images and tags can render the learned models less effective over time. This holds true both for methods that learn a direct mapping from features to tags [38,3] and for those that learn tag-specific discriminative models [15,30,34], where the positive set contains images that carry a particular tag and the negative set contains images that do not. Obviously, as new images and tags are introduced into the database, the positive set for each tag will change, requiring retraining of the models.…”
Section: Introduction
confidence: 99%
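The per-tag positive/negative split described in this excerpt can be sketched directly (hypothetical names, plain Python, for illustration only); it also makes the retraining issue visible: any newly added image or tag changes the split for the affected tags.

# Illustrative sketch of per-tag training-set construction, not from any cited paper.
def split_per_tag(image_tags, tag):
    """image_tags: dict mapping image id -> set of tags currently assigned to it."""
    positives = [img for img, tags in image_tags.items() if tag in tags]
    negatives = [img for img, tags in image_tags.items() if tag not in tags]
    return positives, negatives

image_tags = {"img1": {"cars", "road"}, "img2": {"flowers"}}
pos, neg = split_per_tag(image_tags, "cars")   # pos = ["img1"], neg = ["img2"]
image_tags["img3"] = {"cars", "sky"}           # a new image arrives
pos, neg = split_per_tag(image_tags, "cars")   # the positive set for "cars" has changed,
                                               # so the tag-specific model must be retrained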
“…Thus each image is annotated with the n most relevant labels (usually, as in this paper, the results are obtained using n = 5). Then, the results are reported as mean precision P and mean recall R over the ground-truth labels; N+ is often used to denote the number of labels with non-zero recall value. Note that each image is forced to be annotated with n labels, even if the image has fewer or more labels in the ground truth.…”
(Methods compared in this excerpt's accompanying results table: CRM [14], InfNet [19], NPDE [27], MBRM [4], SML [2], TGLM [17], GS [28], JEC-15 [9], TagProp σRK [9], TagProp σSD [9], RF-opt [5], KSVM-VT [26], 2PKNN [25], TagProp σML [9], 2PKNN-ML [25].)
Section: Evaluation Measures
confidence: 99%
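A small sketch of how these measures are typically computed (under common conventions; not code from any of the cited papers): per-label precision and recall are obtained from the forced top-n annotations, averaged over the vocabulary to give P and R, and N+ counts the labels recalled at least once.

import numpy as np

# Illustrative sketch of the standard annotation metrics, assuming 0/1 indicator matrices.
def annotation_metrics(pred, truth):
    """pred, truth: (n_images, n_labels) 0/1 arrays; pred has exactly n ones per row."""
    tp = (pred & truth).sum(axis=0).astype(float)    # per-label true positives
    predicted = pred.sum(axis=0)                     # how often each label was predicted
    relevant = truth.sum(axis=0)                     # how often each label is in the ground truth
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, relevant, out=np.zeros_like(tp), where=relevant > 0)
    P, R = precision.mean(), recall.mean()           # mean over all labels in the vocabulary
    n_plus = int((recall > 0).sum())                 # labels recalled at least once (N+)
    return P, R, n_plus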
“…Discriminative models such as support vector machines have also been proposed [7,26]. These methods learn a classifier for each label, and use them to predict whether a test image belongs to the class of images that are annotated with a particular label.…”
Section: Related Work
confidence: 99%
“…Discriminative models such as SML treated multi-labeling as a multi-class problem [1], but this suffers from class imbalance (insufficient training samples per label) and considerable overlap among class-specific distributions. Recently, an SVM-based model [14] proposed by Verma and Jawahar modified the SVM hinge loss function in order to handle confusing labels. But in our approach, we show that we are able to get better results without any modifications to the SVM model.…”
Section: Related Work
confidence: 99%
“…One approach to retrieve or manage such large quantities of images/videos is to automatically annotate each test image with multiple keywords by training a statistical model on a labeled training set. Researchers have tried to address this problem using either a discriminative model [14,1] or a generative model [9,5,7]. Each of these techniques has its own advantages and disadvantages.…”
Section: Introduction
confidence: 99%