Proceedings of the British Machine Vision Conference 2013
DOI: 10.5244/c.27.25

Exploring SVM for Image Annotation in Presence of Confusing Labels

Abstract: We address the problem of automatic image annotation in large vocabulary datasets. In such datasets, for a given label, there could be several other labels that act as its confusing labels. Three possible factors for this are (i) incomplete-labeling ("cars" vs. "vehicle"), (ii) label-ambiguity ("flowers" vs. "blooms"), and (iii) structural-overlap ("lion" vs. "tiger"). While previous studies in this domain have mostly focused on nearest-neighbour based models, we show that even the conventional one-vs-rest SVM…
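As a rough illustration of the conventional one-vs-rest SVM setup the abstract refers to (a minimal sketch assuming precomputed feature vectors and a binary image-by-label tag matrix; scikit-learn's LinearSVC and all names below are illustrative choices, not the paper's implementation): one binary SVM is trained per label, a test image is scored by every model, and the top-5 labels form the annotation. Under incomplete labeling, an image tagged only "vehicle" lands in the negative set of "cars", which is exactly the kind of confusing-label effect the paper studies.

import numpy as np
from sklearn.svm import LinearSVC

# Illustrative sketch of a conventional one-vs-rest SVM annotator, not the paper's code.
def train_one_vs_rest(features, tag_matrix):
    """features: (n_images, d) array; tag_matrix: (n_images, n_labels) 0/1 array."""
    models = []
    for j in range(tag_matrix.shape[1]):
        y = tag_matrix[:, j]          # positives: images carrying label j
        clf = LinearSVC(C=1.0)        # negatives: every other image, including
        clf.fit(features, y)          # incompletely labeled ones ("vehicle" but not "cars")
        models.append(clf)
    return models

def annotate(models, x, n=5):
    """Score one test image against every per-label SVM and keep the n best labels."""
    scores = np.array([m.decision_function(x.reshape(1, -1))[0] for m in models])
    return np.argsort(-scores)[:n]    # indices of the n highest-scoring labels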

Cited by 56 publications (40 citation statements); References 21 publications.
“…For a method to be practical for such databases, it has to rely on minimal training, as the addition of new images and tags can render the learned models less effective over time. This holds true both for methods that learn a direct mapping from features to tags [38,3] and for those that learn tag-specific discriminative models [15,30,34], where the positive set contains images that carry a particular tag and the negative set contains images that do not. Obviously, as new images and tags are introduced into the database, the positive set for each tag will change, requiring retraining of the models.…”
Section: Introduction
confidence: 99%
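The per-tag positive/negative split described in this excerpt can be sketched directly (hypothetical names, plain Python, for illustration only); it also makes the retraining issue visible: any newly added image or tag changes the split for the affected tags.

# Illustrative sketch of per-tag training-set construction, not from any cited paper.
def split_per_tag(image_tags, tag):
    """image_tags: dict mapping image id -> set of tags currently assigned to it."""
    positives = [img for img, tags in image_tags.items() if tag in tags]
    negatives = [img for img, tags in image_tags.items() if tag not in tags]
    return positives, negatives

image_tags = {"img1": {"cars", "road"}, "img2": {"flowers"}}
pos, neg = split_per_tag(image_tags, "cars")   # pos = ["img1"], neg = ["img2"]
image_tags["img3"] = {"cars", "sky"}           # a new image arrives
pos, neg = split_per_tag(image_tags, "cars")   # the positive set for "cars" has changed,
                                               # so the tag-specific model must be retrained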
“…Thus each image is annotated with the n most relevant labels (usually, as in this paper, the results are obtained using n = 5). Then, the results are reported as mean precision P and mean recall R over the ground-truth labels; N+ is often used to denote the number of labels with non-zero recall value. Note that each image is forced to be annotated with n labels, even if the image has fewer or more labels in the ground truth.…”
(Methods compared in this excerpt's accompanying results table: CRM [14], InfNet [19], NPDE [27], MBRM [4], SML [2], TGLM [17], GS [28], JEC-15 [9], TagProp σRK [9], TagProp σSD [9], RF-opt [5], KSVM-VT [26], 2PKNN [25], TagProp σML [9], 2PKNN-ML [25].)
Section: Evaluation Measures
confidence: 99%
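A small sketch of how these measures are typically computed (under common conventions; not code from any of the cited papers): per-label precision and recall are obtained from the forced top-n annotations, averaged over the vocabulary to give P and R, and N+ counts the labels recalled at least once.

import numpy as np

# Illustrative sketch of the standard annotation metrics, assuming 0/1 indicator matrices.
def annotation_metrics(pred, truth):
    """pred, truth: (n_images, n_labels) 0/1 arrays; pred has exactly n ones per row."""
    tp = (pred & truth).sum(axis=0).astype(float)    # per-label true positives
    predicted = pred.sum(axis=0)                     # how often each label was predicted
    relevant = truth.sum(axis=0)                     # how often each label is in the ground truth
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, relevant, out=np.zeros_like(tp), where=relevant > 0)
    P, R = precision.mean(), recall.mean()           # mean over all labels in the vocabulary
    n_plus = int((recall > 0).sum())                 # labels recalled at least once (N+)
    return P, R, n_plus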
“…Discriminative models such as support vector machines have also been proposed [7,26]. These methods learn a classifier for each label, and use them to predict whether a test image belongs to the class of images that are annotated with a particular label.…”
Section: Related Work
confidence: 99%
“…Discriminative models such as SML treated multi-labeling as a multi-class problem [1], but this suffers from class imbalance (insufficient training samples per label) and considerable overlap among class-specific distributions. Recently, an SVM-based model [14] proposed by Verma and Jawahar modified the SVM hinge loss function in order to handle confusing labels. But in our approach, we show that we are able to get better results without any modifications to the SVM model.…”
Section: Related Work
confidence: 99%
“…One approach to retrieve or manage such large quantities of images/videos is to automatically annotate each test image with multiple keywords by training a statistical model on a labeled training set. Researchers have tried to address this problem using either a discriminative model [14,1] or a generative model [9,5,7]. Each of these techniques has its own advantages and disadvantages.…”
Section: Introduction
confidence: 99%