2014
DOI: 10.1007/978-3-319-10584-0_29

ConceptMap: Mining Noisy Web Data for Concept Learning

Abstract: We attack the problem of learning concepts automatically from noisy Web image search results. The idea is based on discovering common characteristics shared among subsets of images by posing a method that is able to organise the data while eliminating irrelevant instances. We propose a novel clustering and outlier detection method, namely Concept Map (CMAP). Given an image collection returned for a concept query, CMAP provides clusters pruned from outliers. Each cluster is used to train a model repre…

Cited by 18 publications (12 citation statements)
References: 51 publications
“…There have been early efforts on automatically training visual classifiers using web data to build large datasets automatically [26,39,42,27,20], find iconic images [5,35] and improve image retrieval results [19,28,45,34]. Inspired by the success of mixture models for object detection [18], some recent approaches such as [9,17] have also explored clustering web data and training detection models.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our research is closely related to the recent work on visual data collection from web images [42,3,8,14] or weakly annotated videos [2]. Their goal is to collect training images from the Internet with minimum human supervision, but for predefined concepts.…”
Section: Related Work (mentioning)
confidence: 99%
“…Computer vision researchers have long collected image examples of manually selected visual concepts, and used them to train concept detectors. For example, ImageNet [6] selects 21,841 synsets in WordNet as the visual concepts, and has by far collected 14,197,122 images in total. One limitation of the manually selected concepts is that their visual detectors often fail to capture the complexity of the visual world, and cannot adapt to different domains.…”
Section: Introduction (mentioning)
confidence: 99%
“…With web data retrieved for a given concept, there will also be common characteristics shared among subsets of data. Therefore, rather than hard clustering data into a specific number of subsets as some approaches which also aim to deal with intra-class variations in concepts [17], [23], we use hierarchy clustering which allows different clusters to share the same instances. We adopt OPTICS ("Ordering Points To Identify the Clustering Structure") [9] to find clusters.…”
Section: Shot Clustering (mentioning)
confidence: 99%
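
The last statement above contrasts hard clustering into a fixed number of subsets with an OPTICS-based hierarchy in which clusters can share instances. The following is a minimal sketch of that idea, not the citing paper's implementation and not CMAP itself. It assumes a simplified OPTICS with an unbounded neighbourhood radius and a DBSCAN-style threshold cut on the reachability ordering; the names `optics_order`, `extract_clusters`, `min_pts`, and the toy 2-D points are all hypothetical and chosen only for illustration.

```python
import math


def optics_order(points, min_pts):
    """Simplified OPTICS (assumption: unbounded neighbourhood radius).

    Returns the visiting order plus reachability and core distances.
    Core distance here is the distance to the min_pts-th nearest other point.
    """
    n = len(points)
    reach = [math.inf] * n
    core = [math.inf] * n
    done = [False] * n
    order = []

    def dist(i, j):
        return math.dist(points[i], points[j])

    for i in range(n):
        others = sorted(dist(i, j) for j in range(n) if j != i)
        if len(others) >= min_pts:
            core[i] = others[min_pts - 1]

    def update(p, seeds):
        # Lower the reachability of every unprocessed point as seen from p.
        if math.isinf(core[p]):
            return
        for o in range(n):
            if done[o] or o == p:
                continue
            new = max(core[p], dist(p, o))
            if new < reach[o]:
                reach[o] = new
                seeds[o] = new

    for start in range(n):
        if done[start]:
            continue
        done[start] = True
        order.append(start)
        seeds = {}
        update(start, seeds)
        while seeds:
            # Always expand the point with the smallest reachability next.
            nxt = min(seeds, key=seeds.get)
            del seeds[nxt]
            done[nxt] = True
            order.append(nxt)
            update(nxt, seeds)
    return order, reach, core


def extract_clusters(order, reach, core, threshold):
    """Cut the ordering at a reachability threshold; -1 marks outliers."""
    labels = [-1] * len(order)
    cluster = -1
    for i in order:
        if reach[i] > threshold:
            if core[i] <= threshold:
                cluster += 1          # a new dense region starts here
                labels[i] = cluster
        else:
            labels[i] = cluster       # still inside the current region
    return labels


# Toy data: two tight groups and one far-away point (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
order, reach, core = optics_order(points, min_pts=2)
fine = extract_clusters(order, reach, core, threshold=2.0)
coarse = extract_clusters(order, reach, core, threshold=20.0)
print(fine)
print(coarse)
```

Cutting the same ordering at two thresholds gives nested clusterings: the two tight groups found at the fine level both belong to the single cluster found at the coarse level, so the levels share instances, while the distant point is labelled `-1` (outlier) at both.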