ConceptMap: Mining Noisy Web Data for Concept Learning

Golge, Eren; Duygulu, Pınar

doi:10.1007/978-3-319-10584-0_29

Cited by 18 publications

(12 citation statements)

References 51 publications


“…There have been early efforts on automatically training visual classifiers using web data to build large datasets automatically [26,39,42,27,20], find iconic images [5,35] and improve image retrieval results [19,28,45,34]. Inspired by the success of mixture models for object detection [18], some recent approaches such as [9,17] have also explored clustering web data and training detection models.…”

Section: Related Work (mentioning)

confidence: 99%

Sense discovery via co-clustering on images and text

Chen, Ritter, Gupta, et al. 2015

2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)


We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP). Unlike traditional clustering approaches which assume a one-to-one mapping between the clusters in the text-based feature space and the visual space, we adopt a one-to-many mapping between the two spaces. This is primarily because each semantic sense (concept) can correspond to different visual senses due to viewpoint and appearance variations. Our structure-EM style optimization not only extracts the multiple senses in both semantic and visual feature space, but also discovers the mapping between the senses. We introduce a challenging dataset (CMU Polysemy-30) for this problem consisting of 30 NPs (∼5600 labeled instances out of ∼22K total instances). We have also conducted a large-scale experiment that performs sense disambiguation for ∼2000 NPs.


“…Our research is closely related to the recent work on visual data collection from web images [42,3,8,14] or weakly annotated videos [2]. Their goal is to collect training images from the Internet with minimum human supervision, but for predefined concepts.…”

Section: Related Work (mentioning)

confidence: 99%

“…Computer vision researchers have long collected image examples of manually selected visual concepts, and used them to train concept detectors. For example, ImageNet [6] selects 21,841 synsets in WordNet as the visual concepts, and has by far collected 14,197,122 images in total. One limitation of the manually selected concepts is that their visual detectors often fail to capture the complexity of the visual world, and cannot adapt to different domains.…”

Section: Introduction (mentioning)

confidence: 99%

Automatic Concept Discovery from Parallel Text and Visual Corpora

Sun, Gan, Nevatia 2015

2015 IEEE International Conference on Computer Vision (ICCV)


Humans connect language and vision to perceive the world. How to build a similar connection for computers? One possible way is via visual concepts, which are text terms that relate to visually discriminative entities. We propose an automatic visual concept discovery algorithm using parallel text and visual corpora; it filters text terms based on the visual discriminative power of the associated images, and groups them into concepts using visual and semantic similarities. We illustrate the applications of the discovered concepts using bidirectional image and sentence retrieval task and image tagging task, and show that the discovered concepts not only outperform several large sets of manually selected concepts significantly, but also achieves the state-of-the-art performance in the retrieval task.


“…With web data retrieved for a given concept, there will also be common characteristics shared among subsets of data. Therefore, rather than hard clustering data into a specific number of subsets as some approaches which also aim to deal with intra-class variations in concepts [17], [23], we use hierarchy clustering which allows different clusters to share the same instances. We adopt OPTICS ("Ordering Points To Identify the Clustering Structure") [9] to find clusters.…”

Section: Shot Clustering (mentioning)

confidence: 99%

Automatic Retrieval of Action Video Shots from the Web Using Density-Based Cluster Analysis and Outlier Detection

Yanai 2016

IEICE Trans. Inf. & Syst.


SUMMARY: In this paper, we introduce a fully automatic approach to construct action datasets from noisy Web video search results. The idea is based on combining cluster structure analysis and density-based outlier detection. For a specific action concept, first, we download its Web top search videos and segment them into video shots. We then organize these shots into subsets using density-based hierarchy clustering. For each set, we rank its shots by their outlier degrees which are determined as their isolatedness with respect to their surroundings. Finally, we collect high ranked shots as training data for the action concept. We demonstrate that with action models trained by our data, we can obtain promising precision rates in the task of action classification while offering the advantage of fully automatic, scalable learning. Experiment results on UCF11, a challenging action dataset, show the effectiveness of our method.
