Proceedings of the British Machine Vision Conference 2011
DOI: 10.5244/C.25.29
A Large-Scale Database of Images and Captions for Automatic Face Naming

Abstract: We present a large-scale database of images and captions, designed to support research on using captioned images from the Web for training visual classifiers. It consists of more than 125,000 images of celebrities from different fields downloaded from the Web. Each image is associated with its original text caption, extracted from the HTML page the image comes from. We coin it FAN-Large, for Face And Names Large-scale database. Its size and deliberately high level of noise make it, to our knowledge, the la…

Cited by 11 publications (21 citation statements).
References 16 publications (34 reference statements).
“…2) Results on the real-world datasets : For performance evaluation, we follow [37] to take the accuracy and precision as two criteria. The accuracy is the percentage of correctly annotated faces (also including the correctly annotated faces whose groundtruth name is the "null" name) over all faces, while the precision is the percentage of correctly annotated faces over the faces which are annotated as real names (i.e., we do not consider the faces annotated as the "null" class by a face naming method).…”
Section: Results on the Synthetic Dataset
confidence: 99%
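The accuracy and precision criteria quoted above can be expressed compactly. The following is a minimal sketch (not the authors' code) of these two metrics, where a "null" label marks faces the method declines to assign a real name, and the function name and signature are illustrative assumptions:

```python
def face_naming_metrics(pred, gold, null="null"):
    """Return (accuracy, precision) for face-naming predictions.

    accuracy : correctly annotated faces (including correct "null"
               assignments) divided by all faces.
    precision: correctly annotated faces among only those faces the
               method annotated with a real (non-"null") name.
    """
    assert len(pred) == len(gold), "one prediction per face"
    # Accuracy counts every face, null or not.
    correct = sum(p == g for p, g in zip(pred, gold))
    accuracy = correct / len(gold) if gold else 0.0
    # Precision ignores faces the method labelled "null".
    named = [(p, g) for p, g in zip(pred, gold) if p != null]
    precision = (sum(p == g for p, g in named) / len(named)) if named else 0.0
    return accuracy, precision
```

For example, with predictions `["A", "null", "B", "C"]` against ground truth `["A", "B", "B", "null"]`, accuracy is 2/4 (two exact matches over four faces) while precision is 2/3 (two correct among the three non-null predictions).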
“…Our models are trained on web images retrieved from the Bing Image search engine for the same names. All the data preprocessing and feature extraction follows the pipeline of [41], which is in turn adopted from [12]. However, [41] trains the models and evaluates the results on the same collection.…”
Section: Learning Faces
confidence: 99%
“…We use the FAN-large [41] face dataset to test our method on the face recognition problem. We use the Easy and Hard subsets, keeping only names with more than 100 images (to obtain fair testing results).…”
Section: Learning Faces
confidence: 99%
“…news documents (6.9M words). The attachment probabilities (see (24)) were estimated from the same corpus. We tuned the caption length parameter on the development set using a range of [5, 14] tokens for the word-based model and [2, 5] phrases for the phrase-based model.…”
Section: Parameter Tuning
confidence: 99%
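The tuning step quoted above amounts to a one-dimensional grid search: try each candidate caption length on the development set and keep the best. A hypothetical sketch, where `score_on_dev` stands in for whatever development-set evaluation the model uses:

```python
def tune_caption_length(lengths, score_on_dev):
    """Grid search: return the caption length whose development-set
    score is highest. `score_on_dev` maps a length to a float score."""
    return max(lengths, key=score_on_dev)

# e.g. the word-based model would sweep 5..14 tokens:
#   best = tune_caption_length(range(5, 15), score_on_dev)
```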
“…Examples include associating names mentioned in the captions to faces depicted in news images (e.g., [23], [24]), verbs to body poses [25], and learning models for recognizing objects [26] and their relative importance [27].…”
Section: Related Work
confidence: 99%