Combining image captions and visual analysis for image concept classification

Kliegr, Tomáš; Chandramouli, Krishna; Nemrava, Jan; Svátek, Vojtěch; Izquierdo, Ebroul

doi:10.1145/1509212.1509214

Cited by 26 publications

(19 citation statements)

References 20 publications


“…These are extensions of the classic unimodal systems, where a common retrieval system integrates information from various modalities. This can be done by fusing features from different modalities into a single vector [37], [38], [39], or by learning different models for different modalities and fusing their predictions [40], [41]. One popular approach is to concatenate features from different modalities and rely on unsupervised structure discovery algorithms, such as latent semantic analysis, to find multimodal statistical regularities.…”

Section: Previous Work (mentioning)

confidence: 99%

On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval

Pereira

Coviello

Doyle

et al. 2014

IEEE Trans. Pattern Anal. Mach. Intell.



Abstract: The problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, for example, using an image to search for texts. A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities. Two hypotheses are then investigated regarding the fundamental attributes of these spaces. The first is that low-level cross-modal correlations should be accounted for. The second is that the space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are then derived from these hypotheses: correlation matching (CM), an unsupervised method which models cross-modal correlations, semantic matching (SM), a supervised technique that relies on semantic representation, and semantic correlation matching (SCM), which combines both. An extensive evaluation of retrieval performance is conducted to test the validity of the hypotheses. All approaches are shown successful for text retrieval in response to image queries and vice versa. It is concluded that both hypotheses hold, in a complementary form, although evidence in favor of the abstraction hypothesis is stronger than that for correlation.
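The citation statement for this entry distinguishes two ways of combining modalities: fusing features into a single vector, and fusing the predictions of separate per-modality models. The following is a minimal sketch of that distinction only, not the method of any paper listed here. The function names, the weighted-average rule, and the toy numbers are all assumptions made for illustration.

```python
from typing import List, Sequence


def early_fusion(image_feats: Sequence[float],
                 text_feats: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate per-modality features into one vector."""
    return list(image_feats) + list(text_feats)


def late_fusion(image_scores: Sequence[float],
                text_scores: Sequence[float],
                image_weight: float = 0.5) -> List[float]:
    """Prediction-level fusion: weighted average of per-class scores
    produced by two separately trained models (an assumed combination rule)."""
    if len(image_scores) != len(text_scores):
        raise ValueError("both models must score the same set of classes")
    w = image_weight
    return [w * i + (1.0 - w) * t for i, t in zip(image_scores, text_scores)]


# Hypothetical toy inputs: 2 visual features, 3 caption features,
# and per-class scores from two imagined classifiers.
fused_vector = early_fusion([0.2, 0.7], [1.0, 0.0, 3.0])
fused_scores = late_fusion([0.9, 0.1], [0.5, 0.5], image_weight=0.5)
```

In the first case a single downstream model would be trained on `fused_vector`; in the second, each modality keeps its own model and only the class scores are combined.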



“…In essence, the entity names and their types are described as vectors with the specified features. In Semantic Concept Mapping, with the known list of candidate entity names and labels are denoted as WordNet synsets [19]. The Lin's similarity function describes the type of entity name [25].…”

Section: Related Work (mentioning)

confidence: 99%

Semi-Supervised Bootstrapping Approach for Named Entity Recognition

Thenmalar

Balaji

Geetha

2015

IJNLC


“…Given two datasets A and B with target paired values of labeled data samples, the solutions to f and g of Equation 14 and Equation 19 can be used to estimate coordinates of the other unlabeled data points in intermediate spaces, which further can be utilized to align their intrinsic data manifolds.…”

Section: Parallel Field Alignment Retrieval (mentioning)

confidence: 99%

“…To address the cross media retrieval problem, advances have been reported over the last decades [7,26,28]. These methods focus on two traditional ways to design cross media retrieval systems: (a) fusing features from different media data into a single vector [23,33]; (b) learning different models for different media data and fusing their outputs [14,32]. And most of these approaches require multiple-type queries, e.g., queries composed of both image and text features.…”

Section: Introduction (mentioning)

confidence: 99%

Parallel field alignment for cross media retrieval

Mao

Lin

Cai

et al. 2013

Proceedings of the 21st ACM International Conference on Multimedia


Cross media retrieval systems have received increasing interest in recent years. Due to the semantic gap between low-level features and high-level semantic concepts of multimedia data, many researchers have explored joint-model techniques in cross media retrieval systems. Previous joint-model approaches usually focus on two traditional ways to design cross media retrieval systems: (a) fusing features from different media data; (b) learning different models for different media data and fusing their outputs. However, the process of fusing features or outputs will lose both low- and high-level abstraction information of media data. Hence, both ways do not really reveal the semantic correlations among the heterogeneous multimedia data. In this paper, we introduce a novel method for the cross media retrieval task, named Parallel Field Alignment Retrieval (PFAR), which integrates a manifold alignment framework from the perspective of vector fields. Instead of fusing original features or outputs, we consider the cross media retrieval as a manifold alignment problem using parallel fields. The proposed manifold alignment algorithm can effectively preserve the metric of data manifolds, model heterogeneous media data and project their relationship into intermediate latent semantic spaces during the process of manifold alignment. After the alignment, the semantic correlations are also determined. In this way, the cross media retrieval task can be resolved by the determined semantic correlations. Comprehensive experimental results have demonstrated the effectiveness of our approach.

