2020
DOI: 10.20944/preprints202001.0288.v1
Preprint

Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval

Abstract: With the rapid development of the Internet and the widespread use of smart devices, massive amounts of multimedia data are generated, collected, stored and shared on the Internet. This trend has made cross-modal retrieval a hot issue in recent years. Many existing works focus on correlation learning to generate a common subspace for cross-modal correlation measurement, while others use adversarial learning techniques to reduce the heterogeneity of multi-modal data. However, very few works combine correlation le…

Citations: cited by 6 publications (8 citation statements)
References: 40 publications
“…So, a modality-specific and shared generative adversarial network (MS²GAN) approach is proposed in [18] which incorporates two separate sub-networks and a common subnetwork for learning modality-specific and modality-shared features respectively. [19] has introduced a novel end-to-end framework known as adversarial learning based semantic correlation representation (ALSCOR) framework which combines cross-modal representation learning, adversarial, and correlation learning. Non-linear correlation is captured by integrating the CCA model with TxtNet and VisNet representation models.…”
Section: Generative Adversarial Network (mentioning)
confidence: 99%
“…Table (5) shows the I2T, T2I and their average MAP score values for respective dataset categories. Figure (19) demonstrates a curve depicting the precision values obtained for each test query (image in case of I2T and text in case of T2I operation) in a sorted manner and the change in precision values as per the queries can be visualized. Figure (20) illustrates a few matched images and text results retrieved using an image query on trained Proposed2 model.…”
Section: Parameter Settings (mentioning)
confidence: 99%
“…As a hot issue widely concerned, cross-modal retrieval problem is studied by a growing number of researchers [4,5,25,29,35,40,50]. According to the representation type of multimedia instances, cross-modal retrieval can be divided into two groups: real-valued representation based retrieval and binary representation (hash code) based retrieval.…”
Section: Related Work (mentioning)
confidence: 99%
“…Building embeddings for different modalities in a common semantic space has been another popular way over the past few years. This method allows the model to compute cross-modal similarity, which can be further used for downstream tasks, such as cross-media retrieval [40][41][42]. Ba et al. [43] presented a model that can classify unseen categories from their textual description by cross-modal similarity in Zero-Shot Learning (ZSL).…”
Section: Cross-modal Representation (mentioning)
confidence: 99%