2020
DOI: 10.3390/electronics9030466

Deep Multi-Modal Metric Learning with Multi-Scale Correlation for Image-Text Retrieval

Abstract: Multi-modal retrieval is challenging because of the heterogeneous gap and the complex semantic relationships between data of different modalities. Typical approaches map the different modalities into a common subspace using one-to-one correspondences or similarity/dissimilarity relationships between inter-modal data. In this subspace the distances between heterogeneous data can be compared directly, so inter-modal retrieval can be achieved by nearest-neighbor search. However, most of them ignore intra-modal relations and complicated semantics …
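The retrieval step the abstract describes, nearest-neighbor search over embeddings that already live in a common subspace, can be illustrated with a minimal sketch. This is not the paper's method: the embeddings below are hand-made, hypothetical vectors, cosine similarity is assumed as the comparison measure, and the function names `cosine_similarity` and `retrieve` are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, gallery, k=1):
    """Return the indices of the k gallery vectors most similar to the query."""
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine_similarity(query, gallery[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings, assumed to be already projected into a shared
# 3-dimensional subspace by some learned image and text encoders.
image_embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.9],
]
text_query = [0.05, 0.1, 0.95]

# Text-to-image retrieval: rank images by similarity to the text query.
print(retrieve(text_query, image_embeddings, k=2))  # [2, 1]
```

Because both modalities share one space, the same `retrieve` call works in the other direction (image query, text gallery) without modification.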

Citations: cited by 3 publications (1 citation statement)
References: 30 publications
“…The parameters of the first convolutional layers are frozen, and the rest of the parameters should be fine-tuned on our self-built dataset. Finally, for the problem of small differences between subclasses and large differences within classes, a loss function based on metric learning [9] is introduced, which is suitable for multidimensional targets. It can target diverse dimensions and enrich the feature information of surface targets to make the neural network converge better and faster.…”
Section: Introduction
Citation type: mentioning
confidence: 99%