Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413962

Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval

Abstract: Cross-modal hashing has attracted much attention in large-scale multimedia search. In many real applications, sample labels have a hierarchical structure that also contains much useful information for learning. However, most existing methods were originally designed for non-hierarchically labeled data and thus fail to exploit the rich information in the label hierarchy. In this paper, we propose an effective cross-modal hashing method, named Supervised Hierarchical Deep Cross-modal Hashing, SHDCH for …

Cited by 38 publications (12 citation statements)
References 34 publications

“…The Input module, Generalization module, Output module, and Response module can use any existing algorithm in the field of machine learning, such as SVM and random forest [25]. The working process of each of the four modules is introduced, respectively.…”
Section: Memory Network
Citation type: mentioning
confidence: 99%
“…Therefore, some people use text or other modality queries to express the search intention and the task of cross-modal retrieval (especially using text to retrieve images) has emerged. Cross-modal retrieval focuses on mapping different modalities to the same semantic space, and uses supervision information to guide the alignment of images and texts [6] [17] [26] [33] [15] [34] [37]. However, the information conveyed by the text is very abstract and sparse, which makes cross-modal retrieval very difficult and the application scenarios are not extensive.…”
Section: Related Work 2.1 Image Retrieval
Citation type: mentioning
confidence: 99%
“…Deep Saliency Hashing (DSaH) [46] is a two-step end-to-end model, which mines salient regions and learns semantic-preserving hash codes simultaneously. Supervised Hierarchical Deep Cross-modal Hashing (SHDCH) [47] learns the hash codes by explicitly delving into the hierarchical labels. Deep Semantic cross-modal hashing with Correlation Alignment (DSCA) [48] designs two deep neural networks for image and text modality separately, and learns two hash functions.…”
Section: Deep Hashing
Citation type: mentioning
confidence: 99%