2021
DOI: 10.1609/aaai.v35i10.17076
Deep Mutual Information Maximin for Cross-Modal Clustering

Abstract: Cross-modal clustering (CMC) aims to enhance clustering performance by exploring complementary information from multiple modalities. However, the performance of existing CMC algorithms remains unsatisfactory due to the conflict between heterogeneous modalities and the high-dimensional, non-linear nature of each individual modality. In this paper, a novel deep mutual information maximin (DMIM) method for cross-modal clustering is proposed to maximally preserve the shared information of multiple modalities while e…

Cited by 33 publications (16 citation statements)
References 25 publications
“…In recent years, deep learning architectures have seen widespread adoption in MVC, resulting in the deep MVC subfield. Methods developed within this subfield have shown state-of-the-art clustering performance on several multi-view datasets [1][2][3][4][5][6], largely outperforming traditional, non-deep-learning-based methods [1]. Despite these promising developments, we identify significant drawbacks with the current state of the field.…”
Section: Introduction
confidence: 91%
“…Despite these promising developments, we identify significant drawbacks with the current state of the field. Self-supervised learning (SSL) is a crucial component in many recent methods for deep MVC [1][2][3][4][5][6]. However, the large number of methods, all with unique components and arguments about how they work, makes it challenging to identify clear directions and trends in the development of new components and methods.…”
Section: Introduction
confidence: 99%
“…MI is useful in cross-modality data processing tasks, as the statistical features are assumed to be identical. It has been applied to tackle many unsupervised learning problems, such as cross-modality data retrieval [25], data representations [22,74,17], domain adaptation [42], and cross-modal clustering [40], etc. A special case of MI is the MI of a random variable with itself, MI(X, X), which equals its entropy.…”
Section: Mutual Information
confidence: 99%
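The identity noted in the statement above, MI(X, X) = H(X), can be checked directly for discrete distributions. The sketch below (a minimal illustration; the `entropy` and `mutual_information` helpers are hypothetical names, not from the cited papers) computes MI from a joint probability table and verifies that the MI of a variable with itself equals its Shannon entropy:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(X) in nats of a discrete distribution p."""
    p = p[p > 0]  # 0 * log 0 is taken as 0
    return -np.sum(p * np.log(p))

def mutual_information(joint):
    """MI(X, Y) in nats from a joint probability table joint[x, y]."""
    px = joint.sum(axis=1)  # marginal of X
    py = joint.sum(axis=0)  # marginal of Y
    mi = 0.0
    for i in range(joint.shape[0]):
        for j in range(joint.shape[1]):
            if joint[i, j] > 0:
                mi += joint[i, j] * np.log(joint[i, j] / (px[i] * py[j]))
    return mi

# Marginal distribution of X
px = np.array([0.5, 0.25, 0.25])
# Joint of X with itself: all mass sits on the diagonal
joint_xx = np.diag(px)

print(mutual_information(joint_xx))  # matches entropy(px)
print(entropy(px))
```

Because the joint of (X, X) is diagonal, every term reduces to p(x) log(p(x) / p(x)^2) = -p(x) log p(x), recovering the entropy sum.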
“…Wang et al. [30] utilized the subgraph-level summary to build an effective mutual information estimator, which was optimized to strengthen the robustness of graph representation. Mao et al. [31] explored the shared information across modalities by maximizing the mutual information between them. Schnapp et al. [32] selected important features with minimum mutual information with labels.…”
Section: B. Representation Learning Based On Mutual Information
confidence: 99%