Towards a Unified Middle Modality Learning for Visible-Infrared Person Re-Identification

Zhang, Yukang; Yan, Yan; Lü, Yang; Wang, Hanzi

doi:10.1145/3474085.3475250

Cited by 72 publications

(30 citation statements)

References 83 publications


“…[44] propose a bi-directional top-ranking loss, which samples positive and negative pairs from different modalities and optimizes such cross-modality triplets in a bi-directional interactive iteration manner. More recently, some other works adopt adversarial training strategies to reduce the cross-modality distribution divergence at the image level [29], [30], [32], [36], [46], [49]. For instance, they transfer stylistic properties of visible images to their infrared counterparts, with an identity-preserving constraint [30], [32] or cycle consistency [29], [36].…”

Section: B Visible-infrared Re-id Methods (mentioning)

confidence: 99%

Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification

Liu, Xia, Jiang et al. 2022

Preprint

“…Apart from modality-translation-based Re-ID approaches, there are a few attempts [95, 96, 97, 98] that introduce a third modality to reduce the modality discrepancy. The idea of using a third modality was proposed by Li et al. [95], who introduced an “X” modality as a middle modality to eliminate cross-modal discrepancies.…”

Section: Cross-modal Person Re-identification (mentioning)

confidence: 99%

“…Following the same pipeline, in [96, 97], real images from both modalities were combined with ground-truth labels to generate third-modal images, which help to reduce modality-related biases. In [98], a non-linear middle modality generator was proposed that effectively projects images from both modalities onto a unified space to generate an additional modality to reduce the modality discrepancies.…”

Section: Cross-modal Person Re-identification (mentioning)

confidence: 99%

Person Re-Identification with RGB–D and RGB–IR Sensors: A Comprehensive Survey

Uddin, Bhuiyan, Bappee et al. 2023

Sensors

Learning about appearance embedding is of great importance for a variety of different computer-vision applications, which has prompted a surge in person re-identification (Re-ID) papers. The aim of these papers has been to identify an individual over a set of non-overlapping cameras. Despite recent advances in RGB–RGB Re-ID approaches with deep-learning architectures, the approach fails to consistently work well when there are low resolutions in dark conditions. The introduction of different sensors (i.e., RGB–D and infrared (IR)) enables the capture of appearances even in dark conditions. Recently, a lot of research has been dedicated to addressing the issue of finding appearance embedding in dark conditions using different advanced camera sensors. In this paper, we give a comprehensive overview of existing Re-ID approaches that utilize the additional information from different sensor-based methods to address the constraints faced by RGB camera-based person Re-ID systems. Although there are a number of survey papers that consider either the RGB–RGB or Visible-IR scenarios, there are none that consider both RGB–D and RGB–IR. In this paper, we present a detailed taxonomy of the existing approaches along with the existing RGB–D and RGB–IR person Re-ID datasets. Then, we summarize the performance of state-of-the-art methods on several representative RGB–D and RGB–IR datasets. Finally, future directions and current issues are considered for improving the different sensor-based person Re-ID systems.

“…The most popular architecture [5, 35, 48] is a double-stream deep network, where shallow layers are independent for learning modal-specific features and deep layers are shared for learning modal-common features. Some researchers improved the double-stream architecture via fine part alignment designs [40, 49], attention mechanisms [35, 36], or new neural structures, such as graph [27] and transformer [32, 50].…”

Section: Related Work (mentioning)

confidence: 99%
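
The double-stream layout described in the statement above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `DoubleStreamSketch`, the layer sizes, the random weights, and the use of plain linear layers with ReLU instead of a convolutional backbone. It shows only the wiring (separate shallow layers per modality, one shared deep layer), not any cited method.

```python
import random


def linear_relu(weights, x):
    """Apply a bias-free linear layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def init_weights(rows, cols, rng):
    """Create a rows x cols weight matrix with values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class DoubleStreamSketch:
    """Hypothetical two-stream encoder: modality-specific then shared layers."""

    def __init__(self, in_dim=4, hidden_dim=3, out_dim=2, seed=0):
        rng = random.Random(seed)
        # Shallow layers: one independent weight matrix per modality.
        self.specific = {
            "visible": init_weights(hidden_dim, in_dim, rng),
            "infrared": init_weights(hidden_dim, in_dim, rng),
        }
        # Deep layer: a single weight matrix shared by both modalities.
        self.shared = init_weights(out_dim, hidden_dim, rng)

    def embed(self, x, modality):
        """Map an input vector to the common embedding space."""
        hidden = linear_relu(self.specific[modality], x)
        return linear_relu(self.shared, hidden)


net = DoubleStreamSketch()
sample = [1.0, 0.5, -0.2, 0.3]
visible_embedding = net.embed(sample, "visible")
infrared_embedding = net.embed(sample, "infrared")
```

Both embeddings have the same dimensionality because they pass through the same shared layer, which is what allows a visible query to be compared directly with infrared gallery features.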

Margin-Based Modal Adaptive Learning for Visible-Infrared Person Re-Identification

Zhao, Zhu 2023

Sensors

Visible-infrared person re-identification (VIPR) has great potential for intelligent transportation systems for constructing smart cities, but it is challenging to utilize due to the huge modal discrepancy between visible and infrared images. Although visible and infrared data can appear to be two domains, VIPR is not identical to domain adaptation as it can massively eliminate modal discrepancies. Because VIPR has complete identity information on both visible and infrared modalities, once the domain adaption is overemphasized, the discriminative appearance information on the visible and infrared domains would drain. For that, we propose a novel margin-based modal adaptive learning (MMAL) method for VIPR in this paper. On each domain, we apply triplet and label smoothing cross-entropy functions to learn appearance-discriminative features. Between the two domains, we design a simple yet effective marginal maximum mean discrepancy (M3D) loss function to avoid an excessive suppression of modal discrepancies to protect the features’ discriminative ability on each domain. As a result, our MMAL method could learn modal-invariant yet appearance-discriminative features for improving VIPR. The experimental results show that our MMAL method acquires state-of-the-art VIPR performance, e.g., on the RegDB dataset in the visible-to-infrared retrieval mode, the rank-1 accuracy is 93.24% and the mean average precision is 83.77%.
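
The abstract above does not give the formula for the M3D loss, so the following is only a rough sketch of one way a margin could be attached to a maximum mean discrepancy term. The hinge form `max(0, MMD^2 - margin)`, the Gaussian kernel, the biased estimator, and all names and default values are assumptions for illustration, not the authors' definition.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of vectors."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def margin_mmd_loss(visible, infrared, margin=0.1, bandwidth=1.0):
    """Assumed hinge form: penalize discrepancy only above the margin."""
    return max(0.0, mmd_squared(visible, infrared, bandwidth) - margin)


# Toy features: two visible and two infrared vectors that are far apart.
visible_features = [[0.0, 0.0], [0.0, 1.0]]
infrared_features = [[10.0, 10.0], [10.0, 11.0]]
loss_far = margin_mmd_loss(visible_features, infrared_features)
loss_same = margin_mmd_loss(visible_features, visible_features)
```

Under this assumed form, the loss drops to zero once the estimated discrepancy falls below the margin, so training stops pushing the two modalities together past that point. That matches the stated goal of avoiding excessive suppression of modal discrepancies, though the published loss may differ in its details.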
