HPILN: a feature learning framework for cross‐modality person re‐identification

Zhao, Yun‐Bo; Lin, Jie; Xuan, Qi; Xi, Xu

doi:10.1049/iet-ipr.2019.0699

Cited by 75 publications

(34 citation statements)

References 38 publications

“…In this subsection, we compare our proposed method with several cross-modality person ReID methods that include the following categories: 1) With different structures and loss functions, Two-Stream, One-Stream, Zero-Padding [39], HSME, D-HSME [11], BDTR, SDL [19], DGD+MSR [7], EDFL [23], HPILN [50], AGW [44], cm-SSFT [25], and TSLFN+HC [54] learned modality-invariant feature representation; 2) With the ideas of GAN, cmGAN [4],…”

Section: Comparison With State-of-the-art Methods (mentioning)

confidence: 99%

MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

Gao, Liang, Jin, et al. 2021

Proceedings of the 29th ACM International Conference on Multimedia

The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize the images of the same identity between the visible modality and the infrared modality. Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space, which ignores the single space of each modality in the shallow layers. To solve this, in this paper, we present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space. Firstly, based on the observation that edge information is modality-invariant, we propose an edge features enhancement module to enhance the modality-sharable features in each single-modality space. Specifically, we design a perceptual edge features (PEF) loss after the edge fusion strategy analysis. To our knowledge, this is the first work that proposes explicit optimization in the single-modality feature space on the cross-modality ReID task. Moreover, to increase the difference between cross-modality distance and class distance, we introduce a novel cross-modality contrastive-center (CMCC) loss into the modality-joint constraints in the common feature space. The PEF loss and CMCC loss jointly optimize the model in an end-to-end manner, which markedly improves the network's performance. Extensive experiments demonstrate that the proposed model significantly outperforms state-of-the-art methods on both the SYSU-MM01 and RegDB datasets. CCS Concepts: Computing methodologies → Visual content-based indexing and retrieval.

“…Compared with hand-crafted methods, deep learning approaches achieved a great improvement in recognition accuracy. However, these learned global representations mainly focuses on full body semantic and pays less attention to local details [8]. It naturally lacks flexible granularity for feature description and often suffers weak discriminative ability in identifying targets with similar inter-class common properties or large intra-class differences [9].…”

Section: Related Work (mentioning)

confidence: 99%

“…Deep neural network is originally developed for image classification [7], and its successful global feature learning strategy for classification was directly adopted for the person Re-ID approaches. The learned global representation pays less attention to local details [8], and often suffers weak discriminative ability in identifying targets with similar inter-class common properties or large intra-class differences [9]. For example, the following difficulties are encountered: (1) imprecise pedestrian detection affects global feature learning, e.g., shown in Figure 1a; (2) body posture changes make the learning more difficult, e.g., Figure 1b; (3) unexpected occlusion makes the learned features irrelevant to the human bodies, e.g., Figure 1c; (4) cluttered background or multiple pedestrians with highly similar appearances make the model difficult to distinguish, e.g., Figure 1d,e; (5) Misaligned bounding boxes make the model scale-variant, e.g., Figure 1f.…”

Section: Introduction (mentioning)

confidence: 99%

EXAM: A Framework of Learning Extreme and Moderate Embeddings for Person Re-ID

Wang et al. 2021

J. Imaging

Person re-identification (Re-ID) is challenging due to a host of factors: the variety of human positions, difficulties in aligning bounding boxes, and complex backgrounds, among other factors. This paper proposes a new framework called EXAM (EXtreme And Moderate feature embeddings) for Re-ID tasks. This is done using discriminative feature learning, requiring attention-based guidance during training. Here “Extreme” refers to salient human features and “Moderate” refers to common human features. In this framework, these types of embeddings are calculated by global max-pooling and average-pooling operations respectively; and then, jointly supervised by multiple triplet and cross-entropy loss functions. The processes of deducing attention from learned embeddings and discriminative feature learning are incorporated, and benefit from each other in this end-to-end framework. From the comparative experiments and ablation studies, it is shown that the proposed EXAM is effective, and its learned feature representation reaches state-of-the-art performance.
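
The two pooling operations named in the abstract above can be illustrated with a minimal sketch. It assumes a feature map stored as nested lists (channels × height × width); the function names and the toy values are hypothetical and are not taken from the EXAM paper, which also adds attention guidance and loss supervision that are not shown here.

```python
from typing import List

# Assumed layout: channels x height x width, as plain nested lists.
FeatureMap = List[List[List[float]]]


def global_max_pool(fmap: FeatureMap) -> List[float]:
    """Keep the largest activation of each channel (the 'Extreme' embedding)."""
    return [max(max(row) for row in channel) for channel in fmap]


def global_avg_pool(fmap: FeatureMap) -> List[float]:
    """Average all activations of each channel (the 'Moderate' embedding)."""
    pooled = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Toy feature map with 2 channels of size 2 x 2 (illustrative values only).
fmap = [
    [[0.0, 1.0], [2.0, 5.0]],
    [[4.0, 4.0], [4.0, 4.0]],
]
extreme = global_max_pool(fmap)
moderate = global_avg_pool(fmap)
```

Max-pooling responds to the single strongest activation per channel, while average-pooling summarizes the whole spatial extent, which is one way to read the "salient" versus "common" distinction drawn in the abstract.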

“…In the process of extracting features, skip connections are used to fuse the middle layers of the CNN model and enhance the robustness and non-descriptiveness of the extracted features. Zhao et al [27] expanded the triple loss function to pentaplet loss, and the cross-modality problem was considered on the basis of the original triple loss function; additionally, a method for mining difficult samples was introduced. Zhu et al [28] involves feature centers of the same category and the same modality, and hetero center loss was proposed on the basis of center loss, with a focus on the differences among feature centers of different modalities in the same category.…”
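
The idea of a triplet loss that accounts for modality and mines hard samples can be sketched as follows. This is only an illustration under stated assumptions, not the pentaplet loss of the cited work: it assumes a batch-hard scheme in which each anchor is paired with its farthest same-identity sample and its closest different-identity sample from the other modality. The function names, the margin value, and the toy batch are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Assumed sample layout: (embedding, identity label, modality tag).
Sample = Tuple[Sequence[float], int, str]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_cross_modality_triplet_loss(batch: List[Sample], margin: float = 0.3) -> float:
    """Batch-hard triplet loss where positives and negatives come from the other modality."""
    losses = []
    for emb_a, id_a, mod_a in batch:
        other = [(e, i) for e, i, m in batch if m != mod_a]
        pos = [euclidean(emb_a, e) for e, i in other if i == id_a]
        neg = [euclidean(emb_a, e) for e, i in other if i != id_a]
        if not pos or not neg:
            continue  # anchor has no valid cross-modality triplet in this batch
        # Hardest positive is the farthest; hardest negative is the closest.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy 1-D batch: the infrared sample of identity 1 sits close to identity 2.
batch = [
    ([0.0], 1, "rgb"),
    ([3.0], 1, "ir"),
    ([4.0], 2, "rgb"),
    ([5.0], 2, "ir"),
]
loss = hard_cross_modality_triplet_loss(batch)
```

In this toy batch only two anchors violate the margin, so the loss is driven by the misplaced infrared sample, which is the behaviour hard-sample mining is meant to produce.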

Section: Rgb-ir Re-id Based On Cnn Network (mentioning)

confidence: 99%

MFCNet: Mining Features Context Network for RGB–IR Person Re-Identification

Mei et al. 2021

Future Internet

RGB–IR cross modality person re-identification (RGB–IR Re-ID) is an important task for video surveillance in poorly illuminated or dark environments. In addition to the common challenge of Re-ID, the large cross-modality variations between RGB and IR images must be considered. The existing RGB–IR Re-ID methods use different network structures to learn the global shared features associated with multi-modalities. However, most global shared feature learning methods are sensitive to background clutter, and contextual feature relationships are not considered among the mined features. To solve these problems, this paper proposes a dual-path attention network architecture MFCNet. SGA (Spatial-Global Attention) module embedded in MFCNet includes spatial attention and global attention branches to mine discriminative features. First, the SGA module proposed in this paper focuses on the key parts of the input image to obtain robust features. Next, the module mines the contextual relationships among features to obtain discriminative features and improve network performance. Finally, extensive experiments demonstrate that the performance of the network architecture proposed in this paper is better than that of state-of-the-art methods under various settings. In the all-search mode of the SYSU and RegDB data sets, the rank-1 accuracy reaches 51.64% and 69.76%, respectively.
