Omni-Scale Feature Learning for Person Re-Identification

Zhou, Kaiyang; Yang, Yongxin; Cavallaro, Andrea; Xiang, Tao

doi:10.1109/iccv.2019.00380

Cited by 734 publications

(406 citation statements)

References 68 publications

Supporting

Mentioning

405

Contrasting

“…We compare the proposed method with 33 recently published works, including (1) global feature based methods, which aim to learn the global feature from the feature map directly, including PAN [74], DMML [7], DCDS [1], VCFL [30], MVPM [41], LRDNN [79], RB [35], LITM [63], IANet [23], Sphere [14], BNNeck [32], OSNet [78], AANet [46], DG-Net [72], BDB [12], Circle [42], SFT [31], (2) part based methods, including PCB+RPP [43], Local [57], HPM [16], CASN [71], AutoReID [34], MGN [49], BHP [20] and Pyramidal [68], which utilize semantic parts or horizontal stripes to extract part-level features, and (3) attention based methods, including MHAN [3], CAMA [58], SONA [53], CAR [80], SCAL [6], ABD-Net [8], DAAF [10] and RGA [65]. These methods are categorized into 3 types based on their backbones: those which employ ResNet-50 directly, those which modify ResNet-50 by introducing additional branches, attention subnets or dilated convolution, and those which don't use ResNet-50.…”

Section: Comparison Results (mentioning)

confidence: 99%

“…We mainly review the former, which utilize deep learning to extract the feature. Holistic Features Based Methods Given a backbone CNN such as ResNet-50 [21] or other network architectures [2,51,71,78], this type of method learns discriminative holistic features from the feature map directly. Specifically, they aim to learn the features by improving loss functions [9,14,22,31,41,42,50,55,63], improving the training techniques [1,4,12,24,32,35,37,54], adding additional network modules [23,23,51,62], using extra semantic annotations [30,46,47,79] or generating more training samples [17,33,72,76,77].…”

Section: Related Work (mentioning)

confidence: 99%

Discriminative Spatial Feature Learning for Person Re-Identification

Peng, Huang, Wang, et al. 2020

Proceedings of the 28th ACM International Conference on Multimedia

Person re-identification (ReID) aims to match detected pedestrian images from multiple non-overlapping cameras. Most existing methods employ a backbone CNN to extract a vectorized feature representation by performing a global pooling operation (such as global average pooling or global max pooling) on the 3D feature map (i.e., the output of the backbone CNN). Although simple and effective in some situations, the global pooling operation focuses only on the statistical properties of the feature map and ignores its spatial distribution. Hence, it cannot distinguish two feature maps that have similar response values located in totally different positions. To handle this challenge, a novel method is proposed to learn discriminative spatial features. First, a self-constrained spatial transformer network (SC-STN) is introduced to handle the misalignments caused by detection errors. Then, based on the prior knowledge that the spatial structure of a pedestrian is usually stable along the vertical axis of an image, a novel vertical convolution network (VCN) is proposed to extract spatial features in the vertical direction. Extensive experimental evaluations on several benchmarks demonstrate that the proposed method achieves state-of-the-art performance while adding only a few parameters to the backbone.
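The limitation of global pooling described in this abstract can be illustrated with a small sketch. The two 2x2 single-channel feature maps below are hypothetical toy values, and `row_descriptor` is an illustrative helper invented here to show that keeping vertical position separates the two maps; it is not the paper's VCN.

```python
# Two toy single-channel feature maps (hypothetical values, not from the paper).
# They contain the same responses, but in different vertical positions.
top_heavy = [[9.0, 1.0],
             [0.0, 0.0]]
bottom_heavy = [[0.0, 0.0],
                [1.0, 9.0]]


def global_avg_pool(fmap):
    """Average over every spatial position, discarding location."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def global_max_pool(fmap):
    """Maximum over every spatial position, discarding location."""
    return max(v for row in fmap for v in row)


def row_descriptor(fmap):
    """Illustrative helper (an assumption, not the VCN): average each row
    separately so the vertical position of the responses is preserved."""
    return [sum(row) / len(row) for row in fmap]


# Global pooling cannot tell the two maps apart ...
same_gap = global_avg_pool(top_heavy) == global_avg_pool(bottom_heavy)
same_gmp = global_max_pool(top_heavy) == global_max_pool(bottom_heavy)
# ... while a descriptor that keeps vertical position can.
rows_differ = row_descriptor(top_heavy) != row_descriptor(bottom_heavy)

print(same_gap, same_gmp, rows_differ)
```

Both pooled values coincide because pooling reduces the map to order-independent statistics, whereas the per-row averages differ as soon as the responses move vertically.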

“…By learning jointly on global and local features, it aims to address existing drawbacks. Xie et al [51] proposed PLR-OSNet, which introduces Part-level resolution (PLR) into Omni-Scale Network (OSNet) [52]. It has two branches including both global and local feature representations.…”

Section: Background and Related Work A Person Reid Methods (mentioning)

confidence: 99%

An Effective Adversarial Attack on Person Re-Identification in Video Surveillance via Dispersion Reduction

Zheng, Velipasalar 2020

IEEE Access

Person re-identification across a network of cameras with disjoint views has been studied extensively due to its importance in wide-area video surveillance. This is a challenging task for several reasons, including changes in illumination and target appearance, and variations in camera viewpoint and camera intrinsic parameters. The approaches developed to re-identify a person across different camera views need to address these challenges. More recently, neural network-based methods have been proposed to solve the person re-identification problem across different camera views, achieving state-of-the-art performance. In this paper, we present an effective and generalizable attack model that generates adversarial images of people and results in a very significant drop in the performance of existing state-of-the-art person re-identification models. The results demonstrate the extreme vulnerability of the existing models to adversarial examples, and draw attention to the potential security risks that might arise from this in video surveillance. Our proposed attack decreases the dispersion of the internal feature map of a neural network to degrade the performance of several different state-of-the-art person re-identification models. We also compare our proposed attack with other state-of-the-art attack models on different person re-identification approaches, using four commonly used benchmark datasets. The experimental results show that our proposed attack outperforms the state-of-the-art attack models on the best performing person re-identification approaches by a large margin, and results in the largest drop in the mean average precision values.
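The idea of perturbing an input so that the dispersion of an internal feature map decreases can be sketched on a toy model. Everything below is an assumption made for illustration and is not taken from the abstract: dispersion is measured as the population standard deviation, the "network" is a single hypothetical linear layer `W`, and the update is a signed-gradient step clipped to a per-input budget `eps`.

```python
import math


def features(weights, x):
    """Toy 'internal feature map': one linear layer, f = W x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def dispersion(values):
    """Dispersion measured as population standard deviation (an assumption)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def variance_grad(weights, x):
    """Gradient of the feature variance with respect to the input x."""
    f = features(weights, x)
    n = len(f)
    mean = sum(f) / n
    return [sum(2.0 / n * (f[i] - mean) * weights[i][j] for i in range(n))
            for j in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def reduce_dispersion(weights, x0, eps=0.05, alpha=0.01, steps=10):
    """Signed-gradient descent on the variance, clipped to |x - x0| <= eps."""
    x = list(x0)
    for _ in range(steps):
        grad = variance_grad(weights, x)
        x = [xj - alpha * sign(gj) for xj, gj in zip(x, grad)]
        x = [min(max(xj, oj - eps), oj + eps) for xj, oj in zip(x, x0)]
    return x


# Hypothetical layer weights and clean input.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_clean = [0.0, 2.0]
x_adv = reduce_dispersion(W, x_clean)

before = dispersion(features(W, x_clean))
after = dispersion(features(W, x_adv))
print(before, after)
```

In this toy setting the perturbed input stays within the `eps` budget while the feature dispersion drops; a real attack would apply the same principle to an intermediate layer of a trained ReID network using automatic differentiation.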

“…The heart of pedestrian tracking is consistent reidentification (ReID) of those pedestrians throughout the frames of videos across multiple cameras. Similarly, on the re-identification side, recent methods leverage CNNs to extract unique features among persons [17][18][19][20][21][22][23][24]. The work in [25] learns the spatial and temporal behavior of objects by translating the feature map of the Region of Interest (RoI) into an adaptive body-action unit.…”

Section: Related Work a Pedestrian Detection Re-identification (mentioning)

confidence: 99%

REVAMP²T: Real-Time Edge Video Analytics for Multicamera Privacy-Aware Pedestrian Tracking

Neff, Mendieta, Mohan, et al. 2020

IEEE Internet Things J.

This article presents REVAMP²T, Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking, as an integrated end-to-end IoT system for privacy-built-in decentralized situational awareness. REVAMP²T presents novel algorithmic and system constructs to push deep learning and video analytics next to IoT devices (i.e., video cameras). On the algorithm side, REVAMP²T proposes a unified integrated computer vision pipeline for detection, re-identification, and tracking across multiple cameras without the need for storing the streaming data. At the same time, it avoids facial recognition, and tracks and re-identifies pedestrians based on their key features at runtime. On the IoT system side, REVAMP²T provides infrastructure to maximize hardware utilization on the edge, orchestrates global communications, and provides system-wide re-identification, without the use of personally identifiable information, for a distributed IoT network. For the results and evaluation, this article also proposes a new metric, Accuracy • Efficiency (AE), for holistic evaluation of IoT systems for real-time video analytics based on accuracy, performance, and power efficiency. REVAMP²T outperforms current state-of-the-art by as much as thirteen-fold AE improvement.
