2019 IEEE International Conference on Multimedia and Expo (ICME) 2019
DOI: 10.1109/icme.2019.00092

A Mask Based Deep Ranking Neural Network for Person Retrieval

Abstract: Person retrieval faces many challenges, including cluttered backgrounds, appearance variations (e.g., illumination, pose, occlusion) across camera views, and the similarity among images of different persons. To address these issues, we put forward a novel mask based deep ranking neural network with a skipped fusing layer. Firstly, to alleviate the problem of cluttered background, masked images with only the foreground regions are incorporated as input in the proposed neural network. Secondly, to reduce the…
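
The masked-image input mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary foreground mask is already available (for example, from a segmentation model) and simply blanks out background pixels. The function name `apply_foreground_mask`, the list-based image representation, and the `fill` parameter are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities (assumed format)
Mask = List[List[int]]     # binary mask: 1 = foreground (person), 0 = background


def apply_foreground_mask(image: Image, mask: Mask, fill: float = 0.0) -> Image:
    """Return a copy of `image` with every background pixel replaced by `fill`."""
    same_height = len(image) == len(mask)
    same_widths = all(len(r) == len(m) for r, m in zip(image, mask))
    if not (same_height and same_widths):
        raise ValueError("image and mask must have the same shape")
    # Keep a pixel where the mask is 1, otherwise substitute the fill value.
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x3 example: only the pixels marked 1 in the mask survive.
image = [[0.2, 0.9, 0.4],
         [0.1, 0.8, 0.3]]
mask = [[0, 1, 0],
        [0, 1, 1]]
masked = apply_foreground_mask(image, mask)
```

In a setup like the one the abstract describes, both the original image and such a masked copy would be fed to the network; how the two are fused is not specified in the visible part of the abstract and is not modeled here.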

Citations: cited by 47 publications (32 citation statements)
References: 20 publications

“…In [42], body poses/parts are first detected and deep neural networks are designed for representation learning on both the local parts and global region. Some works rely on constrained attention selection mechanisms from human mask/part/pose to implicitly calibrate misaligned images [32,25,45,14,34].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Song et al [32] use the source image and the corresponding binary segmentation mask as inputs to extract discriminative features that are invariant to background clutter. Qi et al [24] adopt both the source image and the masked image as the network inputs, where a multi-layer fusion scheme and a ranking loss are developed to fuse the different levels of features and optimize the network, respectively. The mask-guided methods can extract aligned local features and focus on foreground areas by exploiting the results from semantic segmentation.…”
Section: A General Person Re-id Methods
Citation type: mentioning
confidence: 99%
“…For example, Song et al [20] proposed a mask-guided background features and pulling body features closer to the full image. Qi et al [21] used the mask image together with the raw image as inputs and generated fusing features from different levels. Although these methods used mask information, they did not pay enough attention to masks.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Thus, combining segmentation and person ReID becomes a new way to obtain body regions explicitly. Qi et al [21] designed two branches, which use both raw and masked images as inputs, while Song et al [20] concatenated them to become a single image. However, due to huge difference between segmentation and ReID datasets in resolution, image size, and object classes, body mask generating faces many challenges.…”
Section: Related Work
Citation type: mentioning
confidence: 99%