Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation

Lin, Beibei; Zhang, Shunli; Yu, Xin

doi:10.1109/iccv48922.2021.01438

Cited by 156 publications

(127 citation statements)

References 25 publications

“…Then, gait patterns are modeled by parameters like lengths of limbs, angles of joints, and relative positions of body parts [3,48]. The model-free methods mainly adopt the silhouettes obtained by background subtraction from video frames [5,9,11,15,16,22,32,46,57,58]. In particular, Han et al proposed to aggregate a sequence of silhouettes into a compact Gait Energy Image (GEI) [11] which was widely used by the following methods [32,46].…”

Section: Related Work (mentioning)

confidence: 99%

Gait Recognition in the Wild with Dense 3D Representations and A Benchmark

Zheng, Liu, Liu et al. 2022

Preprint

Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes. However, humans live and walk in the unconstrained 3D space, so projecting the 3D human body onto the 2D plane will discard a lot of crucial information like the viewpoint, shape, and dynamics for gait recognition. Therefore, this paper aims to explore dense 3D representations for gait recognition in the wild, which is a practical yet neglected problem. In particular, we propose a novel framework to explore the 3D Skinned Multi-Person Linear (SMPL) model of the human body for gait recognition, named SMPLGait. Our framework has two elaborately-designed branches of which one extracts appearance features from silhouettes, the other learns knowledge of 3D viewpoints and shapes from the 3D SMPL model. In addition, due to the lack of suitable datasets, we build the first large-scale 3D representation-based gait recognition dataset, named Gait3D. It contains 4,000 subjects and over 25,000 sequences extracted from 39 cameras in an unconstrained indoor scene. More importantly, it provides 3D SMPL models recovered from video frames which can provide dense 3D information of body shape, viewpoint, and dynamics. Based on Gait3D, we comprehensively compare our method with existing gait recognition approaches, which reflects the superior performance of our framework and the potential of 3D representations for gait recognition in the wild. The code and dataset are available at https://gait3d.github.io.
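The Gait Energy Image mentioned in the Related Work statement above is commonly described as the pixel-wise average of aligned binary silhouettes. The following is a minimal sketch under that assumption, not code from any of the cited papers. The function name `gait_energy_image` and the toy 2x2 frames are hypothetical, and the frames are assumed to be already size-normalized and aligned.

```python
from typing import List

# A binary silhouette frame: 1 marks a body pixel, 0 marks background.
Frame = List[List[int]]


def gait_energy_image(silhouettes: List[Frame]) -> List[List[float]]:
    """Average a sequence of aligned binary silhouettes pixel by pixel.

    Assumption: every frame has the same height and width and has
    already been aligned (for example, centered on the body).
    """
    if not silhouettes:
        raise ValueError("need at least one silhouette frame")
    height, width = len(silhouettes[0]), len(silhouettes[0][0])
    for frame in silhouettes:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same size")
    n = len(silhouettes)
    # Each output pixel is the fraction of frames in which it was foreground.
    return [
        [sum(frame[y][x] for frame in silhouettes) / n for x in range(width)]
        for y in range(height)
    ]


# Toy example: two 2x2 frames (hypothetical data).
frames = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
]
gei = gait_energy_image(frames)
print(gei)  # [[0.0, 1.0], [0.5, 1.0]]
```

Pixels that are foreground in every frame keep the value 1.0, while pixels that change across the sequence take intermediate values. Collapsing the sequence into one image this way is compact but drops the frame order.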

“…The GEI-based methods [28], [29], [30], [31], [32] greatly compressed computational cost but lost discriminative expression. In contrast, the video-based approaches [8], [10], [11], [12], [13], [25], [26], [33] processed gait sequences frame by frame, which maintained the frame-level discriminative feature in a large extent, and benefited the networks to learn temporal representation. Our approach belongs to appearance-based method and takes silhouette sequences as input.…”

Section: Gait Recognition (mentioning)

confidence: 99%

“…LSTM networks were applied in [10], [11] to achieve long-short temporal modeling, which fused temporal clues by temporal accumulation. With the help of stacked 3D blocks, MT3D [12] and GaitGL [13] incorporated temporal information with small and large scales, then concatenated or summed these features as outputs. 3DLocal [34] applied 3D CNN to obtain different local parts, and fused them with feature concatenation.…”

Section: Temporal Modeling (mentioning)

confidence: 99%

“…(1) Follow the settings in [13], [18], we set the value of B (number of training samples in one iteration) as 64, 256 and 256 on CASIA-B [15], OU-MVLP [16] and GREW [17] datasets respectively. (2) The value of N (input frame number) and K (part division number) are set as 30 and 32.…”

Section: Hyper-parameters (mentioning)

confidence: 99%

“…At present, multi-layer convolutions have been widely used in current methods to model multi-scale temporal information. And they aggregated multi-scale features in a summation [8], [9], [10], [11] or a concatenation way [12], [13]. However, since the aggregation methods are fixed, these manners are not flexible enough to adapt to variations of complex motion and realistic factors, i.e., self occlusion between body parts and change of camera viewpoints.…”

Section: Introduction (mentioning)

confidence: 99%

Context-Sensitive Temporal Feature Learning for Gait Recognition

Huang, Zhu, Wang et al. 2021

2021 IEEE/CVF International Conference on Computer Vision (ICCV)

Although gait recognition has drawn increasing research attention recently, it remains challenging to learn discriminative temporal representation, since the silhouette differences are quite subtle in spatial domain. Inspired by the observation that human can distinguish gaits of different subjects by adaptively focusing on temporal clips with different time scales, we propose a context-sensitive temporal feature learning (CSTL) network for gait recognition. CSTL produces temporal features in three scales, and adaptively aggregates them according to the contextual information from local and global perspectives. Specifically, CSTL contains an adaptive temporal aggregation module that subsequently performs local relation modeling and global relation modeling to fuse the multi-scale features. Besides, in order to remedy the spatial feature corruption caused by temporal operations, CSTL incorporates a salient spatial feature learning (SSFL) module to select groups of discriminative spatial features. Particularly, we utilize transformers to implement the global relation modeling and the SSFL module. To the best of our knowledge, this is the first work that adopts transformer in gait recognition. Extensive experiments conducted on three datasets demonstrate the state-of-the-art performance. Concretely, we achieve rank-1 accuracies of 98.7%, 96.2% and 88.7% under normal-walking, bag-carrying and coat-wearing conditions on CASIA-B, 97.5% on OU-MVLP and 50.6% on GREW.
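The Introduction statement above contrasts two fixed ways of fusing multi-scale temporal features: summation and concatenation. The sketch below illustrates only that difference on plain Python lists. It is not the implementation of CSTL, GaitGL, or any other cited method, and the names `aggregate_sum`, `aggregate_concat`, `short_scale`, and `long_scale` are hypothetical.

```python
from typing import List

Vector = List[float]


def aggregate_sum(features: List[Vector]) -> Vector:
    """Fuse per-scale feature vectors by element-wise summation.

    All vectors must have the same length; the output keeps that length.
    """
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("summation needs equally sized feature vectors")
    return [sum(values) for values in zip(*features)]


def aggregate_concat(features: List[Vector]) -> Vector:
    """Fuse per-scale feature vectors by concatenation.

    The output length is the sum of the input lengths.
    """
    return [value for f in features for value in f]


# Toy features from two temporal scales (hypothetical values).
short_scale = [1.0, 2.0]
long_scale = [0.5, 0.5]

summed = aggregate_sum([short_scale, long_scale])
concatenated = aggregate_concat([short_scale, long_scale])
print(summed)        # [1.5, 2.5]
print(concatenated)  # [1.0, 2.0, 0.5, 0.5]
```

Both rules are fixed: neither changes how much each scale contributes for a given input, which is the limitation the citing paper raises when motivating adaptive aggregation.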

A benchmark for clothes variation in person re‐identification

Wang, Chen et al. 2020

Int J Intell Syst

Person re-identification (re-ID) has drawn attention significantly in the computer vision society due to its application and research significance. It aims to retrieve a person of interest across different camera views. However, there are still several factors that hinder the applications of person re-ID. In fact, most common data sets either assume that pedestrians do not change their clothing across different camera views or are taken under constrained environments. Those constraints simplify the person re-ID task and contribute to early development of person re-ID, yet a person has a great possibility to change clothes in real life. To facilitate the research toward conquering those issues, this paper mainly introduces a new benchmark data set for person reidentification. To the best of our knowledge, this data set is currently the most diverse for person re-identification. It contains 107 persons with 9,738 images, captured in 15 indoor/outdoor scenes from September 2019 to December 2019, varying according to viewpoints, lighting, resolutions, human pose, seasons, backgrounds, and clothes especially. We hope that this benchmark data set will encourage further research on person re-identification with clothes variation. Moreover, we also perform extensive analyses on this data set using several
