2022
DOI: 10.1609/aaai.v36i1.19991

Learning Disentangled Attribute Representations for Robust Pedestrian Attribute Recognition

Abstract: Although various methods have been proposed for pedestrian attribute recognition, most studies follow the same feature learning mechanism, i.e., learning a shared pedestrian image feature to classify multiple attributes. However, this mechanism leads to low-confidence predictions and a non-robust model at inference time. In this paper, we investigate why this is the case. We show mathematically that the central cause is that the optimal shared feature cannot maintain high similarities with mul…
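The abstract's contrast between a single shared feature and disentangled, attribute-specific features can be made concrete. Below is a minimal PyTorch sketch of the two head designs; it is illustrative only, not the authors' DAFL implementation, and the class names, per-attribute linear projections, and dimensions are my own assumptions.

```python
import torch
import torch.nn as nn

class SharedFeatureHead(nn.Module):
    """The mechanism the abstract criticizes: one shared image feature
    is scored against every attribute classifier at once."""
    def __init__(self, feat_dim: int, num_attrs: int):
        super().__init__()
        # One weight vector per attribute; the single shared feature
        # must stay similar to all of them simultaneously.
        self.classifiers = nn.Linear(feat_dim, num_attrs)

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        # shared_feat: (B, feat_dim) -> logits: (B, num_attrs)
        return self.classifiers(shared_feat)

class DisentangledHead(nn.Module):
    """Sketch of the alternative: a dedicated feature per attribute, so
    each classifier only has to match its own representation."""
    def __init__(self, feat_dim: int, num_attrs: int):
        super().__init__()
        # Hypothetical per-attribute projections; DAFL derives its
        # attribute-specific features differently (see the paper).
        self.projections = nn.ModuleList(
            nn.Linear(feat_dim, feat_dim) for _ in range(num_attrs))
        self.classifiers = nn.ModuleList(
            nn.Linear(feat_dim, 1) for _ in range(num_attrs))

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        logits = [clf(proj(shared_feat))  # (B, 1) per attribute
                  for proj, clf in zip(self.projections, self.classifiers)]
        return torch.cat(logits, dim=1)   # (B, num_attrs)
```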

Cited by 18 publications (13 citation statements); references 25 publications. Citation statements from citing papers, ordered by relevance:
“…With finetuning, our UniHCP achieves new SOTAs on nine of the twelve datasets and on-par performance on the remaining three, even without task-specific architectural design or task-specific priors, showing that UniHCP extracts complementary knowledge among human-centric tasks. Concretely, Table 4 shows that in the human attribute recognition task, UniHCP significantly surpasses the previous SOTA DAFL [29] by +3.79% mA on PA-100K and +1.20% mA on RAPv2, respectively, which indicates that UniHCP extracts the shared attribute information well across datasets using the output unit of global probabilities in the interpreter. Second, UniHCP also pushes the performance of another important human task, i.e., human parsing, to a new level.…”
Section: In-Pretrain Dataset Results (citation type: mentioning; confidence: 89%)
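The "+3.79% mA" figures here use the label-based mean accuracy (mA) metric standard in pedestrian attribute recognition: for each attribute, average the recall on positive and negative examples, then average over attributes. A small NumPy sketch of that metric (array shapes and variable names are my own):

```python
import numpy as np

def mean_accuracy(preds: np.ndarray, labels: np.ndarray) -> float:
    """Label-based mA: preds and labels are (N, A) binary arrays
    of N samples by A attributes."""
    per_attr = []
    for a in range(labels.shape[1]):
        pos = labels[:, a] == 1
        neg = labels[:, a] == 0
        tpr = preds[pos, a].mean() if pos.any() else 0.0        # positive recall
        tnr = (1 - preds[neg, a]).mean() if neg.any() else 0.0  # negative recall
        per_attr.append((tpr + tnr) / 2)
    return float(np.mean(per_attr)) * 100  # reported as a percentage

# Tiny example: 4 pedestrians, 2 attributes
labels = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
preds  = np.array([[1, 0], [0, 1], [0, 0], [0, 1]])
print(f"mA = {mean_accuracy(preds, labels):.2f}")  # mA = 87.50
```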
“…The whole training takes 120 hours in total on 88 NVIDIA V100 GPUs.…”
The quote continues with a flattened results table, reconstructed here (RAPv2 values are not shown in the snippet):

Method        PA-100K   RAPv2
SSC [28]      81.87     –
C-Tran [36]   81.53     –
Q2L [60]      80.72     –
L2L [46]      82.37     –
DAFL [29]     83…       (truncated in the source)

Section: Pedestrian Detection (6 Datasets) (citation type: mentioning; confidence: 99%)
“…To show the effectiveness of the proposed network, we compare it with several state-of-the-art PAR methods: HPNet [55], VeSPA [59], JRL [60], PGDM [30], MsVAA [37], GRL [31], VAC [15], JLPLS-PAA [61], RA [62], ALM [23], MT-CAS [38], DTM+AWK [63], JLAC [39], Baseline [32], SSC [64], DAFL [65], and VTB [47].…”
Section: Comparative Results (citation type: mentioning; confidence: 99%)
“…To demonstrate the effectiveness of the proposed network, we compared it with several competing PAR methods, including HPNet [33], VeSPA [34], PGDM [5], MsVAA [34], GRL [6], JLPLS-PAA [35], ALM [14], MT-CAS [10], DTM+AWK [36], JLAC [37], Baseline [29], SSC [28], Label2Label [38], DAFL [39], VTB [40] and EAM [12].…”
Section: Comparison with State-of-the-Art Methods (citation type: mentioning; confidence: 99%)