2015 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2015.519
Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices

Cited by 85 publications (118 citation statements)
References 24 publications
“…Representations based directly on raw joint positions are widely used due to the simple acquisition from sensors. Although normalization procedures can make human representations partially invariant to view and scale variations, more sophisticated construction techniques (e.g., deep learning) are typically needed to develop robust human representations. Representations without involving temporal information are suitable to address problems such as pose and gesture recognition.…”

[Table rows embedded in the quote, listing skeleton-based representations (reference | approach | encoding | structure | construction):
[210] | Moving Pose | BoW | Lowlv | Dict
Bloom et al [211] | Dynamic Features | Conc | Lowlv | Hand
Vemulapalli et al [212] | Lie Group Manifold | Conc | Manif | Hand
Zhang and Parker [213] | BIPOD | Stat | Body | Hand
Lv and Nevatia [214] | HMM/Adaboost | Conc | Lowlv | Hand
Herda et al [215] | Quaternions | Conc | Body | Hand
Negin et al [216] | RDF Kinematic Features | Conc | Lowlv | Unsup
Masood et al [217] | Logistic Regression | Conc | Lowlv | Hand
Meshry et al [218] | Angle & Moving Pose | BoW | Lowlv | Unsup
Tao and Vidal [219] | Moving Poselets | BoW | Body | Dict
Eweiwi et al [220] | Discriminative Action Features | Conc | Lowlv | Unsup
Wang et al [221] | Ker-RP | Stat | Lowlv | Hand
Salakhutdinov et al [222] | HD Models | Conc | Lowlv | Deep]

Section: Discussion (mentioning)
confidence: 99%
“…An advantage of this statistics-based encoding approach is that the size of the final feature vector is independent of the number of frames. Moreover, Wang et al [221] proposed an open framework that uses the kernel matrix over feature dimensions as a generic representation, extending the covariance representation to a much broader family of kernel-based representations.…”
Section: Statistics-based Encoding (mentioning)
confidence: 99%
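The quoted statements describe the core idea of the cited paper: instead of the covariance matrix of a (d, n) feature matrix, the representation is a (d, d) kernel matrix computed over the feature dimensions (rows), so its size does not depend on the number of frames or samples. Below is a minimal sketch of that idea, assuming an RBF kernel; the function names, the gamma value, and the NumPy formulation are illustrative choices, not taken from the paper.

import numpy as np

def covariance_descriptor(X):
    """Classical covariance descriptor of a (d, n) feature matrix:
    d feature dimensions observed over n samples (frames or pixels)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc @ Xc.T / (X.shape[1] - 1)

def kernel_matrix_representation(X, gamma=0.1):
    """Kernel matrix over feature dimensions, in the spirit of Ker-RP:
    K[i, j] = exp(-gamma * ||f_i - f_j||^2) for rows f_i, f_j of X,
    so K is (d, d) regardless of the number of samples n."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

Both functions return a (d, d) matrix, which is what the first quoted sentence means by the feature size being independent of the number of frames.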
“…Wang et al [30] proposed an open framework to use the kernel matrix over feature dimensions as a generic representation. This work uses a non-linear kernel matrix as the representation, but the kernel functions are defined in the Euclidean space and the resulting representation describes similarities between pixels at different locations, as in the traditional CovDs [30]. Our work proposes to capture the similarities between sub-image sets, which contain more useful information.…”
Section: Comparison with Other Improved Versions of Traditional CovDs (mentioning)
confidence: 99%
“…For the comparative experiments with existing descriptors [4,20,30], we first resize all images to 24 × 24 and then use the intensity values to generate their corresponding representations. For our proposed framework, the sub-image sets are obtained with a 6 × 6 sliding window with a spatial step of 2 pixels for the CG, ETH-80 and MDSD datasets, and a spatial step of 3 pixels for the Virus dataset.…”
Section: A Comparison with Existing Descriptors (mentioning)
confidence: 99%
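The last quoted statement fully specifies the patch extraction: 24 × 24 intensity images, a 6 × 6 sliding window, and a spatial step of 2 (or 3) pixels. A minimal sketch of that extraction is below; the function name and the choice to flatten each patch into a row vector are assumptions for illustration.

import numpy as np

def extract_subimage_patches(image, win=6, step=2):
    """Collect win x win patches from a 2-D intensity image (e.g., 24 x 24)
    with the given spatial step, one flattened patch per row."""
    h, w = image.shape
    patches = [image[r:r + win, c:c + win].ravel()
               for r in range(0, h - win + 1, step)
               for c in range(0, w - win + 1, step)]
    return np.stack(patches)  # shape: (num_patches, win * win)

For a 24 × 24 image with win=6 and step=2, this yields a 10 × 10 grid of window positions, i.e., 100 sub-images per image.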