SHPR-Net: Deep Semantic Hand Pose Regression From Point Clouds

Chen, Xinghao; Wang, Guijin; Zhang, Cai‐Rong; Kim, Taekyun; Ji, Xiangyang

doi:10.1109/access.2018.2863540

Cited by 82 publications

(64 citation statements)

References 37 publications

“…In the current work, we learn encoders for RGB images and point clouds and decoders for 3D hand poses, point clouds and heat maps of the 2D hand key points on the RGB image. We choose to convert the 2.5D depth information as 3D point clouds instead of standard depth maps, due to its superior performance in hand pose estimation, as shown in previous works [10,4,6]. Heat maps are chosen as a third modality for decoding to encourage convergence of the RGB encoder, since the heat maps are closely related to activation areas on the RGB images.…”

Section: Encoder and Decoder Modules (mentioning)

confidence: 99%

Aligning Latent Spaces for 3D Hand Pose Estimation

Yang, Lee, et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Hand pose estimation from monocular RGB inputs is a highly challenging task. Many previous works for monocular settings only used RGB information for training despite the availability of corresponding data in other modalities such as depth maps. In this work, we propose to learn a joint latent representation that leverages other modalities as weak labels to improve RGB-based hand pose estimation. By design, our architecture is highly flexible in embedding various diverse modalities such as heat maps, depth maps and point clouds. In particular, we find that encoding and decoding the point cloud of the hand surface can improve the quality of the joint latent representations. Experiments show that with the aid of other modalities during training, our proposed method boosts the accuracy of RGB-based hand pose estimation systems and significantly outperforms state-of-the-art on two public benchmarks.

“…Their method achieved satisfying performance, but tedious pre-processing steps are required, which includes oriented bounding box (OBB) calculation, surface normal estimation and k-nearest-neighbours search for all points. Chen et al improves Ge's method by using a spatial transformer network to replace the OBB and furthermore added a auxiliary hand segmentation task to improve the performance [3]. Their method can be trained end-to-end without OBB, but the segmentation ground-truth data require a extra precomputation step from the pose data.…”

Section: Deep Learning For Hand Pose Estimation (mentioning)

confidence: 99%

Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer

Lee

2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Recently, 3D input data based hand pose estimation methods have shown state-of-the-art performance, because 3D data capture more spatial information than the depth image. Whereas 3D voxel-based methods need a large amount of memory, PointNet based methods need tedious preprocessing steps such as K-nearest neighbour search for each point. In this paper, we present a novel deep learning hand pose estimation method for an unordered point cloud. Our method takes 1024 3D points as input and does not require additional information. We use Permutation Equivariant Layer (PEL) as the basic element, where a residual network version of PEL is proposed for the hand pose estimation task. Furthermore, we propose a voting-based scheme to merge information from individual points to the final pose output. In addition to the pose estimation task, the voting-based scheme can also provide point cloud segmentation result without ground-truth for segmentation. We evaluate our method on both NYU dataset and the Hands2017Challenge dataset. Our method outperforms recent state-of-the-art methods, where our pose accuracy is currently the best for the Hands2017Challenge dataset.

“…As a general trend, ever deeper and more sophisticated neural network architectures are dominating hand pose estimation methods. They are highly accurate [4,5,9,11,12,20,23,31,50] when trained with large amounts of labeled samples. However, given that accurate 3D annotations are extremely difficult to obtain, a number of works approach the problem with deep generative models to leverage unlabelled data [1,3,21,28,29,36,49].…”

Section: Related Work (mentioning)

confidence: 99%

Self-Supervised 3D Hand Pose Estimation Through Training by Fitting

Wan, Probst, Gool, et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

We present a self-supervision method for 3D hand pose estimation from depth maps. We begin with a neural network initialized with synthesized data and fine-tune it on real but unlabelled depth maps by minimizing a set of data-fitting terms. By approximating the hand surface with a set of spheres, we design a differentiable hand renderer to align estimates by comparing the rendered and input depth maps. In addition, we place a set of priors including a data-driven term to further regulate the estimate's kinematic feasibility. Our method makes highly accurate estimates comparable to current supervised methods which require large amounts of labelled training samples, thereby advancing state-of-the-art in unsupervised learning for hand pose estimation.
