“…1) is important as it helps to understand, e.g., human-object interactions [7,6,3,1] and to perform robotic… Discriminative methods based on convolutional neural networks (CNNs) have shown very promising performance in estimating 3D hand poses from either RGB images [43,68,4,14,29,46] or depth maps [65,30,50,58,64,28,38,2]. However, the predictions are based on coarse skeletal representations, and explicit kinematic and geometric mesh constraints are often not considered.…”