This work addresses the novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods for 3D hand analysis from monocular RGB images focus only on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of the hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface that contains richer information about both 3D hand shape and pose. To train the networks with full supervision, we create a large-scale synthetic dataset containing both ground-truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach that leverages the depth map as weak supervision during training. Through extensive evaluations on our proposed new dataset and two public datasets, we show that our method can produce accurate and reasonable 3D hand meshes, and can achieve superior 3D hand pose estimation accuracy compared with state-of-the-art methods.
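The core building block of a Graph CNN operating on a hand mesh is a layer that aggregates features from each vertex's neighbors and applies a shared linear transform. The sketch below is a generic, degree-normalized graph-convolution layer in NumPy, not the paper's exact architecture; the toy 4-vertex chain graph, weights, and feature sizes are illustrative assumptions.

```python
import numpy as np

def graph_conv(X, A, W):
    """One generic graph-convolution layer: average each vertex's features
    with its neighbors' (self-loops included, degree-normalized), then
    apply a shared linear transform W followed by ReLU."""
    A_hat = A + np.eye(A.shape[0])            # adjacency with self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # inverse degree matrix
    return np.maximum(0.0, D_inv @ A_hat @ X @ W)

# Toy "mesh": 4 vertices in a chain, 3 input features, 2 output features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))   # per-vertex input features
W = rng.standard_normal((3, 2))   # shared layer weights
out = graph_conv(X, A, W)
print(out.shape)  # (4, 2): one transformed feature vector per vertex
```

Stacking such layers over the mesh connectivity lets each vertex's predicted 3D coordinates depend on an increasingly wide neighborhood of the hand surface.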
Using machine vision to identify and sort scattered regular targets is an urgent problem in automated production lines. This study proposes a three-dimensional (3D) recognition method that combines monocular vision with machine learning algorithms. Based on the color characteristics of the targets, the original color image is converted to YCbCr mode, and the 2D Otsu algorithm is used to perform gray-level image segmentation on the Cb channel. Haar-feature training is then carried out. A comparison between Haar-feature training and the Hough-transform method showed that the Haar-feature AdaBoost trainer achieved a recognition time of 31.00 ms with a false recognition rate of 3.91%. A strong classifier was formed by weighted combination, and a Hough contour-transform algorithm was used to correct the normal vector between the plane coordinate system and the camera coordinate system. The monocular vision system ensured that the camera's field of view was not obstructed while the dots were being struck. The angles between the targets and the horizontal plane were measured and calculated from the coordinate points of the identified plane features. The test results, compared against the Otsu and AdaBoost trainer, showed an error of no more than 0.25 mm between the prediction and training sets, with a correct-recognition rate of up to 95%. This shows that the Otsu and Haar-feature method based on the AdaBoost algorithm is feasible within a certain error range and meets the engineering requirements for solving the poses of regular 3D targets in automated production.
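The segmentation front end described above (RGB → YCbCr, then Otsu thresholding on the Cb channel) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses the standard 1D Otsu method rather than the 2D variant the study mentions, the ITU-R BT.601 conversion constants, and a synthetic test image invented for the demo.

```python
import numpy as np

def rgb_to_cb(img):
    """Extract the Cb (blue-difference chroma) channel from an RGB image
    (uint8, H x W x 3), using the ITU-R BT.601 conversion."""
    r = img[..., 0].astype(float)
    g = img[..., 1].astype(float)
    b = img[..., 2].astype(float)
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return np.clip(cb, 0, 255).astype(np.uint8)

def otsu_threshold(gray):
    """Classic 1D Otsu: choose the gray level maximizing the
    between-class variance of the foreground/background split."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                   # class-0 probability
    mu = np.cumsum(p * np.arange(256))     # class-0 cumulative mean
    mu_t = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b))

# Synthetic scene: a blue (high-Cb) target patch on a gray background
img = np.full((64, 64, 3), 120, dtype=np.uint8)
img[16:48, 16:48] = [30, 30, 220]
cb = rgb_to_cb(img)
t = otsu_threshold(cb)
mask = cb > t          # binary segmentation of the target region
print(mask[32, 32], mask[0, 0])  # True False
```

Separating targets on the chroma channel rather than on luminance makes the split robust to lighting variation, which is the usual motivation for working in YCbCr before thresholding.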