Modern face alignment methods have become quite accurate at predicting the locations of facial landmarks, but they typically neither estimate the uncertainty of their predicted locations nor predict whether landmarks are visible. In this paper, we present a novel framework for jointly predicting landmark locations, associated uncertainties of these predicted locations, and landmark visibilities. We model these as mixed random variables and estimate them using a deep network trained with our proposed Location, Uncertainty, and Visibility Likelihood (LUVLi) loss. In addition, we release an entirely new labeling of a large face alignment dataset with over 19,000 face images in a full range of head poses. Each face is manually labeled with the ground-truth locations of 68 landmarks, with the additional information of whether each landmark is unoccluded, self-occluded (due to extreme head poses), or externally occluded. Not only does our joint estimation yield accurate estimates of the uncertainty of predicted landmark locations, but it also yields state-of-the-art estimates for the landmark locations themselves on multiple standard face alignment datasets. Our method's estimates of the uncertainty of predicted landmark locations could be used to automatically identify input images on which face alignment fails, which can be critical for downstream tasks.
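To make the joint estimation concrete, here is a minimal sketch of a loss in the spirit of LUVLi, assuming a Gaussian likelihood over each landmark's location with a network-predicted covariance and a Bernoulli term for visibility. The tensor names, the visibility masking, and the equal weighting of the two terms are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def joint_landmark_loss(pred_mean, pred_cov, vis_logit, gt_loc, gt_vis):
    """Joint location/uncertainty/visibility loss (illustrative sketch).

    pred_mean : (N, L, 2)    predicted landmark locations
    pred_cov  : (N, L, 2, 2) predicted per-landmark covariances (SPD)
    vis_logit : (N, L)       predicted visibility logits
    gt_loc    : (N, L, 2)    ground-truth landmark locations
    gt_vis    : (N, L)       1.0 if the landmark is visible, else 0.0
    """
    diff = (gt_loc - pred_mean).unsqueeze(-1)                 # (N, L, 2, 1)
    inv_cov = torch.inverse(pred_cov)                         # (N, L, 2, 2)
    # Gaussian negative log-likelihood of the ground-truth location.
    mahal = (diff.transpose(-1, -2) @ inv_cov @ diff).squeeze(-1).squeeze(-1)
    loc_nll = 0.5 * (mahal + torch.logdet(pred_cov))          # (N, L)
    # Score location/uncertainty only on landmarks that are visible.
    loc_term = (loc_nll * gt_vis).sum() / gt_vis.sum().clamp(min=1)
    # Bernoulli likelihood for visibility.
    vis_term = F.binary_cross_entropy_with_logits(vis_logit, gt_vis)
    return loc_term + vis_term
```

Under this formulation, the predicted covariance directly serves as the uncertainty estimate mentioned in the abstract: a large determinant flags landmarks (and images) on which alignment is likely failing.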
Using a tactile array sensor to recognize an object often requires multiple touches at different positions. This process tends to move or rotate the object, which inevitably makes object recognition harder. To cope with unknown object movement, this paper proposes a new tactile-SIFT descriptor that extracts features from the gradients of the tactile image to represent objects, making the features invariant to object translation and rotation. Tactile-SIFT segments a tactile image into overlapping subpatches, each of which is represented by a dn-dimensional gradient vector, similar to the classic SIFT descriptor. Tactile-SIFT descriptors obtained from multiple touches form a dictionary of k words, and the bag-of-words method is then used to identify objects. The proposed method has been validated by classifying 18 real objects with data from an off-the-shelf tactile sensor. The parameters of the tactile-SIFT descriptor, including the dimension size dn and the number of subpatches sp, are studied. Taking both classification accuracy and time efficiency into consideration, the best performance is obtained with an 8-D descriptor and three subpatches. Using tactile-SIFT, a recognition rate of 91.33% is achieved with a dictionary of 50 clusters using only 15 touches.
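The following sketch illustrates the descriptor idea: each overlapping subpatch of the tactile image is summarized by a magnitude-weighted histogram of gradient orientations (dn = 8 bins, sp = 3 subpatches, as in the reported optimum). The strip-shaped subpatch layout, the 50% overlap, and the normalization are assumptions for illustration; the paper's exact geometry may differ.

```python
import numpy as np

def tactile_sift_descriptors(patch, n_bins=8, n_subpatches=3):
    """Per-subpatch gradient-orientation descriptors for a tactile image.

    patch : 2-D array of tactile pressure readings.
    Returns an (n_subpatches, n_bins) array; each row is one
    dn-dimensional descriptor that can be quantized into a
    bag-of-words dictionary (e.g., k-means with k = 50 clusters).
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.mod(np.arctan2(gy, gx), 2 * np.pi)

    # Split the image into overlapping horizontal strips (illustrative choice).
    h = patch.shape[0]
    step = max(1, h // n_subpatches)
    descriptors = []
    for i in range(n_subpatches):
        rows = slice(i * step, min(h, i * step + 2 * step))   # ~50% overlap
        hist, _ = np.histogram(ori[rows], bins=n_bins,
                               range=(0, 2 * np.pi),
                               weights=mag[rows])
        hist /= (np.linalg.norm(hist) + 1e-8)                 # L2-normalize
        descriptors.append(hist)
    return np.stack(descriptors)
```

Because the histograms summarize local gradient statistics rather than absolute positions, the resulting words are largely insensitive to where on the sensor the object contact occurs, which is what provides the translation invariance claimed above.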
This paper presents a novel framework for integrating vision and tactile sensing by localizing tactile readings in a visual object map. Intuitively, there are correspondences, e.g., prominent features, between visual and tactile object identification. To exploit this in robotics, we propose to localize tactile readings in visual images by sharing the same set of feature descriptors across the two sensing modalities. Localization is then treated as a probabilistic estimation problem and solved in a recursive Bayesian filtering framework, for which we build a feature-based measurement model and a Gaussian motion model. In our tests, a tactile array sensor generates tactile images during interaction with objects, and the results demonstrate the feasibility of the proposed framework.
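One common way to realize such a recursive Bayesian filter is a particle filter; the sketch below shows a single predict/correct step under that assumption. The `map_features` lookup, the Gaussian motion noise, and the exponential weighting of descriptor distance are illustrative choices, not the paper's specific models.

```python
import numpy as np

def bayes_filter_step(particles, weights, motion, map_features,
                      tactile_feature, motion_sigma=2.0):
    """One recursive Bayesian filtering step (illustrative particle-filter form).

    particles       : (P, 2) hypothesized touch positions in the visual map
    weights         : (P,)   current belief over those hypotheses
    motion          : (2,)   estimated sensor displacement since the last touch
    map_features    : callable mapping a map position to its visual descriptor
    tactile_feature : descriptor extracted from the current tactile reading
    """
    # Prediction: Gaussian motion model around the estimated displacement.
    particles = particles + motion + np.random.normal(
        0.0, motion_sigma, particles.shape)
    # Correction: feature-based measurement model; particles whose local
    # visual descriptor matches the tactile descriptor get higher weight.
    for i, p in enumerate(particles):
        dist = np.linalg.norm(map_features(p) - tactile_feature)
        weights[i] *= np.exp(-0.5 * dist ** 2)
    weights /= weights.sum()
    return particles, weights
```

In practice one would also resample the particles when the weights degenerate; that step is omitted here for brevity.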
Both head pose estimation and face alignment have been well studied in recent years, given their wide application in human-computer interaction, avatar animation, and face recognition/verification. However, even the most sophisticated face alignment methods still fail on some face images collected in the wild. In this paper, we show how face alignment can be improved by explicit head pose estimation. In summary, we make the following contributions:
• We investigate the failure cases of several state-of-the-art face alignment approaches and find that head pose variation is a common issue across those methods. See Fig. 1.
• Based on this observation, we propose a ConvNet framework for explicit head pose estimation (Fig. 2). It achieves an absolute mean error of 4° in head pose estimation for face images acquired in unconstrained environments.
• We propose two initialisation schemes based on reliable head pose estimation (one plausible form is sketched below). They enable a baseline face alignment method (RCPR [1]) to perform better and reduce large-head-pose failures by 50% when using only one initialisation.
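The abstract does not spell out the two initialisation schemes, so the following is a hypothetical sketch of one natural reading: the ConvNet's estimated yaw selects a pose-specific mean shape with which to initialise the cascaded regressor. The bin edges, the `mean_shapes` input, and the overall interface are assumptions for illustration only.

```python
import numpy as np

# Yaw-bin edges in degrees (illustrative; the paper's binning may differ).
YAW_BINS = [-90, -30, 30, 90]

def pose_conditioned_init(yaw_deg, mean_shapes):
    """Pick an initial shape for cascaded regression from estimated head pose.

    yaw_deg     : yaw angle predicted by the head-pose ConvNet (degrees)
    mean_shapes : list of (68, 2) mean landmark shapes, one per yaw bin
    """
    idx = int(np.digitize(yaw_deg, YAW_BINS)) - 1
    idx = int(np.clip(idx, 0, len(mean_shapes) - 1))
    return mean_shapes[idx]

# The selected shape would then seed the regressor, e.g. rcpr.fit(image, init).
```

The intuition is that a profile-pose mean shape is a far better starting point for a profile face than the frontal mean, so a single pose-aware initialisation can replace many random restarts.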
Tactile data and kinesthetic cues are two important and complementary sensing sources for robot object recognition. In this paper, we propose a novel algorithm named Iterative Closest Labeled Point (iCLAP) that recognizes objects using both tactile and kinesthetic information. iCLAP first assigns distinct label numbers to different local tactile features. The label numbers of the tactile features, together with their associated 3D positions, form a 4D point cloud of the object; in this manner, the two sensing modalities are merged into a synthesized perception of the touched object. To recognize an object, the partial 4D point cloud obtained from a number of touches is iteratively matched against all reference cloud models to identify the best fit. An extensive evaluation study with 20 real objects shows that the proposed iCLAP approach outperforms approaches using either sensing modality alone, with a recognition rate improvement of up to 18%.
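A simplified sketch of the matching step follows: each point is (x, y, z, label), and a candidate object is scored by nearest-neighbor distance in this 4D space. A full iCLAP would also iterate a rigid alignment between correspondence passes, ICP-style; here only one scoring pass is shown, and the `label_scale` weighting of the label axis against position is an assumption, not the paper's parameter.

```python
import numpy as np
from scipy.spatial import cKDTree

def iclap_score(query_4d, reference_4d, label_scale=1.0):
    """Match a partial 4-D cloud against one reference model (simplified).

    query_4d, reference_4d : (N, 4) arrays of (x, y, z, label) points,
    where `label` is the label number of the local tactile feature
    observed at that 3-D contact position.
    """
    ref = reference_4d.astype(float).copy()
    qry = query_4d.astype(float).copy()
    ref[:, 3] *= label_scale
    qry[:, 3] *= label_scale
    tree = cKDTree(ref)
    dists, _ = tree.query(qry)   # nearest reference point for each query point
    return dists.mean()          # lower score = better fit

# Recognition: pick the reference model with the lowest score, e.g.
# best_name = min(models, key=lambda name: iclap_score(partial, models[name]))
```

Including the label coordinate means two contacts at similar positions but with different local tactile appearance are kept apart, which is what lets the combined 4D matching outperform position-only or feature-only recognition.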