It is challenging to track multiple facial features simultaneously when rich expressions are presented on a face. We propose a two-step solution. In the first step, several independent CONDENSATION-style particle filters track each facial feature in the temporal domain. Particle filters are effective for visual tracking; however, multiple independent trackers ignore the spatial constraints and the natural relationships among facial features. In the second step, we use Bayesian inference (belief propagation) to infer each facial feature's contour in the spatial domain, where the relationships among facial feature contours are learned beforehand from a large facial expression database. Experimental results show that our algorithm robustly tracks multiple facial features simultaneously, even under large interframe motions caused by expression changes.
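The temporal step can be pictured with a minimal CONDENSATION-style particle filter for a single 2D feature point. This is a sketch under stated assumptions, not the paper's implementation: the random-walk dynamics, the Gaussian likelihood, and all parameter values are illustrative, and in practice the likelihood would score image evidence (e.g., edges or templates) along the feature contour.

```python
import numpy as np

# Minimal CONDENSATION-style particle filter for one 2D feature point.
# The state is (x, y); dynamics are a random walk with Gaussian noise,
# and the likelihood is a placeholder that an image-based measurement
# would replace in a real tracker.

rng = np.random.default_rng(0)

def predict(particles, motion_std=3.0):
    """Diffuse particles with Gaussian process noise (dynamic model)."""
    return particles + rng.normal(0.0, motion_std, particles.shape)

def likelihood(particles, observation, obs_std=5.0):
    """Toy measurement model: Gaussian score around an observed point."""
    d2 = np.sum((particles - observation) ** 2, axis=1)
    return np.exp(-0.5 * d2 / obs_std**2)

def resample(particles, weights):
    """Resample particles with probability proportional to the weights."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

def track(observations, n_particles=200):
    particles = np.tile(observations[0], (n_particles, 1)).astype(float)
    estimates = []
    for z in observations:
        particles = predict(particles)
        w = likelihood(particles, z)
        w /= w.sum()
        estimates.append(w @ particles)   # weighted-mean state estimate
        particles = resample(particles, w)
    return np.array(estimates)

# Synthetic trajectory: a feature point drifting with noise.
truth = np.cumsum(rng.normal(0, 2, (50, 2)), axis=0) + 100
print(track(truth + rng.normal(0, 4, truth.shape))[-1])
```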
Given a person's neutral face, we can predict his or her unseen expressions using machine learning techniques for image processing. Unlike prior expression cloning or image analogy approaches, we hallucinate the person's plausible facial expression with the help of a large facial expression database. In the first step, nonlinear manifold learning based on a regularization network is used to obtain a smooth estimate of the unseen facial expression, which is better than the reconstruction produced by PCA. In the second step, a Markov network is adopted to learn the low-level local relationship between patches of the neutral face images and of the expressional residual face images in the training set; belief propagation is then employed to infer the expressional residual face image for that person. Integrating the two approaches yields the final result. Experimental results show that the hallucinated facial expression is not only expressive but also close to the ground truth.
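A regularization network is, at its core, radial-basis-function regression with a smoothness (ridge) penalty, so the first step's smooth estimate can be sketched as below. This is only a loose illustration: the toy data, kernel width, and ridge term are assumptions, and the paper's actual feature representation and manifold construction are not specified here.

```python
import numpy as np

# Regularization-network (RBF ridge regression) sketch: map a neutral
# face vector to an expressional face vector, smoothed over a training
# set. All data and hyperparameters below are illustrative assumptions.

rng = np.random.default_rng(1)

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between row vectors of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / sigma**2)

def fit(neutral, expressive, lam=1e-3, sigma=1.0):
    """Solve (K + lam*I) C = Y for the network coefficients."""
    K = rbf_kernel(neutral, neutral, sigma)
    return np.linalg.solve(K + lam * np.eye(len(K)), expressive)

def predict(coeffs, neutral_train, neutral_new, sigma=1.0):
    """Smooth estimate for new neutral faces via kernel interpolation."""
    return rbf_kernel(neutral_new, neutral_train, sigma) @ coeffs

# Toy data: 20 "faces" of dimension 8; expression = smooth warp + noise.
X = rng.normal(size=(20, 8))
Y = np.tanh(X) + 0.05 * rng.normal(size=X.shape)
C = fit(X, Y)
print(predict(C, X, X[:1]))   # smoothed estimate for one neutral face
```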
It is challenging to track multiple facial features simultaneously in video while rich facial expressions are presented on a human face, and accurately predicting the positions of multiple facial feature contours is important and difficult. This paper proposes a tracking algorithm based on a multi-cue prediction model. In the prediction model, CAMSHIFT first tracks the face in the video, and the spatial constraints among facial features are used to roughly locate the features. A dynamic model based on a second-order autoregressive process (ARP) is then combined with a dynamic model based on a graphical model (a Bayesian network); fusing the speed of the ARP with the accuracy of the graphical model yields the combined prediction. Finally, the prediction model and the measurement model are integrated within a Kalman filter framework. Experimental results show that our algorithm accurately tracks multiple facial features under varied facial expressions.
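The final integration step can be illustrated with a minimal Kalman filter whose dynamic model is a second-order autoregressive predictor for one feature point. The state layout, the constant-velocity AR(2) coefficients, and the noise covariances below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Kalman filter with a second-order AR dynamic model for one 2D point.
# State s = [x_t, y_t, x_{t-1}, y_{t-1}]; the AR(2) prediction
# x_t = 2*x_{t-1} - x_{t-2} corresponds to constant velocity.
F = np.array([[2, 0, -1, 0],
              [0, 2, 0, -1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only (x, y) is observed
Q = np.eye(4) * 1.0                          # process-noise covariance
R = np.eye(2) * 4.0                          # measurement-noise covariance

def kalman_step(s, P, z):
    """One predict/update cycle for a single feature point."""
    s_pred = F @ s
    P_pred = F @ P @ F.T + Q
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    s_new = s_pred + K @ (z - H @ s_pred)
    P_new = (np.eye(4) - K @ H) @ P_pred
    return s_new, P_new

rng = np.random.default_rng(2)
s, P = np.zeros(4), np.eye(4)
for t in range(1, 30):                       # point moving diagonally
    z = np.array([t, 0.5 * t]) + rng.normal(0, 2, 2)
    s, P = kalman_step(s, P, z)
print(s[:2])                                 # filtered position estimate
```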