Race recognition (RR) has many applications, including surveillance systems and image/video understanding and analysis, but remains a difficult problem to solve completely. This article investigates a deep learning approach to it. We propose an efficient Race Recognition Framework (RRF) consisting of an information collector (IC), a face detection and preprocessing (FD&P) module, and an RR module. For the RR module, this study proposes two independent models. The first (RR-CNN) performs RR with a deep convolutional neural network (CNN). The second (RR-VGG) fine-tunes VGG, a well-known pretrained model for object recognition, for RR. To evaluate the framework, we compare the accuracy of RR-CNN and RR-VGG on our VNFaces dataset, which is composed of images collected from Facebook pages of Vietnamese people. On VNFaces, RR-VGG with augmented input images yields the best accuracy at 88.87%, while RR-CNN, an independent and lightweight model, yields 88.64%. Extension experiments show that the proposed models can also be applied to other race datasets, such as Japanese, Chinese, or Brazilian, with over 90% accuracy; the fine-tuned RR-VGG model achieved the best accuracy and is recommended for most scenarios.
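The abstract reports that augmented input images gave RR-VGG its best accuracy but does not say which augmentations were applied. As a minimal sketch, assuming horizontal mirroring (a common choice for face images, not confirmed by the source), the training set could be doubled as follows. The function names and the list-of-rows image representation are hypothetical and chosen only for illustration.

```python
def horizontal_flip(image):
    """Mirror a grayscale image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment_with_flips(images, labels):
    """Return the original samples plus one mirrored copy of each.

    Assumption: mirroring a face does not change its class label,
    so each flipped image reuses the label of its source image.
    """
    aug_images, aug_labels = [], []
    for image, label in zip(images, labels):
        aug_images.extend([image, horizontal_flip(image)])
        aug_labels.extend([label, label])
    return aug_images, aug_labels
```

Mirroring is attractive here because it preserves facial structure while doubling the data; other augmentations (rotation, cropping, color jitter) would follow the same pattern.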
Head pose estimation provides an important cue that helps robots and other intelligent machines understand humans. It plays a vital role in designing human-computer interaction systems because many applications, such as human behavior analysis, gaze estimation, and 3D head reconstruction, rely on precise head pose angles. This study presents a robust approach for estimating head pose angles from a single image. More specifically, the proposed system first encodes global features extracted with the Histogram of Oriented Gradients (HOG) using a multi-stacked autoencoder neural network. The autoencoder's deep hidden layers reduce the feature dimensionality while preserving the key information in the data. A scalable gradient boosting machine is then trained to classify the embedded features. Experiments on the Pointing'04 dataset show that the proposed approach outperforms state-of-the-art methods, with low head pose angle errors of 6.16° in pitch and 7.17° in yaw.
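The first stage of the pipeline described above, HOG feature extraction, can be illustrated with a minimal sketch. It computes an unsigned-gradient orientation histogram for a single grayscale cell and L2-normalizes it. The cell representation, the 9-bin default, and the function names are assumptions made for illustration; the abstract does not give the authors' HOG parameters, and the autoencoder and gradient boosting stages are not shown.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram of one grayscale cell (list of pixel rows).

    Gradients use central differences on interior pixels. Orientations
    are unsigned (folded into [0, 180) degrees), and each pixel votes
    for its bin with its gradient magnitude.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no orientation to vote for
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against an angle rounding to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


def l2_normalize(vec, eps=1e-6):
    """Scale a histogram to unit length; eps avoids division by zero."""
    norm = math.sqrt(sum(v * v for v in vec) + eps ** 2)
    return [v / norm for v in vec]
```

In a full descriptor, the normalized histograms of all cells would be concatenated into one global feature vector, which is the kind of input a stacked autoencoder could then compress.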