Automatic extraction of salient regions benefits various computer vision applications, such as image segmentation and object recognition. Salient visual information also plays a significant role in helping the visually impaired identify tactile information. In this paper, we introduce a novel saliency cuts method that uses local adaptive thresholding to obtain four regions from a given saliency map. First, we produce four regions for image segmentation by applying local adaptive thresholding to a saliency map used as the input image. Second, the four regions are used to initialize an iterative version of the GrabCut algorithm and to produce a robust, high-quality binary mask at full resolution. Finally, the outer boundaries and inner edges of salient objects are detected using the method from our previous research. Experimental results show that local adaptive thresholding using integral images produces a more robust binary mask than previous works that rely on global thresholding for salient object segmentation. The proposed method can extract salient objects even from a low-quality saliency map, achieving promising performance compared with existing methods. It efficiently extracts salient objects and generates simple, important edges from natural scene images, making it well suited for delivering visually salient information to the visually impaired.
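The local adaptive thresholding with integral images mentioned above can be sketched as follows. This is a minimal Bradley-Roth-style formulation for illustration only: the window size and sensitivity `t` are assumed values, and the paper's four-region construction and GrabCut step are not shown.

```python
import numpy as np

def adaptive_threshold(img, window=15, t=0.15):
    """Binarize `img` against its local mean, computed in O(1) per pixel
    via an integral image (Bradley-Roth style). `window` and `t` are
    illustrative defaults, not the paper's parameters."""
    img = img.astype(np.float64)
    # Integral image padded with a zero row/column so that
    # integral[y, x] = sum of img[0:y, 0:x].
    integral = np.cumsum(np.cumsum(img, axis=0), axis=1)
    integral = np.pad(integral, ((1, 0), (1, 0)))
    h, w = img.shape
    half = window // 2
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            area = (y1 - y0) * (x1 - x0)
            # Sum over the local window from four integral-image lookups.
            s = (integral[y1, x1] - integral[y0, x1]
                 - integral[y1, x0] + integral[y0, x0])
            # Foreground if the pixel exceeds (1 - t) of the local mean.
            out[y, x] = 255 if img[y, x] * area > s * (1 - t) else 0
    return out
```

Because each window sum costs four lookups regardless of window size, the method stays fast even for large neighborhoods, which is what makes it practical on full-resolution saliency maps.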
Communication has been an important aspect of human life, civilization, and globalization for thousands of years. Biometric analysis, education, security, healthcare, and smart cities are only a few examples of speech recognition applications. Most studies have concentrated mainly on English, Spanish, Japanese, or Chinese, disregarding low-resource languages such as Uzbek and leaving their analysis open. In this paper, we propose an end-to-end Deep Neural Network-Hidden Markov Model speech recognition model and a hybrid Connectionist Temporal Classification (CTC)-attention network for the Uzbek language and its dialects. The proposed approach reduces training time and improves speech recognition accuracy by effectively using the CTC objective function in attention model training. We evaluated linguist and lay native-speaker performance on the Uzbek language dataset, which was collected as part of this study. Experimental results show that the proposed model achieved a word error rate of 14.3% using 207 hours of recordings as the Uzbek language training dataset.
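The hybrid CTC-attention idea above combines a CTC alignment loss with an attention decoder loss during training. The sketch below shows the standard CTC forward (alpha) recursion and an interpolated multi-task objective; the interpolation weight `lam` and the `att_loss` placeholder are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

BLANK = 0  # conventional blank symbol index

def ctc_forward_prob(probs, labels):
    """Total probability of `labels` under all CTC alignments.
    probs: (T, V) per-frame softmax outputs; labels: ints without blanks."""
    ext = [BLANK]
    for l in labels:
        ext += [l, BLANK]          # interleave blanks: [b, l1, b, l2, b, ...]
    T, S = probs.shape[0], len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, BLANK]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]
            if s > 0:
                a += alpha[t - 1, s - 1]
            # Skipping the blank is allowed only between distinct labels.
            if s > 1 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]
            alpha[t, s] = a * probs[t, ext[s]]
    return alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0)

def hybrid_loss(probs, labels, att_loss, lam=0.3):
    """Multi-task objective: lam * CTC negative log-likelihood plus
    (1 - lam) * attention loss. `lam=0.3` is an illustrative choice."""
    return lam * -np.log(ctc_forward_prob(probs, labels)) + (1 - lam) * att_loss
```

During training, the CTC branch enforces monotonic frame-to-label alignment, which regularizes the attention decoder and is what yields the faster convergence the abstract describes.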
Current artificial intelligence systems for determining a person’s emotions rely heavily on lip and mouth movement and other facial features such as eyebrows, eyes, and the forehead. Furthermore, low-light images are typically classified incorrectly because of the dark region around the eyes and eyebrows. In this work, we propose a facial emotion recognition method for masked facial images that uses low-light image enhancement and a convolutional neural network to analyze the upper features of the face. The proposed approach employs the AffectNet image dataset, which includes eight types of facial expressions and 420,299 images. Initially, the lower part of the facial input image is covered by a synthetic mask. Boundary and regional representation methods are used to indicate the head and the upper features of the face. Secondly, we adopt a feature extraction strategy based on facial landmark detection, using the features of the partially covered masked face. Finally, the extracted features, the coordinates of the detected landmarks, and histograms of oriented gradients are incorporated into the classification procedure using a convolutional neural network. An experimental evaluation shows that the proposed method outperforms existing methods, achieving an accuracy of 69.3% on the AffectNet dataset.
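The histograms of oriented gradients mentioned in the classification step can be sketched as below. This is one plausible minimal HOG formulation (per-cell orientation histograms, no block normalization) given for illustration; the cell size and bin count are assumptions, and the landmark coordinates would simply be concatenated with this descriptor before classification.

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Minimal histogram-of-oriented-gradients descriptor: per-cell
    histograms of unsigned gradient orientation, weighted by magnitude.
    `cell=8` and `bins=9` are illustrative defaults."""
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)                       # image gradients
    mag = np.hypot(gx, gy)                          # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0    # unsigned orientation
    h, w = img.shape
    ch, cw = h // cell, w // cell
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    feats = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            b = bin_idx[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            for k in range(bins):
                feats[i, j, k] = m[b == k].sum()    # magnitude-weighted vote
    return feats.ravel()
```

The resulting fixed-length vector can be concatenated with the detected landmark coordinates to form the joint input to the CNN classifier.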