The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, algorithms used by the teams, evaluation criteria, and results achieved.
Pain-related emotions are a major barrier to effective self-rehabilitation in chronic pain. Automated coaching systems capable of detecting these emotions are a potential solution. This paper lays the foundation for the development of such systems by making three contributions. First, through literature reviews, an overview of how pain is expressed in chronic pain and the motivation for detecting it in physical rehabilitation is provided. Second, a fully labelled multimodal dataset (named ‘EmoPain’) containing high-resolution multiple-view face videos, head-mounted and room audio signals, full-body 3D motion capture and electromyographic signals from back muscles is supplied. Natural, unconstrained pain-related facial expressions and body movement behaviours were elicited from people with chronic pain carrying out physical exercises. Both instructed and non-instructed exercises were considered to reflect traditional scenarios of physiotherapist-directed therapy and home-based self-directed therapy. Two sets of labels were assigned: level of pain from facial expressions annotated by eight raters and the occurrence of six pain-related body behaviours segmented by four experts. Third, through exploratory experiments grounded in the data, the factors and challenges in the automated recognition of such expressions and behaviour are described. The paper concludes by discussing potential avenues in the context of these findings, also highlighting differences between the two exercise scenarios addressed.
Because the fuzzy c-means clustering (FCM) algorithm is sensitive to noise, local spatial information is often introduced into the objective function to improve the robustness of FCM for image segmentation. However, the introduction of local spatial information often leads to a high computational complexity, arising from the iterative calculation of the distance between pixels within local spatial neighbors and clustering centers. To address this issue, an improved FCM algorithm based on morphological reconstruction and membership filtering (FRFCM) that is significantly faster and more robust than FCM is proposed in this paper. First, the local spatial information of images is incorporated into FRFCM by introducing a morphological reconstruction operation to guarantee noise immunity and image detail preservation. Second, the modification of the membership partition, based on the distance between pixels within local spatial neighbors and clustering centers, is replaced by local membership filtering that depends only on the spatial neighbors of the membership partition. Compared with state-of-the-art algorithms, the proposed FRFCM algorithm is simpler and significantly faster, since it is unnecessary to compute the distance between pixels within local spatial neighbors and clustering centers. In addition, it is efficient for noisy image segmentation because membership filtering is able to improve the membership partition matrix efficiently. Experiments performed on synthetic and real-world images demonstrate that the proposed algorithm
A great number of improved fuzzy c-means (FCM) clustering algorithms have been widely used for grayscale and color image segmentation. However, most of them are time-consuming and unable to provide the desired segmentation results for color images, for two reasons. The first is that the incorporation of local spatial information often causes a high computational complexity due to the repeated distance computation between clustering centers and pixels within a local neighboring window. The second is that a regular neighboring window usually breaks up the real local spatial structure of images and thus leads to poor segmentation. In this work, we propose a superpixel-based fast FCM clustering algorithm (SFFCM) that is significantly faster and more robust than state-of-the-art clustering algorithms for color image segmentation. To obtain better local spatial neighborhoods, we first define a multiscale morphological gradient reconstruction (MMGR) operation to obtain a superpixel image with accurate contours. In contrast to a traditional neighboring window of fixed size and shape, the superpixel image provides adaptive and irregular local spatial neighborhoods that are helpful for improving color image segmentation. Second, based on the obtained superpixel image, the original color image is simplified efficiently and its histogram is computed easily by counting the number of pixels in each region of the superpixel image. Finally, we implement FCM with the histogram parameter on the superpixel image to obtain the final segmentation result. Experiments performed on synthetic images and real images demonstrate that the proposed algorithm provides better segmentation results and takes less time than state-of-the-art clustering algorithms for color image segmentation.
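The final step described above, running FCM on per-region values weighted by pixel counts, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the superpixel regions have already been computed (the MMGR step is not shown), and the function name `weighted_fcm`, its parameters, and the toy data are hypothetical choices made for illustration only.

```python
import numpy as np

def weighted_fcm(values, counts, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means on region values, each weighted by its pixel count.

    Hypothetical helper for illustration; not taken from the SFFCM paper.
    values: (n_regions, n_features) mean color of each superpixel region
    counts: (n_regions,) number of pixels in each region (the histogram)
    """
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)

    # Random initial memberships; each column (region) sums to 1.
    u = rng.random((n_clusters, values.shape[0]))
    u /= u.sum(axis=0, keepdims=True)

    for _ in range(n_iter):
        # Center update: mean weighted by pixel count and fuzzified membership.
        w = counts * u ** m
        centers = (w @ values) / w.sum(axis=1, keepdims=True)

        # Squared Euclidean distance from every region to every center.
        d2 = ((values[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero

        # Standard FCM membership update, written in terms of squared distances.
        inv = d2 ** (-1.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=0, keepdims=True)

        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Toy example: four regions forming two obvious color groups.
region_colors = [[0, 0, 0], [1, 1, 1], [10, 10, 10], [11, 11, 11]]
region_sizes = [5, 5, 5, 5]
centers, u = weighted_fcm(region_colors, region_sizes, n_clusters=2)
labels = u.argmax(axis=0)  # hard label per region
```

Because the loop iterates over regions rather than pixels, its cost per iteration depends on the number of superpixels, which is the source of the speed-up the abstract describes.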
The increasing number of people playing games on touch-screen mobile phones raises the question of whether touch behaviors reflect players' emotional states. If they do, touch behavior would be valuable not only as an evaluation indicator for game designers, but also for real-time personalization of the game experience. Psychology studies on acted touch behavior show the existence of discriminative affective profiles. In this article, finger-stroke features during gameplay on an iPod were extracted and their discriminative power analyzed. Machine learning algorithms were used to build systems for automatically discriminating between four emotional states (Excited, Relaxed, Frustrated, Bored), two levels of arousal and two levels of valence. Accuracy reached between 69% and 77% for the four emotional states, and higher results (∼89%) were obtained for discriminating between two levels of arousal and two levels of valence. We conclude by discussing the factors relevant to the generalization of the results to applications other than games.
Deep learning has been widely used for medical image segmentation, and a large number of papers have been presented recording the success of deep learning in the field. A comprehensive thematic survey on medical image segmentation using deep learning techniques is presented. This paper makes two original contributions. First, compared to traditional surveys that directly divide the literature on deep learning for medical image segmentation into many groups and introduce the works in each group in detail, we classify currently popular works according to a multi‐level structure from coarse to fine. Second, this paper focuses on supervised and weakly supervised learning approaches, without including unsupervised approaches, since they have been covered in many older surveys and are not currently popular. For supervised learning approaches, we analyse the literature in three aspects: the selection of backbone networks, the design of network blocks, and the improvement of loss functions. For weakly supervised learning approaches, we investigate the literature according to data augmentation, transfer learning, and interactive segmentation, separately. Compared to existing surveys, this survey classifies the literature very differently, makes it more convenient for readers to understand the relevant rationale, and will guide them to think of appropriate improvements in medical image segmentation based on deep learning approaches.
A human being's cognitive system can be simulated by artificial intelligent systems. Machines and robots equipped with cognitive capability can automatically recognize a human's mental state through their gestures and facial expressions. In this paper, an artificial intelligent system is proposed to monitor depression. It can predict the scales of the Beck Depression Inventory II (BDI-II) from vocal and visual expressions. First, different visual features are extracted from facial expression images. A deep learning method is used to extract key visual features from the facial expression frames. Second, spectral low-level descriptors and mel-frequency cepstral coefficient features are extracted from short audio segments to capture the vocal expressions. Third, a feature dynamic history histogram (FDHH) is proposed to capture the temporal movement in the feature space. Finally, these FDHH and audio features are fused using regression techniques for the prediction of the BDI-II scales. The proposed method has been tested on the public Audio/Visual Emotion Challenges 2014 dataset, as it is tuned to be more focused on the study of depression. The results outperform all the other existing methods on the same dataset.