Automatic affect analysis has attracted great interest in various contexts including the recognition of action units and basic or non-basic emotions. In spite of major efforts, there are several open questions on what the important cues to interpret facial expressions are and how to encode them. In this paper, we review the progress across a range of affect recognition applications to shed light on these fundamental questions. We analyse the state-of-the-art solutions by decomposing their pipelines into fundamental components, namely face registration, representation, dimensionality reduction and recognition. We discuss the role of these components and highlight the models and new trends that are followed in their design. Moreover, we provide a comprehensive analysis of facial representations by uncovering their advantages and limitations; we elaborate on the type of information they encode and discuss how they deal with the key challenges of illumination variations, registration errors, head-pose variations, occlusions, and identity bias. This survey allows us to identify open issues and to define future directions for designing real-world affect recognition systems.
As an instance-level recognition problem, person reidentification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales. We call features of both homogeneous and heterogeneous scales omni-scale features. In this paper, a novel deep ReID CNN is designed, termed Omni-Scale Network (OSNet), for omni-scale feature learning. This is achieved by designing a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale. Importantly, a novel unified aggregation gate is introduced to dynamically fuse multiscale features with input-dependent channel-wise weights. To efficiently learn spatial-channel correlations and avoid overfitting, the building block uses both pointwise and depthwise convolutions. By stacking such blocks layer-by-layer, our OSNet is extremely lightweight and can be trained from scratch on existing ReID benchmarks. Despite its small model size, our OSNet achieves state-of-the-art performance on six person-ReID datasets. Code and models are available at: https://github.com/KaiyangZhou/deep-person-reid.
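The gated fusion described in this abstract can be illustrated with a small pure-Python sketch. It is not the OSNet implementation: the gate form used here (global average pooling, a single linear layer, then a sigmoid), the function names, and the toy feature maps are all assumptions made for illustration. Each feature map is a list of channels, and each channel is a flat list of spatial values.

```python
import math

def global_average_pool(feature_map):
    """Average each channel over its spatial positions."""
    return [sum(channel) / len(channel) for channel in feature_map]

def gate_weights(pooled, weight, bias):
    """Assumed gate: one linear layer plus a sigmoid, one weight in (0, 1) per channel."""
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(weight, bias)
    ]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def fuse_streams(streams, weight, bias):
    """Sum the streams after scaling each channel by its input-dependent gate.

    The same gate parameters are shared by every stream.
    """
    num_channels = len(streams[0])
    num_positions = len(streams[0][0])
    fused = [[0.0] * num_positions for _ in range(num_channels)]
    for feature_map in streams:
        gates = gate_weights(global_average_pool(feature_map), weight, bias)
        for c in range(num_channels):
            for i in range(num_positions):
                fused[c][i] += gates[c] * feature_map[c][i]
    return fused

# Two hypothetical streams, each with 2 channels and 2 spatial positions.
small_scale = [[1.0, 3.0], [0.0, 0.0]]
large_scale = [[0.0, 0.0], [2.0, 2.0]]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]  # illustrative gate parameters
zero_bias = [0.0, 0.0]

fused = fuse_streams([small_scale, large_scale], identity_weight, zero_bias)
```

Because the gate is computed from each stream's own pooled response, a stream with a strong activation in a channel contributes more to that channel of the fused output, which is the sense in which the weights are input-dependent and channel-wise.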
Shadows are integral parts of natural scenes and one of the elements contributing to naturalness of synthetic scenes. In many image analysis and interpretation applications, shadows interfere with fundamental tasks such as object extraction and description. For this reason, shadow segmentation is an important step in image analysis. In this paper, we propose a new cast shadow segmentation algorithm for both still and moving images. The proposed technique exploits spectral and geometrical properties of shadows in a scene to perform this task. The presence of a shadow is first hypothesized from simple initial evidence, namely that shadows darken the surface on which they are cast. The validity of detected regions as shadows is then verified using more complex hypotheses on color invariance and geometric properties of shadows. Finally, an information integration stage confirms or rejects the initial hypothesis for every detected region. Simulation results show that the proposed algorithm is robust and efficient in detecting shadows for a large class of scenes.
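The hypothesize-then-verify idea in this abstract can be sketched per pixel as follows. This is only an illustration under stated assumptions: the comparison against a reference (unshadowed) pixel, the use of normalised chromaticity as the colour-invariant feature, the tolerance value, and all function names are hypothetical and are not taken from the paper. The geometric verification stage is omitted.

```python
def is_darker(pixel, reference):
    """Initial hypothesis: a shadow darkens every colour channel of the surface."""
    return all(p < r for p, r in zip(pixel, reference))

def chromaticity(pixel):
    """Normalised colour, roughly invariant to a uniform change in brightness."""
    total = sum(pixel)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(p / total for p in pixel)

def is_shadow_candidate(pixel, reference, tolerance=0.05):
    """Keep a darkened pixel only if its chromaticity stays close to the reference."""
    if not is_darker(pixel, reference):
        return False
    drift = max(
        abs(a - b)
        for a, b in zip(chromaticity(pixel), chromaticity(reference))
    )
    return drift <= tolerance

# Hypothetical RGB values for one surface point.
surface = (200, 150, 100)
shadowed = (100, 75, 50)      # darker, same colour balance
dark_object = (40, 90, 20)    # darker, different colour balance
highlight = (220, 160, 110)   # brighter than the surface

results = [
    is_shadow_candidate(p, surface)
    for p in (shadowed, dark_object, highlight)
]
```

In this toy example only the first pixel passes both checks: the dark object fails the colour-invariance test and the highlight fails the darkening test.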
We propose a filtering framework for multi-target tracking that is based on the Probability Hypothesis Density (PHD) filter and data association using graph matching. This framework can be combined with any object detectors that generate positional and dimensional information of objects of interest. The PHD filter compensates for missing detections and removes noise and clutter. Moreover, this filter reduces the growth in complexity with the number of targets from exponential to linear by propagating the first-order moment of the multitarget posterior, instead of the full posterior. In order to account for the nature of the PHD propagation, we propose a novel particle resampling strategy and we adapt the dynamic and observation models to cope with varying object scales. The proposed resampling strategy allows us to use the PHD filter when a priori knowledge of the scene is not available. Moreover, the dynamic and observation models are not limited to the PHD filter and can be applied to any Bayesian tracker that can handle State Dependent Variances (SDV). Extensive experimental results on a large standard video surveillance dataset using a standard evaluation protocol show that the proposed filtering framework improves the accuracy of the tracker, especially in cluttered scenes.
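A property of the first-order moment mentioned in this abstract is that the PHD integrates to the expected number of targets; in a particle approximation this is simply the sum of the particle weights. The sketch below shows that estimate together with a generic systematic resampling step that preserves the total mass. It is not the resampling strategy proposed in the paper; the function names, the `(x, y)` particle states, and the fixed `offset` (normally a random draw in [0, 1)) are assumptions chosen to keep the example reproducible.

```python
def expected_target_count(weights):
    """Particle estimate of the PHD integral: the expected number of targets."""
    return sum(weights)

def systematic_resample(particles, weights, num_out, offset=0.5):
    """Draw num_out particles in proportion to weight, keeping total mass unchanged.

    Unlike a single-target particle filter, the weights are not normalised to 1:
    each output particle receives mass / num_out.
    """
    mass = sum(weights)
    step = mass / num_out
    resampled = []
    cumulative = weights[0]
    index = 0
    for k in range(num_out):
        position = (k + offset) * step
        # Advance to the particle whose cumulative weight covers this position.
        while position > cumulative and index < len(weights) - 1:
            index += 1
            cumulative += weights[index]
        resampled.append(particles[index])
    return resampled, [step] * num_out

# Hypothetical (x, y) particles: two well-supported states and two clutter states.
particles = [(10.0, 5.0), (10.5, 5.2), (40.0, 20.0), (80.0, 3.0)]
weights = [0.9, 0.9, 0.15, 0.05]

count_before = expected_target_count(weights)
new_particles, new_weights = systematic_resample(particles, weights, num_out=4)
count_after = expected_target_count(new_weights)
```

Here the weights sum to about 2, suggesting two targets, and the resampled set concentrates on the two high-weight states while the estimated target count stays the same.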
There is growing concern about how personal data are used when users grant applications direct access to the sensors of their mobile devices. In fact, high-resolution temporal data generated by motion sensors directly reflect the activities of a user and indirectly reveal physical and demographic attributes. In this paper, we propose a feature learning architecture for mobile devices that provides flexible and negotiable privacy-preserving sensor data transmission by appropriately transforming raw sensor data. The objective is to move from the current binary setting of granting or denying permission to an application, toward a model that allows users to grant each application permission over a limited range of inferences according to the provided services. The internal structure of each component of the proposed architecture can be flexibly changed and the trade-off between privacy and utility can be negotiated between the constraints of the user and the underlying application. We validated the proposed architecture in an activity recognition application using two real-world datasets, with the objective of recognizing an activity without disclosing gender as an example of private information. Results show that the proposed framework maintains the usefulness of the transformed data for activity recognition, with an average loss of only around three percentage points, while reducing the possibility of gender classification to around 50%, the target random guess, from more than 90% when using raw sensor data. We also present and distribute MotionSense, a new dataset for activity and attribute recognition collected from motion sensors.
The deployment of multiple robots for achieving a common goal helps to improve the performance, efficiency, and/or robustness in a variety of tasks. In particular, the observation of moving targets is an important multirobot application that still exhibits numerous open challenges, including the effective coordination of the robots. This paper reviews control techniques for cooperative mobile robots monitoring multiple targets. The simultaneous movement of robots and targets makes this problem particularly interesting, and our review systematically addresses this cooperative multirobot problem for the first time. We classify and critically discuss the control techniques: cooperative multirobot observation of multiple moving targets, cooperative search, acquisition, and track, cooperative tracking, and multirobot pursuit evasion. We also identify the five major elements that characterize this problem, namely, the coordination method, the environment, the target, the robot and its sensor(s). These elements are used to systematically analyze the control techniques. The majority of the studied work is based on simulation and laboratory studies, which may not accurately reflect real-world operational conditions. Importantly, while our systematic analysis is focused on multitarget observation, our proposed classification is useful also for related multirobot applications.