We present a novel approach for detecting social interactions in a crowded scene using only visual cues. Detecting social interactions in unconstrained scenarios is a valuable task, especially for surveillance purposes. Our proposal is inspired by the social signaling literature and, in particular, builds on the sociological notion of the F-formation. An F-formation is a set of possible configurations in space that people may assume while participating in a social interaction. Our system takes as input the positions of the people in a scene and their (head) orientations; then, employing a voting strategy based on the Hough transform, it recognizes F-formations and the individuals associated with them. Experiments on simulated and real data support our idea.
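The voting idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes hard votes on a regular grid, and the function name `detect_f_formations` and the parameters `radius` (distance from a person to the shared centre they vote for) and `cell` (grid resolution) are hypothetical choices made for illustration. Each person casts one vote for a point ahead of them along their head orientation, and people whose votes fall in the same cell are grouped together.

```python
import math
from collections import defaultdict


def detect_f_formations(people, radius=1.0, cell=0.5):
    """Group people whose votes land in the same accumulator cell.

    people: list of (x, y, theta), with theta the head orientation in radians.
    radius and cell are illustrative assumptions, not values from the paper.
    Returns a list of sets of person indices, largest group first.
    """
    votes = defaultdict(set)
    for idx, (x, y, theta) in enumerate(people):
        # Vote for a candidate centre placed `radius` ahead of the person.
        cx = x + radius * math.cos(theta)
        cy = y + radius * math.sin(theta)
        # Quantize the candidate centre into a grid cell (the accumulator).
        key = (math.floor(cx / cell), math.floor(cy / cell))
        votes[key].add(idx)
    # Keep only cells supported by at least two distinct people.
    groups = [members for members in votes.values() if len(members) >= 2]
    return sorted(groups, key=len, reverse=True)


if __name__ == "__main__":
    scene = [
        (0.2, 0.2, 0.0),       # person 0, facing +x
        (2.2, 0.2, math.pi),   # person 1, facing -x, towards person 0
        (10.2, 10.2, 0.0),     # person 2, far away
    ]
    print(detect_f_formations(scene))
```

In this toy scene the first two people face each other and vote for the same cell, while the third person votes alone and is left out. A more faithful version would presumably spread each vote over a neighbourhood to account for uncertainty in position and orientation, rather than using a single hard bin.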
We demonstrate three-dimensional (3D) super-resolution live-cell imaging through thick specimens (50-150 μm), by coupling far-field individual molecule localization with selective plane illumination microscopy (SPIM). The improved signal-to-noise ratio of selective plane illumination allows nanometric localization of single molecules in thick scattering specimens without activating or exciting molecules outside the focal plane. We report 3D super-resolution imaging of cellular spheroids.
Person re-identification is a fundamental operation in any multi-camera surveillance scenario. Until now, it has been performed primarily by exploiting appearance cues, under the assumption that individuals do not change their clothes. In this paper, we relax this constraint by presenting a set of 3D soft-biometric cues that are insensitive to appearance variations and are gathered using RGB-D technology. The joint use of these characteristics provides encouraging performance on a benchmark of 79 people who were captured on different days and with different clothing. This promotes a novel research direction for the re-identification community, supported also by the fact that a new generation of affordable RGB-D cameras has recently entered the worldwide market.
Recent approaches to trajectory forecasting use tracklets to predict the future positions of pedestrians, exploiting Long Short-Term Memory (LSTM) architectures. This paper shows that adding vislets, that is, short sequences of head pose estimations, significantly improves trajectory forecasting performance. We then propose to use vislets in a novel framework called MX-LSTM, which captures the interplay between tracklets and vislets through a joint unconstrained optimization of full covariance matrices during LSTM backpropagation. At the same time, MX-LSTM predicts future head poses, extending the standard capabilities of long-term trajectory forecasting approaches. With standard head pose estimators and an attention-based social pooling, MX-LSTM sets a new state of the art in trajectory forecasting on all the considered datasets (Zara01, Zara02, UCY, and TownCentre), with a dramatic margin when pedestrians slow down, a case where most forecasting approaches struggle to provide an accurate solution.
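To make "unconstrained optimization of full covariance matrices" concrete, the sketch below shows one common way to do it; the abstract does not state which parameterization the authors use, so this is an assumption. Any real triple `(a, b, c)` is mapped to a lower-triangular factor with a positive diagonal, so the resulting 2x2 covariance is always symmetric positive definite and a gradient-based optimizer can move freely in parameter space. The 2D case and the function names are illustrative only.

```python
import math


def cov_from_unconstrained(a, b, c):
    """Map any real (a, b, c) to a 2x2 symmetric positive-definite matrix.

    Uses L = [[exp(a), 0], [b, exp(c)]] and returns Sigma = L L^T.
    """
    l11, l21, l22 = math.exp(a), b, math.exp(c)
    return [[l11 * l11, l11 * l21],
            [l11 * l21, l21 * l21 + l22 * l22]]


def gaussian_nll(point, mean, params):
    """Negative log-likelihood of a 2D Gaussian with Sigma = L L^T."""
    a, b, c = params
    l11, l21, l22 = math.exp(a), b, math.exp(c)
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Solve L z = d by forward substitution, so z.z equals d^T Sigma^{-1} d.
    z1 = dx / l11
    z2 = (dy - l21 * z1) / l22
    # 0.5 * log det(Sigma) = log(l11) + log(l22) = a + c.
    return 0.5 * (z1 * z1 + z2 * z2) + (a + c) + math.log(2.0 * math.pi)


if __name__ == "__main__":
    print(cov_from_unconstrained(-3.0, 5.0, 2.0))
    print(gaussian_nll((1.0, 0.5), (0.0, 0.0), (0.0, 0.3, -0.2)))
```

Because the loss is defined for every real value of `a`, `b`, and `c`, no projection or clipping step is needed to keep the covariance valid during training. A model that jointly describes positions and head poses would need a larger matrix, which the same construction handles with a bigger triangular factor.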