A sensor system capable of automatically recognizing activities would enable many ubiquitous computing applications. In this paper, we present an easy-to-install sensor network and an accurate but inexpensive annotation method. A recorded dataset consisting of 28 days of sensor data and its annotation is described and made available to the community. Through a number of experiments we show how the hidden Markov model and conditional random fields perform in recognizing activities. We achieve a timeslice accuracy of 95.6% and a class accuracy of 79.4%.
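The two reported measures can differ widely when some activities occupy far more timeslices than others. The sketch below is a minimal illustration, assuming that timeslice accuracy is the fraction of correctly labelled timeslices and that class accuracy is the per-class fraction of correct timeslices averaged over classes. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def timeslice_accuracy(true_labels, predicted_labels):
    """Fraction of timeslices whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def class_accuracy(true_labels, predicted_labels):
    """Per-class accuracy averaged over the classes present in the ground truth."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    totals = defaultdict(int)  # timeslices per true class
    hits = defaultdict(int)    # correctly predicted timeslices per true class
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example (made-up labels): a frequent class and a rare class.
true_labels = ["idle"] * 8 + ["cooking"] * 2
predicted_labels = ["idle"] * 10  # a model that always predicts the frequent class

print(timeslice_accuracy(true_labels, predicted_labels))
print(class_accuracy(true_labels, predicted_labels))
```

In this toy case a model that always predicts the frequent class scores well on timeslice accuracy but poorly on class accuracy, which is one reason to report both.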
An activity monitoring system enables many applications that assist in caregiving for elderly people in their homes. In this paper we present a wireless sensor network for unintrusive observations in the home and show the potential of generative and discriminative models for recognizing activities from such observations. Through a large number of experiments using four real-world datasets we show the effectiveness of the generative hidden Markov model and the discriminative conditional random fields in activity recognition.
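As a rough illustration of how a hidden Markov model maps a sequence of sensor observations onto activity labels, the sketch below implements standard Viterbi decoding in log space. The two activities, the two sensor symbols, and all probabilities are invented for the example; they are assumptions and do not come from the datasets or the models used in the paper.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable state sequence for a discrete-observation HMM."""
    # best[t][s]: log probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log_trans[r][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def log_table(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical two-activity model with made-up parameters.
states = ["sleeping", "cooking"]
log_start = {s: math.log(0.5) for s in states}
log_trans = log_table({
    "sleeping": {"sleeping": 0.9, "cooking": 0.1},
    "cooking": {"sleeping": 0.1, "cooking": 0.9},
})
log_emit = log_table({
    "sleeping": {"bed_sensor": 0.9, "stove_sensor": 0.1},
    "cooking": {"bed_sensor": 0.1, "stove_sensor": 0.9},
})

observations = ["bed_sensor", "bed_sensor", "stove_sensor", "stove_sensor"]
decoded = viterbi(observations, states, log_start, log_trans, log_emit)
print(decoded)
```

A conditional random field would be decoded with the same dynamic program, but its scores come from discriminatively trained feature weights rather than from transition and emission probabilities.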
Early detection of high fall risk is an essential component of fall prevention in older adults. Body-worn sensors such as accelerometers can provide valuable insight into daily-life activities; biomechanical features extracted from such inertial data are currently used for the assessment of fall risk and have been shown to be of added value. Here, we studied whether deep learning methods are suited to automatically derive features from raw accelerometer data that assess fall risk. We used an existing dataset of 296 older adults. We compared the performance of three deep learning model architectures (convolutional neural network (CNN), long short-term memory (LSTM) and a combination of these two (ConvLSTM)) to each other and to a baseline model with biomechanical features on the same dataset. The results show that the deep learning models in a single-task learning mode are strong in recognizing the identity of the subject, but that these models only slightly outperform the baseline method on fall risk assessment. When using multi-task learning, with gender and age as auxiliary tasks, deep learning models perform better. We also found that preprocessing of the data resulted in the best performance (AUC = 0.75). We conclude that deep learning models, and in particular multi-task learning, effectively assess fall risk on the basis of wearable sensor data.
We present a novel probabilistic framework that fuses information from the audio and video modalities to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that extends a factorial Hidden Markov Model (fHMM) and models the people appearing in an audiovisual recording as multimodal entities that generate observations in the audio stream, the video stream, and the joint audiovisual space. The framework is very robust to different contexts, makes no assumptions about the location of the recording equipment, and does not require labeled training data, as it acquires the model parameters using the Expectation Maximization (EM) algorithm. We apply the proposed model to two meeting videos and a news broadcast video, all of which come from publicly available data sets. The speaker diarization results favor the proposed multimodal framework, which outperforms single-modality analysis and improves over state-of-the-art audio-based speaker diarization.
Accurately recognizing human activities from sensor data recorded in a smart home setting is a challenging task. Typically, probabilistic models such as the hidden Markov model (HMM) or conditional random fields (CRF) are used to map the observed sensor data onto the hidden activity states. A weakness of these models, however, is that the type of distribution used to model state durations is fixed. Hidden semi-Markov models (HSMM) and semi-Markov conditional random fields (SMCRF) model duration explicitly, allowing state durations to be modelled accurately. In this paper we compare the recognition performance of these models on multiple fully annotated real-world datasets consisting of several weeks of data. In our experiments the HSMM consistently outperforms the HMM, showing that accurate duration modelling can result in a significant increase in recognition performance. SMCRFs only slightly outperform CRFs, showing that CRFs are more robust in dealing with violations of the modelling assumptions. The datasets used in our experiments are made available to the community to allow further experimentation.
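The fixed duration distribution mentioned above can be made concrete for the HMM case. In a standard HMM, a state with self-transition probability a is occupied for exactly d consecutive timeslices with probability a^(d-1) (1 - a), a geometric distribution. The sketch below computes this implied distribution; the function names and the value a = 0.9 are illustrative assumptions, not parameters from the paper.

```python
def hmm_duration_pmf(self_transition, max_duration):
    """P(duration = d) for d = 1..max_duration, as implied by a standard HMM."""
    if not 0.0 <= self_transition < 1.0:
        raise ValueError("self_transition must be in [0, 1)")
    stay = self_transition
    # Stay in the state for d - 1 steps, then leave it once.
    return [stay ** (d - 1) * (1.0 - stay) for d in range(1, max_duration + 1)]


def expected_duration(self_transition):
    """Mean of the geometric duration distribution, in timeslices."""
    return 1.0 / (1.0 - self_transition)


# Hypothetical self-transition probability for one activity state.
pmf = hmm_duration_pmf(0.9, 50)
print(pmf[:5])
print(expected_duration(0.9))
```

Whatever the value of a, the probability decreases monotonically with d, so a duration of one timeslice is always the most likely. An activity whose typical duration peaks well above one timeslice therefore cannot be represented by this shape, whereas a semi-Markov model can assign each state its own explicit duration distribution.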
In this paper, we estimate different types of social actions from a single body-worn accelerometer in a crowded social setting. Accelerometers have many advantages in such settings: they are impervious to environmental noise, unobtrusive, cheap, low-powered, and their readings are specific to a single person. Our experiments show that they are surprisingly informative of different types of social actions. The social actions we address in this paper are whether a person is speaking, laughing, gesturing, drinking, or stepping. To our knowledge, this is the first work to carry out experiments on estimating social actions from conversational behavior using only a wearable accelerometer. The ability to estimate such actions using just the acceleration opens up the potential for analyzing more about social aspects of people's interactions without explicitly recording what they are saying.