Obtaining a good feature representation of data is paramount for Human Activity Recognition (HAR) using wearable sensors. An increasing number of feature learning approaches, in particular deep-learning-based ones, have been proposed to extract an effective feature representation by analyzing large amounts of data. However, interpreting their performance objectively faces two problems: the lack of a baseline evaluation setup, which makes a strict comparison between them impossible, and insufficient implementation details, which can hinder their use. In this paper, we attempt to address both issues: we first propose an evaluation framework allowing a rigorous comparison of features extracted by different methods, and use it to carry out extensive experiments with state-of-the-art feature learning approaches. We then provide all the code and implementation details to make both the reproduction of the results reported in this paper and the re-use of our framework easier for other researchers. Our studies carried out on the OPPORTUNITY and UniMiB-SHAR datasets highlight the effectiveness of hybrid deep-learning architectures involving convolutional and Long Short-Term Memory (LSTM) layers to obtain features characterising both short- and long-term time dependencies in the data.
The scarcity of labelled time-series data can hinder the proper training of deep learning models. This is especially relevant for the growing field of ubiquitous computing, where data coming from wearable devices have to be analysed using pattern recognition techniques to provide meaningful applications. To address this problem, we propose a transfer learning method based on attributing sensor modality labels to a large amount of time-series data collected from various application fields. Using these data, our method first trains a Deep Neural Network (DNN) that can learn general characteristics of time-series data, then transfers it to another DNN designed to solve a specific target problem. In addition, we propose a general architecture that can adapt the transferred DNN regardless of the sensors used in the target field, making our approach particularly suitable for multichannel data. We test our method on two ubiquitous computing problems, Human Activity Recognition (HAR) and Emotion Recognition (ER), and compare it to a baseline that trains the DNN without transfer learning. For HAR, we also introduce a new dataset, Cognitive Village-MSBand (CogAge), which contains data for 61 atomic activities acquired from three wearable devices (smartphone, smartwatch, and smartglasses). Our results show that our transfer learning approach outperforms the baseline for both HAR and ER.
This paper addresses wearable-based recognition of Activities of Daily Living (ADLs), which are composed of several repetitive and concurrent short movements with temporal dependencies. It is impractical to use sensor data directly to recognize these long-term composite activities because two examples (data sequences) of the same ADL can produce widely differing sensor data. However, they may be similar in terms of more semantic and meaningful short-term atomic actions. Therefore, we propose a two-level hierarchical model for the recognition of ADLs. First, at the lower level, atomic activities are detected and their probabilistic scores are generated. Second, at the higher level, we handle the temporal transitions of atomic activities using a temporal pooling method, rank pooling, which enables us to encode the ordering of the probabilistic scores for atomic activities. Rank pooling leads to a 5–13% improvement in results compared to other commonly used techniques. We also produce a large dataset of 61 atomic and 7 composite activities for our experiments.
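To illustrate why an order-aware pooling step matters for the higher level described above, here is a minimal sketch. It is not the authors' implementation: rank pooling is normally fitted with a ranking machine, whereas this sketch swaps in a simple closed-form approximation in which the score vector at time step t (of T) receives the weight 2t - T - 1, which is what a single gradient step on a pairwise ranking objective yields when starting from zero weights. The function names `rank_pool` and `mean_pool` and the toy score sequences are assumptions made for illustration.

```python
def rank_pool(scores):
    """Order-aware pooling of T per-time-step score vectors.

    Assumption: closed-form approximation of rank pooling, where step t
    (1-based) is weighted by 2t - T - 1, so later steps count positively
    and earlier steps negatively.
    """
    T = len(scores)
    dim = len(scores[0])
    pooled = [0.0] * dim
    for t, vec in enumerate(scores, start=1):
        weight = 2 * t - T - 1
        for d in range(dim):
            pooled[d] += weight * vec[d]
    return pooled


def mean_pool(scores):
    """Order-agnostic baseline: average the score vectors over time."""
    T = len(scores)
    return [sum(col) / T for col in zip(*scores)]


if __name__ == "__main__":
    # Hypothetical probabilities for two atomic activities over three steps.
    a_then_b = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    b_then_a = a_then_b[::-1]
    print(mean_pool(a_then_b), mean_pool(b_then_a))  # same for both orders
    print(rank_pool(a_then_b), rank_pool(b_then_a))  # differ by sign
```

Mean pooling cannot tell the two orderings apart, while the order-aware weights produce distinct descriptors; this is the property the higher level of the model relies on.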
With the recent spread of mobile devices equipped with different sensors, it is possible to continuously recognise and monitor activities in daily life. This sensor-based human activity recognition is formulated as sequence classification, which categorises sequences of sensor values into appropriate activity classes. One crucial problem is how to model features that precisely represent the characteristics of each sequence and lead to accurate recognition. Hand-crafting such features from prior knowledge and manual investigation of sensor data is laborious and often difficult. To overcome this, we focus on a feature learning approach that extracts useful features from a large amount of data. In particular, we adopt a simple but effective one, called the codebook approach, which groups numerous subsequences collected from sequences into clusters. Each cluster centre is called a codeword and represents a statistically distinctive subsequence. A sequence is then encoded as a feature expressing the distribution of codewords. Extensive experiments on different recognition tasks for physical, mental and eye-based activities validate the effectiveness, generality and usability of the codebook approach.
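The codebook pipeline described above (collect subsequences, cluster them, encode each sequence as a codeword distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses a plain k-means loop for the clustering step, hard assignment of each subsequence to its nearest codeword, and one-dimensional sequences. The function names, the window width and stride, and the toy signal are all hypothetical.

```python
import random


def sliding_windows(sequence, width, stride):
    """Collect fixed-length subsequences from a 1-D sequence."""
    return [sequence[i:i + width]
            for i in range(0, len(sequence) - width + 1, stride)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, window):
    """Index of the codeword closest to the given subsequence."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(codebook[k], window))


def build_codebook(windows, n_codewords, n_iters=20, seed=0):
    """Cluster subsequences with a basic k-means; centres are the codewords."""
    rng = random.Random(seed)
    centres = [list(w) for w in rng.sample(windows, n_codewords)]
    for _ in range(n_iters):
        groups = [[] for _ in centres]
        for w in windows:
            groups[nearest(centres, w)].append(w)
        for k, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centre
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return centres


def encode(sequence, codebook, width, stride):
    """Encode a sequence as a normalised histogram of codeword assignments."""
    hist = [0.0] * len(codebook)
    for w in sliding_windows(sequence, width, stride):
        hist[nearest(codebook, w)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Toy signal: a flat stretch followed by an oscillating stretch.
    signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    codebook = build_codebook(sliding_windows(signal, 3, 1), n_codewords=2)
    print(encode(signal, codebook, width=3, stride=1))
```

The resulting histogram is a fixed-length feature regardless of sequence length, so it can be passed to any standard classifier.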