Proceedings of the 2018 ACM International Symposium on Wearable Computers
DOI: 10.1145/3267242.3267286

Understanding and improving recurrent networks for human activity recognition by continuous attention

Abstract: Deep neural networks, including recurrent networks, have been successfully applied to human activity recognition. Unfortunately, the final representation learned by recurrent networks might encode some noise (irrelevant signal components, unimportant sensor modalities, etc.). Besides, it is difficult to interpret the recurrent networks to gain insight into the models' behavior. To address these issues, we propose two attention models for human activity recognition: temporal attention and sensor attention. Thes…
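As a rough sketch of the temporal-attention idea the abstract describes, the PyTorch snippet below scores each timestep of a recurrent network's output and pools the weighted hidden states into a single vector for classification. The layer sizes, the linear scoring function, and the input dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttention(nn.Module):
    """Weights each timestep of the recurrent output and pools the weighted
    states into one vector (a generic formulation, not the paper's code)."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)   # one scalar score per timestep

    def forward(self, h):                       # h: (batch, time, hidden_dim)
        scores = self.score(h).squeeze(-1)      # (batch, time)
        alpha = F.softmax(scores, dim=1)        # attention weights over time
        context = (alpha.unsqueeze(-1) * h).sum(dim=1)  # weighted sum: (batch, hidden_dim)
        return context, alpha                   # alpha can be inspected for interpretability

# Example: attend over the outputs of an LSTM run on a window of sensor data.
lstm = nn.LSTM(input_size=9, hidden_size=64, batch_first=True)
attn = TemporalAttention(64)
x = torch.randn(8, 128, 9)                      # 8 windows, 128 timesteps, 9 sensor channels
h, _ = lstm(x)
context, alpha = attn(h)                        # context feeds a classifier head
```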

Cited by 153 publications (100 citation statements)
References 27 publications
“…The Daphnet data was recorded during various walking tasks of 10 different participants and has three different annotations: 1) transient activities (which are discarded here), 2) freezing of gait, and 3) normal movements. The data was recorded at a sampling rate of 64 Hz; however, we downsample it to 32 Hz by decimation and discard the transient activities, following [40].…”
Section: Daphnet
Citation type: mentioning
confidence: 99%
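A minimal sketch of the preprocessing step this citation describes, assuming SciPy's decimate (low-pass filtering followed by downsampling) for the 64 Hz to 32 Hz reduction; the cited work may instead simply keep every second sample, so this is only one plausible reading of "decimation".

```python
import numpy as np
from scipy.signal import decimate

# Hypothetical Daphnet-style recording: 10 s of a 3-axis accelerometer at 64 Hz.
fs_in, fs_out = 64, 32
signal_64hz = np.random.randn(10 * fs_in, 3)

# decimate() low-pass filters before keeping every 2nd sample, which avoids
# aliasing compared to plain slicing (signal_64hz[::2]).
signal_32hz = decimate(signal_64hz, q=fs_in // fs_out, axis=0)
print(signal_32hz.shape)   # (320, 3)
```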
“…We choose a sliding window of approximately 2.5 seconds for the HARD, HARD2 and HARD3 datasets, with 50% overlap. Following other works [8], [26], [40], the PAMAP2 and Daphnet datasets have, respectively, a window size of approximately […]. In these networks, the second FC layer contains the same number of neurons as the number of classes and is followed by the softmax function. The convolutional kernel for all CNN layers was set to 3 × 3, whereas the max-pooling kernel size was 2 × 2.…”
Section: B. Data Pre-processing
Citation type: mentioning
confidence: 99%
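For context, a minimal sliding-window segmentation sketch with the 50% overlap mentioned above. The 50 Hz sampling rate (so a 2.5 s window is 125 samples), the six channels, and the label-of-last-sample convention are assumptions for illustration only.

```python
import numpy as np

def sliding_windows(data, labels, window_len, overlap=0.5):
    """Segment a (num_samples, num_channels) recording into fixed-length
    windows with the given fractional overlap; each window takes the label
    of its last sample (one common convention)."""
    step = int(window_len * (1 - overlap))
    windows, window_labels = [], []
    for start in range(0, len(data) - window_len + 1, step):
        end = start + window_len
        windows.append(data[start:end])
        window_labels.append(labels[end - 1])
    return np.stack(windows), np.array(window_labels)

# E.g. at a hypothetical 50 Hz, a 2.5 s window is 125 samples; 50% overlap
# means a new window starts every 62 samples.
data = np.random.randn(5000, 6)
labels = np.random.randint(0, 3, size=5000)
X, y = sliding_windows(data, labels, window_len=125, overlap=0.5)
print(X.shape, y.shape)   # (79, 125, 6) (79,)
```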
“…Targeting the spatial and temporal subset selection within such data, attention mechanisms have recently been explored to improve HAR performance on MoCap data. To enable understanding of the relevance of each sensor in such a scenario, Zeng et al. [19] proposed an attention-based LSTM framework, where a sensor attention module was used at the input level and for each timestep, with an additional temporal attention module at a later layer. Their sensor attention module was implemented with input from the different sensors at a single timestep, while temporal attention was computed based on the output of the LSTM layer.…”
Section: B. Attention Mechanism Adapted for HAR on MoCap Data
Citation type: mentioning
confidence: 99%
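The snippet below sketches the input-level, per-timestep sensor attention this citation describes: a softmax gate over the sensor channels re-weights the raw input before it enters the LSTM, while the temporal dimension is handled by the recurrence. The gating function, layer sizes, and the use of the last hidden state for classification are assumptions, not the cited implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SensorAttentionLSTM(nn.Module):
    """Input-level sensor attention in the spirit of the cited description;
    sizes are illustrative."""
    def __init__(self, num_sensors, hidden_dim, num_classes):
        super().__init__()
        self.gate = nn.Linear(num_sensors, num_sensors)    # scores each sensor channel
        self.lstm = nn.LSTM(num_sensors, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):                        # x: (batch, time, num_sensors)
        beta = F.softmax(self.gate(x), dim=-1)   # per-timestep weights over sensors
        h, _ = self.lstm(beta * x)               # attended input drives the recurrence
        return self.classifier(h[:, -1]), beta   # beta exposes which sensors were emphasized

model = SensorAttentionLSTM(num_sensors=9, hidden_dim=64, num_classes=12)
logits, beta = model(torch.randn(4, 128, 9))
```

The returned beta can be plotted over time to see which sensor modalities drive a prediction, which is the interpretability angle the citation highlights.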
“…We also compare with a variant of the BANet with a fully-connected layer used in the temporal attention computation (BANet-dense) instead of a 1 × 1 convolution layer. In addition, we compare our work with the approach used in related HAR studies [19], [20], [21], where the sensor attention was computed before the extraction of temporal information. As such, we create a variant (BANet-compat, for BANet compatibility version) where the computation of body attention was done at the input level instead of at the feature fusion level, with the same attention algorithms presented in the last section.…”
Section: A. Comparison Experiments
Citation type: mentioning
confidence: 99%
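To make the contrast between the two scoring layers concrete, the sketch below computes per-timestep attention logits once with a 1 × 1 convolution (BANet-style) and once with a fully-connected layer (BANet-dense-style); the tensor shapes and channel count are assumptions. Applied timestep by timestep to the same 1-D feature map, the two parameterizations compute closely related mappings, so the variants compared in the citation differ mainly in where the attention is placed and which features it scores.

```python
import torch
import torch.nn as nn

# Hypothetical feature map: (batch, channels, timesteps); sizes are illustrative.
feat = torch.randn(4, 64, 25)

# Option A: temporal attention logits from a 1x1 convolution.
conv_score = nn.Conv1d(in_channels=64, out_channels=1, kernel_size=1)
scores_conv = conv_score(feat).squeeze(1)                     # (4, 25)

# Option B: the same kind of logits from a fully-connected layer,
# applied to each timestep's feature vector.
dense_score = nn.Linear(64, 1)
scores_dense = dense_score(feat.transpose(1, 2)).squeeze(-1)  # (4, 25)

# Either set of logits becomes temporal attention weights via a softmax over time.
weights = torch.softmax(scores_conv, dim=1)                   # sums to 1 across timesteps
```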