Human activity recognition (HAR) has become a popular research topic because of its wide range of applications. With the development of deep learning, new approaches to HAR problems have emerged. Here, a deep network architecture using residual bidirectional long short-term memory (LSTM) cells is proposed. The new network has two advantages. First, a bidirectional connection concatenates the positive time direction (forward state) and the negative time direction (backward state). Second, residual connections between stacked cells act as highways for gradients, passing underlying information directly to the upper layers and effectively avoiding the vanishing gradient problem. Overall, the proposed network shows improvements in both the temporal dimension (using bidirectional cells) and the spatial dimension (deeply stacked residual connections), aiming to enhance the recognition rate. When tested on the Opportunity data set and the public domain UCI data set, the accuracy increased by 4.78% and 3.68%, respectively, compared with previously reported results. Finally, the confusion matrix of the public domain UCI data set was analyzed.
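To make the two ideas concrete, the following is a minimal sketch of a stacked residual bidirectional recurrent network, not the implementation evaluated in this work. It assumes a plain tanh recurrent cell in place of an LSTM cell to keep the example short, random untrained weights, and a layer width chosen so that the concatenated forward and backward states have the same size as the layer input, which is what allows the residual addition. All function names (`rnn_pass`, `bidirectional`, `residual_bidirectional_stack`) are hypothetical.

```python
import math
import random


def rnn_pass(xs, w_in, w_rec, reverse=False):
    """Run a plain tanh recurrent cell over a sequence (stand-in for an LSTM cell).

    Returns one hidden vector per time step, aligned with the input order.
    """
    hidden = len(w_rec)
    h = [0.0] * hidden
    outputs = []
    steps = reversed(xs) if reverse else xs
    for x in steps:
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden))
            )
            for j in range(hidden)
        ]
        outputs.append(h)
    if reverse:
        outputs.reverse()  # realign backward states with the original time order
    return outputs


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_layer(width, rng):
    """Create weights for one bidirectional layer; each direction gets width // 2 units."""
    hidden = width // 2
    return {
        "fw_in": rand_matrix(hidden, width, rng),
        "fw_rec": rand_matrix(hidden, hidden, rng),
        "bw_in": rand_matrix(hidden, width, rng),
        "bw_rec": rand_matrix(hidden, hidden, rng),
    }


def bidirectional(xs, layer):
    """Concatenate the forward state and the backward state at every time step."""
    fw = rnn_pass(xs, layer["fw_in"], layer["fw_rec"])
    bw = rnn_pass(xs, layer["bw_in"], layer["bw_rec"], reverse=True)
    return [f + b for f, b in zip(fw, bw)]  # list concatenation, not addition


def residual_bidirectional_stack(xs, layers):
    """Stack bidirectional layers, adding each layer's input to its output."""
    for layer in layers:
        ys = bidirectional(xs, layer)
        xs = [[xi + yi for xi, yi in zip(x, y)] for x, y in zip(xs, ys)]
    return xs


# Toy input: 5 time steps with 4 features each (assumed sizes, for illustration only).
rng = random.Random(0)
seq = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
layers = [make_layer(4, rng) for _ in range(3)]
out = residual_bidirectional_stack(seq, layers)
```

In this sketch the residual path is the element-wise sum `xi + yi`: each layer only has to learn a correction to its input, and the identity term gives gradients a direct route to lower layers. A trained model would replace the tanh cell with LSTM cells and add a classification layer on top of `out`.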
Introduction
In real life, many problems can be described as time series problems. Human activity recognition (HAR) is one such problem and is of value in both theoretical research and practical applications. It is widely used, for example in health monitoring [1][2], smart homes [3][4], and human-computer interaction [5][6]. LSTM cells are a good choice for solving HAR problems. Unlike traditional algorithms, LSTM can capture relationships in data along the temporal dimension without mixing the time steps together as a 1D convolutional neural network (CNN) would. As more of what is commonly called "big data" emerges, LSTM architectures can offer great performance and many potential applications. More specifically, HAR is the process of obtaining action data with sensors, symbolizing the action information, and then understanding and extracting the motion characteristics; this is what activity recognition refers to. Because of the spatial complexity and temporal divergence of behavior, there is no unified recognition method. A public domain benchmark of HAR has been introduced, and different recognition methods have been analyzed [7]. The results showed that the K-Nearest Neighbor (KNN) algorithm outperformed other algorithms in most recognition tasks. The Support Vector Machine (SVM) is another outstanding algorithm. A Multi-Class Hardware-Friendly Support Vector Machine (MC-HF-SVM), which uses fixed-point arithmetic for HAR instead of the typical floating-point arithmetic, has been proposed for sensor data [8]. Unlike the manually filtered features used in previous algorithms, a systematic feature learning method that combines feature extraction with CNN training has also been proposed [9]. Subsequently, DeepConvLSTM networks outperformed previous algorithms in the Opportunity Challenge by an average of 4% in F1 score [10]; the effects of parameters on the final result were...