We propose a novel data-driven feature extraction approach based on direct causality and fuzzy temporal windows (FTWs) to improve the precision of human activity recognition and to mitigate the problems of easily confused activities and unlabeled data, which significantly degrade classification performance owing to the correlation of labeled data. When recognizing activities, the proposed approach considers both the importance of the oncoming short-term sensor data and the continuity from past activities captured in the preceding long-term sensor data. For the oncoming data, a causality feature is extracted using direct transfer entropy; it quantifies the causal relationship between sensor activations and thereby captures the unique pattern of an activity. For the preceding data, several hours of historical data are compressed into fuzzy features based on FTWs. The causality and fuzzy features are then fused by matrix multiplication to express the distinct features of activities. To effectively learn the spatiotemporal dependencies of the fused feature, we used a deep long short-term memory (LSTM) network, a two-dimensional convolutional neural network (2D-CNN), and hybrid models that combine LSTM and CNN. Leave-one-day-out cross-validation was performed on the CASAS open datasets Aruba, Cairo, and Milan. The macro-F1-scores improved by 16.4%, 37.5%, and 18.5%, respectively, compared with those of the FTW-only environments. In addition, the proposed approach improved the precision of activity recognition and mitigated the problems associated with environments containing unlabeled data.
INDEX TERMS: activity of daily living, activity recognition, causality, convolutional neural network, deep learning, fuzzy feature, long short-term memory.
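The fusion step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the trapezoidal membership shape, the window boundaries (in seconds), the use of the maximum membership per sensor and window, and the names `trapezoid`, `fuzzy_features`, `matmul`, and `fuse`. The causality matrix is taken as given; computing direct transfer entropy is outside the scope of this sketch.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], linear ramps on (a, b) and (c, d), else 0."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def fuzzy_features(events, now, windows, n_sensors):
    """Compress past sensor events into an (n_sensors x n_windows) matrix.

    events  : list of (timestamp, sensor_index) activations (hypothetical format)
    windows : list of (a, b, c, d) trapezoids over the event age `now - timestamp`
    Each entry is the largest membership of any activation of that sensor
    in that window (an assumed aggregation rule).
    """
    features = [[0.0] * len(windows) for _ in range(n_sensors)]
    for timestamp, sensor in events:
        age = now - timestamp
        for w, (a, b, c, d) in enumerate(windows):
            features[sensor][w] = max(features[sensor][w], trapezoid(age, a, b, c, d))
    return features


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def fuse(causality, fuzzy):
    """Fuse an (S x S) causality matrix with an (S x W) fuzzy feature matrix."""
    return matmul(causality, fuzzy)


# Illustrative values only: two sensors, two temporal windows.
windows = [(0, 0, 60, 120), (60, 120, 600, 1200)]
events = [(970, 0), (910, 1)]          # sensor 0 fired 30 s ago, sensor 1 fired 90 s ago
fuzzy = fuzzy_features(events, now=1000, windows=windows, n_sensors=2)
causality = [[0.0, 0.2], [0.4, 0.0]]   # placeholder for a transfer-entropy matrix
fused = fuse(causality, fuzzy)
```

The fused matrix keeps one row per sensor and one column per temporal window, so it can be fed to a sequence or convolutional model as a two-dimensional input.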
Residential electricity load data can include numerous types of bad data, even clustered bad data, as they are typically captured by simple measurement instruments. For example, in a time series containing a run of Not-a-Number (NaN) errors, the value immediately before or after the run may appear as the sum of the actual values over the duration of the run. To use load data containing such errors for prediction or data mining analysis, customized detection and imputation are required. This study proposes a new joint detection and imputation method for handling clustered bad data in residential electricity loads. Examples of such data are known invalid data points, such as consecutive NaN or zero values, that are preceded or followed by an outlier. The proposed scheme first investigates the neighbors of the invalid data points using probabilistic forecasting techniques, which are applied to the next valid neighbors to determine whether they are anomalous. Adaptive imputation is then applied according to the detection result, which determines whether the candidate neighbor should be imputed together with the invalid points. To assess the potential of the proposed scheme to characterize clustered bad data, we analyzed the electricity loads of 354 households. Moreover, joint detection and imputation were tested on randomly injected synthetic clustered bad data, consisting of NaN runs of various lengths followed by the sum of the actual values hidden by the NaNs. The proposed scheme detected clustered bad data with an accuracy of 95.5% and a false alarm rate of 3.6% across all households in the dataset. Outlier detection-assisted imputation schemes were evaluated for NaNs with optional outliers. The results demonstrate that these schemes significantly improve the overall accuracy compared with schemes without outlier detection.
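The joint detect-then-impute idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the method evaluated in the study: the probabilistic forecast is replaced by a median of recent valid values with a fixed multiplier (`factor`), an anomalous neighbor is assumed to hold the accumulated sum and is spread evenly over the gap, and the function names and parameters are hypothetical.

```python
import math
from statistics import median


def find_nan_runs(series):
    """Return half-open (start, end) index pairs for each run of consecutive NaNs."""
    runs, start = [], None
    for i, value in enumerate(series):
        if math.isnan(value):
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(series)))
    return runs


def joint_detect_impute(series, history=4, factor=3.0):
    """Detect whether the valid neighbor after each NaN run is an accumulated outlier,
    then impute accordingly. Returns (imputed_series, flags), where each flag is
    (start, end, is_accumulated).
    """
    out = list(series)
    flags = []
    for start, end in find_nan_runs(series):
        prior = [v for v in out[max(0, start - history):start] if not math.isnan(v)]
        if end >= len(out) or not prior:
            continue  # no valid neighbor or no context: leave the gap untouched
        baseline = median(prior)             # stand-in for a probabilistic forecast
        neighbor = out[end]
        is_accumulated = neighbor > factor * baseline
        flags.append((start, end, is_accumulated))
        n = end - start
        if is_accumulated:
            # Assumed rule: the neighbor holds the sum of the hidden values plus its
            # own, so spread it evenly over the gap and the neighbor itself.
            share = neighbor / (n + 1)
            for i in range(start, end + 1):
                out[i] = share
        else:
            # Neighbor looks normal: keep it and linearly interpolate the gap.
            left = out[start - 1]
            step = (neighbor - left) / (n + 1)
            for k in range(n):
                out[start + k] = left + step * (k + 1)
    return out, flags


# Illustrative series: a two-point gap followed by an accumulated value (4.5),
# then a one-point gap followed by a normal value.
nan = float("nan")
series = [1.0, 1.2, 1.1, nan, nan, 4.5, 1.0, nan, 1.2]
imputed, flags = joint_detect_impute(series)
```

Spreading the accumulated value preserves the total energy over the gap, which is the property that distinguishes this case from ordinary interpolation; a real implementation would replace the even split with a load-shape-aware allocation.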