Monitoring complex industrial plants is essential for plant management, reliability, and safety, and for maintaining the desired product quality. Early detection of abnormal events makes it possible to act before more serious consequences arise, improving system performance and reducing manufacturing costs. In this work, a new methodology for fault detection is introduced, based on time series models and multivariate statistical process control (MSPC). The proposal explicitly accounts for both the dynamic and the non-linear properties of the system. A dynamic feature selection is carried out to interpret the dynamic relations by characterizing the auto- and cross-correlations of every variable. A time-series-based modeling framework is then used to obtain and validate the best descriptive model of the plant (either linear or non-linear). Fault detection is based on finding anomalies in the temporal residual signals obtained from the models, using univariate and multivariate statistical process control charts. Finally, the method is validated on two benchmarks: a wastewater treatment plant and the Tennessee Eastman Plant. A comparison with other classical methods demonstrates the superior performance and feasibility of the proposed monitoring scheme.
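The residual-based detection idea described above can be illustrated with a minimal single-variable sketch. It is not the authors' implementation: the AR(2) process, the least-squares fit, the simulated sensor bias, and the 3-sigma Shewhart limits are all assumptions chosen for illustration, and the names `fit_ar`, `ar_residuals`, `shewhart_alarms`, and `simulate` are hypothetical.

```python
import numpy as np


def lagged_matrix(series, order):
    """Stack the `order` previous values of each sample as regressor columns."""
    n = len(series)
    return np.column_stack(
        [series[order - k - 1 : n - k - 1] for k in range(order)]
    )


def fit_ar(series, order):
    """Least-squares fit of an AR(order) model; returns (intercept, coeffs)."""
    X = lagged_matrix(series, order)
    X = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X, series[order:], rcond=None)
    return beta[0], beta[1:]


def ar_residuals(series, intercept, coeffs):
    """One-step-ahead prediction errors of the fitted model."""
    order = len(coeffs)
    prediction = intercept + lagged_matrix(series, order) @ coeffs
    return series[order:] - prediction


def shewhart_alarms(residuals, mean, std, width=3.0):
    """Flag residuals outside mean +/- width * std (univariate chart)."""
    return np.abs(residuals - mean) > width * std


def simulate(n, rng, fault_at=None, fault_size=0.0):
    """Assumed toy process: stable AR(2) with an optional sensor bias."""
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + rng.normal(scale=0.1)
    if fault_at is not None:
        y[fault_at:] += fault_size  # additive step fault on the measurement
    return y


rng = np.random.default_rng(0)
order = 2
fault_at = 300

# 1) Identify the model and the control limits on fault-free data.
train = simulate(2000, rng)
intercept, coeffs = fit_ar(train, order)
train_res = ar_residuals(train, intercept, coeffs)
res_mean, res_std = train_res.mean(), train_res.std()

# 2) Monitor new data that contains a fault from sample `fault_at` onward.
test = simulate(600, rng, fault_at=fault_at, fault_size=1.0)
test_res = ar_residuals(test, intercept, coeffs)
alarms = shewhart_alarms(test_res, res_mean, res_std)

# Residual index i corresponds to time step i + order.
first_alarm = int(np.argmax(alarms[fault_at - order :])) + fault_at
print("first alarm at or after the fault, time step:", first_alarm)
```

The control limits are estimated only from fault-free data, so a fault shows up as a sustained shift in the residuals rather than being absorbed into the model. A multivariate chart would monitor several residual signals jointly instead of one at a time.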
Time series data are becoming increasingly important due to the interconnectedness of the world. Classical problems keep growing in size and require ever more resources to process, and Big Data technologies offer many solutions. Although the principal algorithms for traditional vector-based problems are available in Big Data environments, the lack of tools for time series processing in these environments needs to be addressed. In this work, we propose a scalable and distributed time series transformation for Big Data environments based on well-known time series features (SCMFTS), which allows practitioners to apply traditional vector-based algorithms to time series problems. The proposed transformation, together with the algorithms available in Spark, improved on the best state-of-the-art results for the Wearable Stress and Affect Detection dataset, which is the biggest publicly available multivariate time series dataset in the University of California Irvine (UCI) Machine Learning Repository. In addition, SCMFTS showed a linear relationship between its runtime and the number of processed time series, demonstrating the linearly scalable behavior that is mandatory in Big Data environments. SCMFTS has been implemented in the Scala programming language for the Apache Spark framework, and the code is publicly available.
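The idea of turning each time series into a fixed-length feature vector can be sketched as follows. This is a single-machine NumPy illustration, not the SCMFTS code (which is written in Scala for Spark). The six features, and the names `lag1_autocorr`, `extract_features`, and `transform`, are assumptions chosen for illustration; the abstract does not list the features actually used.

```python
import numpy as np


def lag1_autocorr(x):
    """Lag-1 autocorrelation; defined as 0.0 for a constant series."""
    d = x - x.mean()
    denom = float(np.dot(d, d))
    if denom == 0.0:
        return 0.0
    return float(np.dot(d[:-1], d[1:]) / denom)


def extract_features(x):
    """Map one univariate series of any length to a fixed-length vector."""
    x = np.asarray(x, dtype=float)
    return np.array(
        [
            x.mean(),
            x.std(),
            x.min(),
            x.max(),
            lag1_autocorr(x),
            np.mean(np.abs(np.diff(x))),  # mean absolute change
        ]
    )


def transform(dataset):
    """Turn a collection of series into an (n_series, n_features) matrix."""
    return np.vstack([extract_features(ts) for ts in dataset])


# Series of different lengths all map to rows of the same width.
series = [
    np.sin(np.linspace(0.0, 10.0, 200)),
    np.random.default_rng(0).normal(size=150),
    np.ones(50),
]
features = transform(series)
print(features.shape)
```

Because each series is processed independently, the per-series function can be applied in parallel across partitions of a distributed dataset, and the resulting matrix can be passed to any vector-based classifier.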