To improve data quality, and the decisions that depend on it in the design, operation, and management of water resources, this research develops an outlier detection method for hydrologic time series that identifies data deviating from historical patterns. The method first builds a forecasting model on historical data and then uses it to predict future values. An observation is treated as anomalous if it falls outside a given prediction confidence interval (PCI), which is calculated from the predicted value and a confidence coefficient. The PCI is used as the threshold mainly because it accounts for the uncertainty in the data series parameters of the forecasting model, which addresses the problem of selecting a suitable threshold. The method performs fast, incremental evaluation of data as it becomes available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different real-world hydrologic time series showed that the proposed method is fast, correctly identifies abnormal data, and can be used for hydrologic time series analysis.
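The PCI test described in the abstract above can be illustrated with a minimal sketch. The abstract does not name the forecasting model, so this sketch assumes a sliding-window mean as a stand-in predictor and the window's sample standard deviation as a stand-in uncertainty estimate. The function name `detect_outliers` and the parameters `window` and `z` are hypothetical.

```python
from statistics import mean, stdev


def detect_outliers(series, window=5, z=1.96):
    """Return indices whose observed value falls outside the PCI.

    Assumptions (not from the paper): the forecast is the mean of the
    previous `window` values, and the interval half-width is `z` times
    the sample standard deviation of that window.
    """
    flagged = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        predicted = mean(history)   # stand-in forecasting model
        spread = stdev(history)     # stand-in uncertainty estimate
        lower = predicted - z * spread
        upper = predicted + z * spread
        if not (lower <= series[t] <= upper):
            flagged.append(t)
    return flagged


# Hypothetical water-level readings with one spike at index 6.
levels = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.8]
flagged = detect_outliers(levels)
```

Each point is evaluated using only earlier values, which matches the incremental evaluation the abstract describes. In this simple sketch a flagged value stays in the history window and widens later intervals; a fuller implementation might substitute the predicted value for it.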
Ensuring the quality of hydrological data has become a key issue in the field of hydrology. Based on the characteristics of hydrological data, this paper proposes a data-driven quality control method. For continuous hydrological time series data, two combined forecasting models and one statistical control model are constructed from horizontal, vertical, and statistical perspectives, and the three models provide three confidence intervals. A suspicious level is assigned according to the number of confidence intervals that a data point violates; the data are controlled accordingly, and suggested values are provided for suspicious and missing data. For discrete hydrological data with large spatiotemporal differences, a similarity-weighted topological map of the neighboring stations is established, centered on the hydrological station under test, and adjusted continuously with seasonal changes. Finally, a spatial interpolation model is established to check the data. The experimental results show that the proposed quality control method can effectively detect and control the data, find suspicious and erroneous data, and provide suggested values.
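The rule that sets the suspicious level from the number of violated confidence intervals can be sketched as follows. The three intervals are taken as given, since the abstract does not specify how the forecasting and statistical models produce them. The function names are hypothetical, and averaging the interval midpoints as a suggested value is an assumption made only for illustration.

```python
def suspicious_level(value, intervals):
    """Count how many (lower, upper) confidence intervals the value violates."""
    return sum(1 for lower, upper in intervals if not (lower <= value <= upper))


def suggested_value(intervals):
    """Assumed replacement value: the mean of the interval midpoints."""
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(midpoints) / len(midpoints)


# Hypothetical intervals from the horizontal, vertical, and statistical models.
intervals = [(9.0, 11.0), (9.5, 10.5), (8.0, 12.0)]

level_ok = suspicious_level(10.0, intervals)       # inside all three
level_doubtful = suspicious_level(11.5, intervals)  # outside two
level_bad = suspicious_level(13.0, intervals)       # outside all three
replacement = suggested_value(intervals)
```

A level of 0 means every model accepts the value, while 3 means all three reject it, so the level gives a graded measure of suspicion rather than a single pass/fail flag.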
Symbolic Aggregate approximation (SAX) is a classical symbolic approach used in many time series data mining applications. However, SAX reflects only the mean value of each segment and misses other important information, namely the trend of the value change within the segment. This omission may cause misclassification in some cases, since the SAX representation cannot distinguish time series with similar average values but different trends. In this paper, we present Trend Feature Symbolic Aggregate approximation (TFSAX) to solve this problem. First, we use the Piecewise Aggregate Approximation (PAA) approach to reduce dimensionality and discretize the mean value of each segment by SAX. Second, we extract the trend feature in each segment using a trend distance factor and a trend shape factor. Then, we design multi-resolution symbolic mapping rules to discretize the trend information into symbols. We also propose a modified distance measure that integrates the SAX distance with a weighted trend distance. We show that our distance measure has a tighter lower bound to the Euclidean distance than that of the original SAX. Experimental results on diverse time series data sets demonstrate that the proposed representation significantly outperforms the original SAX representation and an improved SAX representation for classification.
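The limitation of plain SAX described above can be seen in a small sketch. Two series with the same segment means but opposite trends map to the same SAX word. The `trend_word` function is a deliberately simplified stand-in (sign of last value minus first value per segment) and is not the trend distance factor or trend shape factor used by TFSAX. All function names are hypothetical, and the segment count is assumed to divide the series length evenly.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Standard Gaussian breakpoints for a SAX alphabet of size 4.
BREAKPOINTS = [-0.67, 0.0, 0.67]
ALPHABET = "abcd"


def znorm(series):
    """Z-normalize a series (assumes it is not constant)."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def split(series, n_segments):
    """Split into equal-length segments (assumes an even division)."""
    size = len(series) // n_segments
    return [series[i * size:(i + 1) * size] for i in range(n_segments)]


def sax_word(series, n_segments):
    """PAA segment means of the z-normalized series, mapped to symbols."""
    return "".join(
        ALPHABET[bisect_right(BREAKPOINTS, mean(seg))]
        for seg in split(znorm(series), n_segments)
    )


def trend_word(series, n_segments):
    """Simplified trend symbol per segment: u (up), d (down), f (flat)."""
    symbols = []
    for seg in split(series, n_segments):
        delta = seg[-1] - seg[0]
        symbols.append("u" if delta > 0 else "d" if delta < 0 else "f")
    return "".join(symbols)


rising = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
falling = [3.0, 2.0, 1.0, 9.0, 8.0, 7.0]
same_sax = sax_word(rising, 2) == sax_word(falling, 2)
different_trend = trend_word(rising, 2) != trend_word(falling, 2)
```

Both series produce the SAX word `ad`, yet their trend words differ (`uu` versus `dd`), which is the kind of information a trend-aware representation is meant to retain.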
The accurate and timely estimation of river discharge plays an important role in hydrological modeling, especially for avoiding the consequences of flood events. The majority of existing work on hydrologic prediction focuses on modeling the inherent physical process of specific river basins, while the geographic connections between rivers are largely ignored. Geographically connected rivers provide rich spatial information that can be used to predict discharge amounts. In this paper, we study the novel problem of exploiting both temporal patterns and spatial connections for hydrological prediction. We construct three relationship graphs for the hydrological gauges in the study area: the hydraulic distance graph, the Euclidean distance graph, and the correlation graph. We fuse these graphs into one hydrological network graph and propose a novel framework, ST-Hydro, which uses Graph Convolutional Networks (GCN) to learn spatial feature representations and, simultaneously, Recurrent Neural Networks with carefully designed activation functions to capture temporal features for hydrological prediction. Experimental results on a real-world data set demonstrate that the proposed framework can predict river discharge effectively and at an early stage. Index Terms: hydrologic prediction, spatial and temporal modeling, graph convolutional networks.
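The spatial part of the idea above, fusing several gauge graphs and passing features over the result, can be sketched in plain Python. The abstract does not state how the three graphs are fused, so a weighted average of adjacency matrices is assumed here. The propagation step uses the common GCN normalization with self-loops, D^(-1/2) (A + I) D^(-1/2), and omits learned weights, the activation, and the recurrent temporal component. All names and numbers are hypothetical.

```python
from math import sqrt


def fuse(graphs, weights):
    """Assumed fusion rule: weighted average of adjacency matrices."""
    n = len(graphs[0])
    return [
        [sum(w * g[i][j] for g, w in zip(graphs, weights)) for j in range(n)]
        for i in range(n)
    ]


def normalize(adj):
    """Symmetric normalization with self-loops, as commonly used in GCNs."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def propagate(norm_adj, features):
    """One propagation step: each gauge aggregates its neighbors' features."""
    n, d = len(features), len(features[0])
    return [
        [sum(norm_adj[i][k] * features[k][j] for k in range(n))
         for j in range(d)]
        for i in range(n)
    ]


# Hypothetical 3-gauge graphs (1.0 = connected under that relationship).
hydraulic = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
euclidean = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
correlation = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

fused = fuse([hydraulic, euclidean, correlation], [1 / 3, 1 / 3, 1 / 3])
discharge = [[12.0], [30.0], [7.5]]  # one hypothetical feature per gauge
smoothed = propagate(normalize(fused), discharge)
```

In the fused graph, gauge pairs linked under all three relationships receive the largest weight, so they influence each other most strongly during propagation.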