s u m m a r yThe skill of a forecast can be assessed by comparing the relative proximity of both the forecast and a benchmark to the observations. Example benchmarks include climatology or a naïve forecast. Hydrological ensemble prediction systems (HEPS) are currently transforming the hydrological forecasting environment but in this new field there is little information to guide researchers and operational forecasters on how benchmarks can be best used to evaluate their probabilistic forecasts. In this study, it is identified that the forecast skill calculated can vary depending on the benchmark selected and that the selection of a benchmark for determining forecasting system skill is sensitive to a number of hydrological and system factors. A benchmark intercomparison experiment is then undertaken using the continuous ranked probability score (CRPS), a reference forecasting system and a suite of 23 different methods to derive benchmarks. The benchmarks are assessed within the operational set-up of the European Flood Awareness System (EFAS) to determine those that are 'toughest to beat' and so give the most robust discrimination of forecast skill, particularly for the spatial average fields that EFAS relies upon.Evaluating against an observed discharge proxy the benchmark that has most utility for EFAS and avoids the most naïve skill across different hydrological situations is found to be meteorological persistency. This benchmark uses the latest meteorological observations of precipitation and temperature to drive the hydrological model. Hydrological long term average benchmarks, which are currently used in EFAS, are very easily beaten by the forecasting system and the use of these produces much naïve skill. When decomposed into seasons, the advanced meteorological benchmarks, which make use of meteorological observations from the past 20 years at the same calendar date, have the most skill discrimination. They are also good at discriminating skill in low flows and for all catchment sizes. Simpler meteorological benchmarks are particularly useful for high flows. Recommendations for EFAS are to move to routine use of meteorological persistency, an advanced meteorological benchmark and a simple meteorological benchmark in order to provide a robust evaluation of forecast skill. This work provides the first comprehensive evidence on how benchmarks can be used in evaluation of skill in probabilistic hydrological forecasts and which benchmarks are most useful for skill discrimination and avoidance of naïve skill in a large scale HEPS. It is recommended that all HEPS use the evidence and methodology provided here to evaluate which benchmarks to employ; so forecasters can have trust in their skill evaluation and will have confidence that their forecasts are indeed better.
This study applies statistical postprocessing to ensemble forecasts of near‐surface temperature, 24 h precipitation totals, and near‐surface wind speed from the global model of the European Centre for Medium‐Range Weather Forecasts (ECMWF). The main objective is to evaluate the evolution of the difference in skill between the raw ensemble and the postprocessed forecasts. Reliability and sharpness, and hence skill, of the former is expected to improve over time. Thus, the gain by postprocessing is expected to decrease. Based on ECMWF forecasts from January 2002 to March 2014 and corresponding observations from globally distributed stations, we generate postprocessed forecasts by ensemble model output statistics (EMOS) for each station and variable. Given the higher average skill of the postprocessed forecasts, we analyze the evolution of the difference in skill between raw ensemble and EMOS. This skill gap remains almost constant over time indicating that postprocessing will keep adding skill in the foreseeable future.
[1] River discharge predictions often show errors that degrade the quality of forecasts. Three different methods of error correction are compared, namely, an autoregressive model with and without exogenous input (ARX and AR, respectively), and a method based on wavelet transforms. For the wavelet method, a Vector-Autoregressive model with exogenous input (VARX) is simultaneously fitted for the different levels of wavelet decomposition; after predicting the next time steps for each scale, a reconstruction formula is applied to transform the predictions in the wavelet domain back to the original time domain. The error correction methods are combined with the Hydrological Uncertainty Processor (HUP) in order to estimate the predictive conditional distribution. For three stations along the Danube catchment, and using output from the European Flood Alert System (EFAS), we demonstrate that the method based on wavelets outperforms simpler methods and uncorrected predictions with respect to mean absolute error, Nash-Sutcliffe efficiency coefficient (and its decomposed performance criteria), informativeness score, and in particular forecast reliability. The wavelet approach efficiently accounts for forecast errors with scale properties of unknown source and statistical structure.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.