A systematic study is performed of a number of scores that can be used for objective validation of probabilistic prediction of scalar variables: rank histograms, and discrete and continuous ranked probability scores (DRPS and CRPS, respectively). The reliability-resolution-uncertainty decomposition, defined by Murphy for the DRPS and extended here to the CRPS, is studied in detail. The decomposition is applied to the results of the ensemble prediction systems of the European Centre for Medium-Range Weather Forecasts and the National Centers for Environmental Prediction, and is compared with the decomposition of the CRPS defined by Hersbach. The possibility of determining an accurate reliability-resolution decomposition of the RPSs is severely limited by the unavoidably (relatively) small number of available realizations of the prediction system. The Hersbach decomposition may be an appropriate compromise between the competing needs for accuracy and practical computability.
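As a concrete illustration of the CRPS discussed above, the sketch below (not taken from the paper; function name and NumPy implementation are illustrative) computes the standard ensemble estimator of the CRPS against a scalar verifying observation, CRPS = E|X − y| − ½ E|X − X′|, where X, X′ are independent draws from the predicted ensemble:

```python
import numpy as np

def crps_ensemble(members, obs):
    """Ensemble estimator of the continuous ranked probability score:
    mean |x_i - y|  minus  0.5 * mean |x_i - x_j| over all member pairs.
    Lower is better; a perfect deterministic forecast scores zero."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

# A collapsed ensemble exactly on the observation scores zero.
print(crps_ensemble([2.0, 2.0, 2.0], 2.0))  # → 0.0
```

In practice the score is averaged over many forecast cases; the reliability-resolution decompositions discussed in the abstract operate on that averaged quantity.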
A verification system has been developed for the ensemble prediction system (EPS) at the Canadian Meteorological Centre (CMC). It provides objective criteria for comparing two EPSs, necessary when deciding whether or not to implement a new or revised EPS. The proposed verification methodology is based on the continuous ranked probability score (CRPS), which evaluates the global skill of an EPS. Its reliability/resolution partition, proposed by Hersbach, is used to measure the two main attributes of a probabilistic system. In addition, the characteristics of reliability are obtained from the first two moments of the reduced centered random variable (RCRV), which define the bias and the dispersion of an EPS. Resampling (bootstrap) techniques have been applied to these scores, defining confidence intervals that express the uncertainty due to the finite number of realizations used to compute the scores. All verifications are performed against observations, to provide more independent validation and to avoid any local systematic bias of an analysis. A revised EPS, tested at the CMC in a parallel run during the autumn of 2005, is described in this paper and compared with the previously operational EPS using the verification system presented above. To illustrate the methodology, results are shown for the temperature at 850 hPa. The confidence intervals are computed taking into account the spatial correlation of the data and the temporal autocorrelation of the forecast error. The revised EPS performs significantly better at all forecast ranges, except for the resolution component of the CRPS, where the improvement is no longer significant from day 7. The significant improvement in reliability is mainly due to better dispersion of the ensemble. Finally, the verification system correctly indicates that variations are not significant when two theoretically similar EPSs are compared.
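The RCRV diagnostics and the bootstrap confidence intervals described above can be sketched as follows. This is an illustrative reading, not the paper's code: a Gaussian observation error is assumed, the function names are hypothetical, and the bootstrap shown is a plain i.i.d. resampling:

```python
import numpy as np

def rcrv_moments(obs, ens_mean, ens_spread, obs_err=0.0):
    """Reduced centered random variable y = (o - m) / sqrt(sigma_o^2 + sigma_e^2),
    with o the observation, m the ensemble mean, sigma_e the ensemble spread
    and sigma_o the assumed observation-error std. The mean of y measures the
    bias of the EPS; its standard deviation measures dispersion (1 is ideal)."""
    obs, ens_mean, ens_spread = map(np.asarray, (obs, ens_mean, ens_spread))
    y = (obs - ens_mean) / np.sqrt(obs_err**2 + ens_spread**2)
    return y.mean(), y.std(ddof=1)

def bootstrap_ci(values, n_boot=2000, alpha=0.05, rng=None):
    """Percentile-bootstrap confidence interval for the mean of a score,
    expressing the uncertainty due to the finite number of realizations."""
    rng = rng or np.random.default_rng(0)
    values = np.asarray(values, dtype=float)
    stats = [rng.choice(values, size=values.size).mean() for _ in range(n_boot)]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))
```

Note that the paper accounts for spatial correlation of the data and temporal autocorrelation of the forecast error, which would call for a block-resampling variant rather than the plain i.i.d. bootstrap sketched here.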
Ensemble prediction systems (EPSs) are usually validated under the assumption that the verifying observations are exact. In this paper, two methods are considered for taking observation errors into account. In the 'perturbed-ensemble' method, which has already been studied by other authors, the predicted ensemble elements are randomly perturbed in a way that is consistent with the assumed observation error. In the 'observational-probability' method, which is new, a verifying observation is considered as defining, together with the assumed associated error, a probability distribution. All standard scores for evaluation of EPSs (reliability diagram, Brier score, ranked probability score (RPS), continuous RPS (CRPS), relative-operating-characteristics (ROC) curve area), with the exception of the rank histogram, remain defined in this second method. In particular, the classical reliability-resolution decomposition of the Brier score, and of its extension to the RPS and CRPS, remains defined. Numerical simulations, partially supported by theoretical considerations, show that, with respect to the case when observation errors are ignored, the perturbed-ensemble method improves reliability, as well as the ROC score, while it has no significant impact on resolution, as measured by the Brier score. The observational-probability method, on the other hand, degrades reliability and the ROC score, but improves resolution. With respect to the 'real' performance of the system (i.e. the one that would be diagnosed if no error were present), reliability is unchanged in the perturbed-ensemble method, while resolution and the ROC score are degraded. The observational-probability method degrades reliability and the ROC score.
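The two methods contrasted above can be sketched for a simple threshold-exceedance event. This is an illustrative reading under a Gaussian observation-error assumption, with hypothetical function names; the perturbed-ensemble method adds error-consistent noise to the members, while the observational-probability method turns the observation itself into a probability that replaces the binary outcome in scores such as the Brier score:

```python
import math
import numpy as np

def perturb_ensemble(members, obs_err_std, rng=None):
    """'Perturbed-ensemble' method: each predicted member receives random
    noise drawn from the assumed (Gaussian) observation-error distribution,
    so forecast and observation are compared on equal terms."""
    rng = rng or np.random.default_rng(0)
    members = np.asarray(members, dtype=float)
    return members + rng.normal(0.0, obs_err_std, size=members.shape)

def observational_probability(obs, threshold, obs_err_std):
    """'Observational-probability' method: the observation and its assumed
    Gaussian error define P(true value > threshold), a probabilistic
    outcome in [0, 1] rather than a binary one."""
    z = (obs - threshold) / obs_err_std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With the second method an observation lying exactly on the threshold yields an event probability of 0.5 instead of an arbitrary 0 or 1, which is why scores other than the rank histogram remain defined.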
As for resolution, an optimum value of the observational error is found, below which resolution is improved. Diagnostics performed on the operational EPS of the Canadian Meteorological Centre confirm the results of the simulations as to the consequences of ignoring observation errors, or, on the contrary, of taking them into account through either of the two methods. The significance of these various results is discussed. This article replaces a previously published version.
A regional ensemble prediction system (REPS) with the limited-area version of the Canadian Global Environmental Multiscale (GEM) model at 15-km horizontal resolution is developed and tested. The total-energy-norm singular vectors (SVs) targeted over northeastern North America are used for initial and boundary perturbations. Two SV perturbation strategies are tested: dry SVs with dry simplified physics, and moist SVs with simplified physics including stratiform condensation and convective precipitation as well as dry processes. Model physics uncertainties are partly accounted for by stochastically perturbing two parameters: the threshold vertical velocity in the trigger function of the Kain-Fritsch deep convection scheme, and the threshold humidity in the Sundqvist explicit scheme. The perturbations are obtained from first-order Markov processes. Short-range ensemble forecasts in summer with 16 members are performed for five different experiments. The experiments employ different perturbation and piloting strategies, and two different surface schemes. Verification focuses on quantitative precipitation forecasts and is done using a range of probabilistic measures. Results indicate that using moist SVs instead of dry SVs has a stronger impact on precipitation than on dynamical fields. Forecast skill for precipitation is greatly influenced by the dominant synoptic weather systems. For stratiform precipitation caused by strong baroclinic systems, the forecast skill is improved in the moist SV experiments relative to the dry SV experiments. For convective precipitation rates in the range 15-50 mm (24 h)⁻¹ produced by weak synoptic baroclinic systems, all experiments exhibit noticeably poorer forecast skill. Skill improvements due to the Interactions between Soil, Biosphere, and Atmosphere (ISBA) surface scheme and stochastic perturbations are also observed.
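The first-order Markov processes used above to perturb physics parameters can be sketched as a standard AR(1) recursion. This is a generic illustration, not the paper's implementation; the decorrelation time, amplitude, and function name are assumed for the example:

```python
import numpy as np

def markov_perturbation(n_steps, tau, sigma, dt=1.0, rng=None):
    """First-order Markov (AR(1)) process with decorrelation time tau and
    stationary standard deviation sigma:
        r[t] = a * r[t-1] + sigma * sqrt(1 - a^2) * eps,   a = exp(-dt / tau).
    Such a time series could scale or offset a parameter like the trigger
    vertical velocity of a deep-convection scheme, giving perturbations
    that vary smoothly in time rather than as white noise."""
    rng = rng or np.random.default_rng(0)
    a = np.exp(-dt / tau)
    r = np.empty(n_steps)
    r[0] = sigma * rng.standard_normal()
    for t in range(1, n_steps):
        r[t] = a * r[t - 1] + sigma * np.sqrt(1.0 - a * a) * rng.standard_normal()
    return r
```

The sqrt(1 − a²) factor keeps the stationary variance equal to sigma², so the perturbation amplitude is controlled independently of the chosen decorrelation time.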