The ensemble Kalman filter (EnKF) is a data assimilation scheme based on the traditional Kalman filter update equation. An ensemble of forecasts is used to estimate the background-error covariances needed to compute the Kalman gain. It is known that if the same observations and the same gain are used to update each member of the ensemble, the ensemble will systematically underestimate analysis-error covariances. This will cause a degradation of subsequent analyses and may lead to filter divergence. For large ensembles, it is known that this problem can be alleviated by treating the observations as random variables, adding random perturbations to them with the correct statistics. Two important consequences of sampling error in the estimate of analysis-error covariances in the EnKF are discussed here. The first results from the analysis-error covariance being a nonlinear function of the background-error covariance in the Kalman filter. Due to this nonlinearity, analysis-error covariance estimates may be negatively biased, even if the ensemble background-error covariance estimates are unbiased. This problem must be dealt with in any Kalman filter-based ensemble data assimilation scheme. A second consequence of sampling error is particular to schemes like the EnKF that use perturbed observations. While this procedure gives asymptotically correct analysis-error covariance estimates for large ensembles, the perturbed observations introduce an additional source of sampling error related to the estimation of the observation-error covariances. In addition to reducing the accuracy of the analysis-error covariance estimate, this extra source of sampling error increases the probability that the analysis-error covariance will be underestimated. Because of this, ensemble data assimilation methods that use perturbed observations are expected to be less accurate than those that do not.
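The first consequence described above (a negative bias that comes from the nonlinearity alone) can be illustrated in the scalar case. The sketch below is not taken from the paper; it assumes a single state variable observed directly, for which the Kalman filter analysis variance is P^a = P^b R / (P^b + R), and it assumes Gaussian background errors. The function name and the default values are hypothetical.

```python
import numpy as np

def mean_estimated_variances(n_members, n_trials, pb_true=1.0, r=1.0, seed=0):
    """Monte Carlo average of ensemble-estimated background and analysis variances.

    Scalar illustration only (assumed setup, not the paper's experiments):
    each trial draws an ensemble from N(0, pb_true), forms the unbiased
    sample variance, and plugs it into the scalar analysis-variance formula.
    """
    rng = np.random.default_rng(seed)
    ens = rng.normal(0.0, np.sqrt(pb_true), size=(n_trials, n_members))
    pb_hat = ens.var(axis=1, ddof=1)        # unbiased background-variance estimate
    pa_hat = pb_hat * r / (pb_hat + r)      # concave (nonlinear) function of pb_hat
    return pb_hat.mean(), pa_hat.mean()

if __name__ == "__main__":
    pa_true = 1.0 * 1.0 / (1.0 + 1.0)
    for n in (5, 50):
        pb_mean, pa_mean = mean_estimated_variances(n, 20000)
        print(n, pb_mean, pa_mean, pa_true)
```

Because the analysis variance is a concave function of the background variance, averaging it over many unbiased background estimates gives a value below the true analysis variance, and the gap shrinks as the ensemble grows.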
Several ensemble filter formulations have recently been proposed that do not require perturbed observations. This study examines a particularly simple implementation called the ensemble square root filter, or EnSRF. The EnSRF uses the traditional Kalman gain for updating the ensemble mean but uses a "reduced" Kalman gain to update deviations from the ensemble mean. There is no additional computational cost incurred by the EnSRF relative to the EnKF when the observations have independent errors and are processed one at a time. Using a hierarchy of perfect model assimilation experiments, it is demonstrated that the elimination of the sampling error associated with the perturbed observations makes the EnSRF more accurate than the EnKF for the same ensemble size.
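A minimal sketch of the update described above, for one observation with a linear observation operator. The abstract does not give the form of the reduced gain; the scaling factor `alpha` used here, 1 / (1 + sqrt(R / (H P^b H^T + R))), is the value that makes the updated perturbations reproduce (I - KH) P^b for a single scalar observation, and should be read as an assumption rather than a quotation. All function and variable names are hypothetical.

```python
import numpy as np

def ensrf_update_single_obs(ens, h, y, r):
    """Deterministic square-root update for one scalar observation.

    ens : (n_members, n_state) background ensemble
    h   : (n_state,) linear observation operator (assumed linear)
    y   : scalar observation
    r   : observation-error variance
    """
    n_members = ens.shape[0]
    mean = ens.mean(axis=0)
    pert = ens - mean                               # deviations from the ensemble mean
    hx_pert = pert @ h                              # perturbations in observation space
    pbht = pert.T @ hx_pert / (n_members - 1)       # P^b H^T, shape (n_state,)
    hpbht = hx_pert @ hx_pert / (n_members - 1)     # H P^b H^T, scalar
    gain = pbht / (hpbht + r)                       # traditional Kalman gain
    alpha = 1.0 / (1.0 + np.sqrt(r / (hpbht + r)))  # reduction factor (assumed form)
    mean_a = mean + gain * (y - mean @ h)           # mean uses the full gain
    pert_a = pert - np.outer(hx_pert, alpha * gain) # deviations use the reduced gain
    return mean_a + pert_a

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    background = rng.normal(size=(20, 3))
    analysis = ensrf_update_single_obs(background, np.array([1.0, 0.0, 0.0]), 0.5, 1.0)
    print(np.cov(analysis, rowvar=False))
```

No random numbers enter the update itself, so the sample covariance of the returned ensemble matches the Kalman filter analysis covariance computed from the sample background covariance, without the extra sampling error that perturbed observations would add.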
Ensemble data assimilation methods assimilate observations using state-space estimation methods and low-rank representations of forecast and analysis error covariances. A key element of such methods is the transformation of the forecast ensemble into an analysis ensemble with appropriate statistics. This transformation may be performed stochastically by treating observations as random variables, or deterministically by requiring that the updated analysis perturbations satisfy the Kalman filter analysis error covariance equation. Deterministic analysis ensemble updates are implementations of Kalman square-root filters. The nonuniqueness of the deterministic transformation used in square-root Kalman filters provides a framework to compare three recently proposed ensemble data assimilation methods.
Rank histograms are a tool for evaluating ensemble forecasts. They are useful for determining the reliability of ensemble forecasts and for diagnosing errors in their mean and spread. Rank histograms are generated by repeatedly tallying the rank of the verification (usually an observation) relative to values from an ensemble sorted from lowest to highest. However, an uncritical use of the rank histogram can lead to misinterpretations of the qualities of the ensemble. For example, a flat rank histogram, usually taken as a sign of reliability, can still be generated from unreliable ensembles. Similarly, a U-shaped rank histogram, commonly understood as indicating a lack of variability in the ensemble, can also be a sign of conditional bias. It is also shown that flat rank histograms can be generated for some model variables if the variance of the ensemble is correctly specified, yet if covariances between model grid points are improperly specified, rank histograms for combinations of model variables may not be flat. Further, if imperfect observations are used for verification, the observational errors should be accounted for; otherwise the shape of the rank histogram may mislead the user about the characteristics of the ensemble. If a statistical hypothesis test is to be performed to determine whether the differences from uniformity of rank are statistically significant, then samples used to populate the rank histogram must be located far enough away from each other in time and space to be considered independent.
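The tallying procedure described above can be sketched in a few lines. This is an illustrative sketch, not code from the paper: it assumes continuous-valued data so that ties between the verification and ensemble members can be ignored, and the function name and array layout are hypothetical.

```python
import numpy as np

def rank_histogram(ensemble, verification):
    """Tally the rank of each verification value within its ensemble.

    ensemble     : (n_cases, n_members) forecast values
    verification : (n_cases,) verifying values (usually observations)
    Returns counts of length n_members + 1. Ties are not handled
    (continuous data are assumed).
    """
    # Rank = number of members below the verification (0 .. n_members).
    ranks = (ensemble < verification[:, None]).sum(axis=1)
    return np.bincount(ranks, minlength=ensemble.shape[1] + 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=5000)
    # Ensemble with too little spread relative to the verification.
    narrow = rng.normal(scale=0.5, size=(5000, 9))
    print(rank_histogram(narrow, truth))
```

In the usage example the ensemble is drawn with half the standard deviation of the verification, so the extreme ranks are overpopulated. As the abstract cautions, the same shape can also arise from conditional bias, so the histogram alone does not identify the cause.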
A hybrid ensemble Kalman filter-three-dimensional variational (3DVAR) analysis scheme is demonstrated using a quasigeostrophic model under perfect-model assumptions. Four networks with differing observational densities are tested, including one network with a data void. The hybrid scheme operates by computing a set of parallel data assimilation cycles, with each member of the set receiving unique perturbed observations. The perturbed observations are generated by adding random noise consistent with observation error statistics to the control set of observations. Background error statistics for the data assimilation are estimated from a linear combination of time-invariant 3DVAR covariances and flow-dependent covariances developed from the ensemble of short-range forecasts. The hybrid scheme allows the user to weight the relative contributions of the 3DVAR and ensemble-based background covariances. The analysis scheme was cycled for 90 days, with new observations assimilated every 12 h. Generally, it was found that the analysis performs best when background error covariances are estimated almost fully from the ensemble, especially when the ensemble size was large. When small-sized ensembles are used, some lessened weighting of ensemble-based covariances is desirable. The relative improvement over 3DVAR analyses was dependent upon the observational data density and norm; generally, there is less improvement for data-rich networks than for data-poor networks, with the largest improvement for the network with the data void. As expected, errors depend on the size of the ensemble, with errors decreasing as more ensemble members are added. The sets of initial conditions generated from the hybrid are generally well calibrated and provide an improved set of initial conditions for ensemble forecasts.
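Two ingredients of the hybrid scheme described above can be sketched compactly: the weighted combination of static and ensemble covariances, and the generation of perturbed observations. The abstract does not give the exact weighting convention, so the form (1 - w) B_static + w P_ens with a single weight `ens_weight` in [0, 1] is an assumption, as are the Gaussian observation noise and all function names.

```python
import numpy as np

def hybrid_covariance(ens, b_static, ens_weight):
    """Linear combination of a static covariance and the ensemble sample covariance.

    ens        : (n_members, n_state) short-range forecast ensemble
    b_static   : (n_state, n_state) time-invariant (3DVAR-like) covariance
    ens_weight : weight on the ensemble covariance, 0 = static only, 1 = ensemble only
                 (assumed convention)
    """
    p_ens = np.cov(ens, rowvar=False)  # flow-dependent sample covariance
    return (1.0 - ens_weight) * b_static + ens_weight * p_ens

def perturb_observations(y, r_cov, n_members, rng):
    """One perturbed copy of the control observations per ensemble member.

    Noise is drawn from N(0, r_cov), i.e. consistent with the assumed
    observation-error covariance.
    """
    noise = rng.multivariate_normal(np.zeros(len(y)), r_cov, size=n_members)
    return y + noise                   # shape (n_members, n_obs)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    forecasts = rng.normal(size=(10, 3))
    print(hybrid_covariance(forecasts, np.eye(3), 0.8))
    print(perturb_observations(np.array([1.0, 2.0]), 0.1 * np.eye(2), 10, rng))
```

With a small ensemble the sample covariance is noisy and rank deficient, which is one way to read the abstract's finding that some weight on the static covariance helps when few members are available.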