Data assimilation (or Bayesian filtering) is a statistical method to find the conditional distribution of the hidden variables of interest given noisy observations from nature. In application, the hidden variables of interest can be the state variables that are directly or indirectly observed or can even be some unobserved parameters in the models. In practice, data assimilation is typically realized by numerical schemes that produce conditional statistics of the state variables of interests, accounting for the information from the observations, rather than the corresponding conditional distribution; this gives a reasonable justification why we called it a "statistical method". When observations are available at discrete times, Bayesian filtering is an iterative predictor-corrector scheme that adjusts the prior forecast (background) statistical estimates from a predictor (or dynamical model) to be more consistent with the current observations. This correction step is referred to as analysis in the atmospheric and ocean science (AOS) community. Subsequently, the posterior (corrected or analysis) statistical estimates are fed into the model as initial conditions for future time prior statistical estimates.While the typical problems of interest are nonlinear, non-Gaussian, and high-dimensional, many practical data assimilation schemes that are currently used rely on Gaussian and/or linear assumptions. In particular, most practical data assimilation schemes are some type of approximation of the celebrated Kalman filter [1], which is the optimal solution (in the least squares sense) of the Bayesian filtering problem under linear and Gaussian assumptions. Essentially, all of these approximations were introduced to reduce the computational cost and to improve the statistical predictions. For example, in the AOS data assimilation community, two important schemes are: (i) the ensemble Kalman filtering methods [2,3,4,5,6,7,8,9, 10] which rely on empirical statistical estimates from ensemble forecasts; (ii) variational-based methods [11,12,13,14] that rely on linear tangent and adjoint models. Operationally, most of the weather prediction centers, including the European Center for Medium-range Weather Forecasts (ECMWF), the UK Met Office, and the National Centers for Environmental Prediction (NCEP), are adopting hybrid approaches, taking advantages from both the ensemble and variational based methods [15,16,17,18,19]. While these approximations were introduced for practical consideration, theoretical understanding of the convergence of these methods in idealistic settings were established [20,21,22] for ensemble Kalman filter and for variational based methods [23,24,25]. We should also mention that while these approximate methods can provide reasonable estimates of the first-order statistics, recent study [26] suggested that one should be cautious in interpreting their second order statistical estimates.In the numerical weather forecasting applications, these approximate filters are routinely used to assimilate observation...