Spatially distributed hydrologic models are increasingly used to study and predict soil moisture flow, groundwater recharge, surface runoff, and river discharge. The usefulness and applicability of such complex models are increasingly held back by the potentially many hundreds (thousands) of parameters that require calibration against some historical record of data. The current generation of search and optimization algorithms is typically not powerful enough to handle a very large number of variables and to summarize parameter and predictive uncertainty. We have previously presented a general-purpose Markov chain Monte Carlo (MCMC) algorithm for Bayesian inference of the posterior probability density function of hydrologic model parameters. This method, entitled differential evolution adaptive Metropolis (DREAM), runs multiple different Markov chains in parallel and uses a discrete proposal distribution to evolve the sampler to the posterior distribution. The DREAM approach maintains detailed balance and shows excellent performance on complex, multimodal search problems. Here we present our latest algorithmic developments and introduce MT-DREAM(ZS), which combines the strengths of multiple-try sampling, snooker updating, and sampling from an archive of past states. This new code is especially designed to solve high-dimensional search problems and achieves particularly large performance improvements over other adaptive MCMC approaches when using distributed computing. Four different case studies with increasing dimensionality up to 241 parameters are used to illustrate the advantages of MT-DREAM(ZS).
Citation: Laloy, E., and J. A. Vrugt (2012), High-dimensional posterior exploration of hydrologic models using multiple-try DREAM(ZS) and high-performance computing, Water Resour. Res., 48, W01526.
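The core DREAM-family idea, building jump proposals from differences of states drawn from an archive of past states, can be sketched in minimal form. Everything below (the toy bimodal target, chain count, per-iteration archive growth) is an illustrative assumption, not the paper's setup; the multiple-try and snooker updates of MT-DREAM(ZS) are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(x):
    # Toy bimodal target (stand-in for a hydrologic model posterior).
    return np.logaddexp(-0.5 * np.sum((x - 2.0) ** 2),
                        -0.5 * np.sum((x + 2.0) ** 2))

def demc_zs(log_post, d=2, n_chains=3, n_iter=2000, archive_init=50):
    # Archive Z of past states; jumps use differences of two randomly
    # chosen archive members (sampling-from-the-past DE-MC style update).
    Z = rng.normal(size=(archive_init, d))
    X = Z[:n_chains].copy()                 # current chain states
    lp = np.array([log_post(x) for x in X])
    gamma = 2.38 / np.sqrt(2 * d)           # standard DE jump scale
    samples = []
    for _ in range(n_iter):
        for i in range(n_chains):
            r1, r2 = rng.choice(len(Z), size=2, replace=False)
            prop = X[i] + gamma * (Z[r1] - Z[r2]) + 1e-6 * rng.normal(size=d)
            lp_prop = log_post(prop)
            if np.log(rng.random()) < lp_prop - lp[i]:  # Metropolis rule
                X[i], lp[i] = prop, lp_prop
            samples.append(X[i].copy())
        Z = np.vstack([Z, X])               # append current states to archive
    return np.array(samples)

samples = demc_zs(log_post)
```

Because jump directions come from differences of archived states, the proposal scale adapts automatically to the shape and spread of the posterior, which is what makes archive-based updating attractive in high dimensions.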
Probabilistic inversion within a multiple‐point statistics framework is often computationally prohibitive for high‐dimensional problems. To partly address this, we introduce and evaluate a new training‐image based inversion approach for complex geologic media. Our approach relies on a deep neural network of the generative adversarial network (GAN) type. After training using a training image (TI), our proposed spatial GAN (SGAN) can quickly generate 2‐D and 3‐D unconditional realizations. A key characteristic of our SGAN is that it defines a (very) low‐dimensional parameterization, thereby allowing for efficient probabilistic inversion using state‐of‐the‐art Markov chain Monte Carlo (MCMC) methods. In addition, available direct conditioning data can be incorporated within the inversion. Several 2‐D and 3‐D categorical TIs are first used to analyze the performance of our SGAN for unconditional geostatistical simulation. Training our deep network can take several hours. After training, realizations containing a few million pixels/voxels can be produced in a matter of seconds. This makes the approach especially useful for simulating many thousands of realizations (e.g., for MCMC inversion), since the training cost per realization diminishes with the number of realizations considered. Synthetic inversion case studies involving 2‐D steady state flow and 3‐D transient hydraulic tomography with and without direct conditioning data are used to illustrate the effectiveness of our proposed SGAN‐based inversion. For the 2‐D case, the inversion rapidly explores the posterior model distribution. For the 3‐D case, the inversion recovers model realizations that fit the data close to the target level and visually resemble the true model well.
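The workflow above (trained generator mapping a low-dimensional latent space to model realizations, then MCMC in that latent space) can be sketched with stand-ins. The random tanh "generator", row-average "forward model", grid size, and noise level below are all hypothetical placeholders for the trained SGAN and the flow solver, chosen only to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for a trained SGAN generator: a fixed random basis
# mapping an 8-dimensional latent vector to a 16 x 16 "geologic" field.
d_latent, nx = 8, 16
W = rng.normal(size=(d_latent, nx * nx)) / np.sqrt(d_latent)

def generator(z):
    return np.tanh(z @ W).reshape(nx, nx)   # field values in (-1, 1)

def forward(field):
    # Toy linear observable standing in for the flow solver: row averages.
    return field.mean(axis=1)

z_true = rng.normal(size=d_latent)
obs = forward(generator(z_true)) + 0.01 * rng.normal(size=nx)

def log_post(z):
    resid = forward(generator(z)) - obs
    # Gaussian likelihood (noise sd 0.01) plus standard normal latent prior.
    return -0.5 * np.sum(resid ** 2) / 0.01 ** 2 - 0.5 * np.sum(z ** 2)

# Random-walk Metropolis in the low-dimensional latent space.
z = np.zeros(d_latent)
lp = log_post(z)
chain = []
for _ in range(5000):
    zp = z + 0.1 * rng.normal(size=d_latent)
    lpp = log_post(zp)
    if np.log(rng.random()) < lpp - lp:
        z, lp = zp, lpp
    chain.append(z.copy())
chain = np.array(chain)
```

The key benefit shown here is dimensionality: the sampler explores 8 latent parameters rather than 256 pixels, while every proposal decoded by the generator remains consistent with the (here trivial) prior model statistics.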
This study reports on two strategies for accelerating posterior inference of a highly parameterized and CPU-demanding groundwater flow model. Our method builds on previous stochastic collocation approaches, e.g., Marzouk and Xiu (2009) and Marzouk and Najm (2009), and uses generalized polynomial chaos (gPC) theory and dimensionality reduction to emulate the output of a large-scale groundwater flow model. The resulting surrogate model is CPU efficient and serves to explore the posterior distribution at a much lower computational cost using two-stage MCMC simulation. The case study reported in this paper demonstrates a two- to fivefold speed-up in sampling efficiency.
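The two-stage (delayed acceptance) MCMC scheme can be sketched as follows: each proposal is first screened with the cheap surrogate, and only proposals that pass this screen trigger a full-model evaluation, with a second acceptance test that restores exactness. The quadratic "full model" and slightly biased "surrogate" below are illustrative stand-ins, not the groundwater model or the gPC emulator.

```python
import numpy as np

rng = np.random.default_rng(2)

def full_model(x):
    # Stand-in for the CPU-demanding model's log-posterior.
    return -0.5 * np.sum(x ** 2)

def surrogate(x):
    # Stand-in for the cheap gPC surrogate: slightly biased approximation.
    return -0.5 * np.sum(x ** 2) * 1.02

def two_stage_mcmc(n_iter=4000, d=4, step=0.5):
    x = np.zeros(d)
    lp_full, lp_sur = full_model(x), surrogate(x)
    n_full_evals = 0
    samples = []
    for _ in range(n_iter):
        xp = x + step * rng.normal(size=d)
        lp_sur_p = surrogate(xp)
        # Stage 1: screen the proposal with the surrogate only.
        if np.log(rng.random()) < lp_sur_p - lp_sur:
            # Stage 2: evaluate the full model; the correction ratio
            # (full ratio minus surrogate ratio) preserves the exact target.
            lp_full_p = full_model(xp)
            n_full_evals += 1
            if np.log(rng.random()) < (lp_full_p - lp_full) - (lp_sur_p - lp_sur):
                x, lp_full, lp_sur = xp, lp_full_p, lp_sur_p
        samples.append(x.copy())
    return np.array(samples), n_full_evals

samples, n_full = two_stage_mcmc()
```

The speed-up comes from stage 1: proposals the surrogate rejects never touch the expensive model, so `n_full` is typically well below the number of iterations.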
Abstract. Recently, deep learning (DL) has emerged as a revolutionary and versatile tool transforming industry applications and generating new and improved capabilities for scientific discovery and model building. The adoption of DL in hydrology has so far been gradual, but the field is now ripe for breakthroughs. This paper suggests that DL-based methods can open up a complementary avenue toward knowledge discovery in hydrologic sciences. In this new avenue, machine-learning algorithms present competing hypotheses that are consistent with data. Interrogative methods are then invoked to interpret DL models for scientists to further evaluate. However, hydrology presents many challenges for DL methods, such as data limitations, heterogeneity and co-evolution, and the general inexperience of the hydrologic field with DL. The roadmap toward DL-powered scientific advances will require a coordinated effort from a large community involving scientists and citizens. Integrating process-based models with DL models will help alleviate data limitations. The sharing of data and baseline models will improve the efficiency of the community as a whole. Open competitions could serve as organizing events to greatly propel growth and nurture data science education in hydrology, which demands grassroots collaboration. The area of hydrologic DL presents numerous research opportunities that could, in turn, stimulate advances in machine learning as well.
Efficient and high-fidelity prior sampling and inversion for complex geological media is still a largely unsolved challenge. Here, we use a deep neural network of the variational autoencoder type to construct a parametric low-dimensional base-model parameterization of complex binary geological media. For inversion purposes, it has the attractive feature that random draws from an uncorrelated standard normal distribution yield model realizations with spatial characteristics that are in agreement with the training set. In comparison with the most commonly used parametric representations in probabilistic inversion, we find that our dimensionality reduction (DR) approach outperforms principal component analysis (PCA), optimization-PCA (OPCA), and discrete cosine transform (DCT) DR techniques for unconditional geostatistical simulation of a channelized prior model. For the considered examples, substantial compression ratios (200–500) are achieved. Given that the construction of our parameterization requires a training set of several tens of thousands of prior model realizations, our DR approach is more suited for probabilistic (or deterministic) inversion than for unconditional (or point-conditioned) geostatistical simulation. Probabilistic inversions of 2D steady-state and 3D transient hydraulic tomography data are used to demonstrate the DR-based inversion. For the 2D case study, the performance is superior compared to current state-of-the-art multiple-point statistics inversion by sequential geostatistical resampling (SGR).
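The role of a learned low-dimensional parameterization can be illustrated with the simplest linear DR baseline the study compares against, PCA: a training set of realizations is compressed to a few components, and any code vector decodes back to a (thresholded) binary field. The striped stand-in "training images", grid size, and component count below are illustrative assumptions, not the paper's VAE or channelized prior.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical training set of binary "geologic" images: horizontally
# constant random stripes as a crude stand-in for channelized realizations.
n_train, nx = 500, 24
fields = (rng.normal(size=(n_train, nx, 1)) > 0).astype(float).repeat(nx, axis=2)
X = fields.reshape(n_train, -1)

# PCA via SVD of the centered data; keeping k components gives a
# compression ratio of nx*nx / k = 72 here.
mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 8

def decode(z):
    # Map a k-dimensional code back to a thresholded binary field.
    return ((z @ Vt[:k] + mu) > 0.5).astype(float).reshape(nx, nx)

# Encode and reconstruct one training image from its k-dimensional code.
z0 = (X[0] - mu) @ Vt[:k].T
rec = decode(z0)
err = np.abs(rec.ravel() - X[0]).mean()   # fraction of mismatched pixels
```

Linear PCA plus thresholding loses much of the binary structure at high compression, which is exactly the gap a nonlinear decoder such as a VAE is meant to close.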
We present a Bayesian inversion method for the joint inference of high-dimensional multi-Gaussian hydraulic conductivity fields and associated geostatistical parameters from indirect hydrological data. We combine Gaussian process generation via circulant embedding, to decouple the variogram from grid-cell-specific values, with dimensionality reduction by interpolation to enable Markov chain Monte Carlo (MCMC) simulation. Using the Matérn variogram model, this formulation allows inferring the conductivity values simultaneously with the field smoothness (also called the Matérn shape parameter) and other geostatistical parameters such as the mean, sill, integral scales, and anisotropy direction(s) and ratio(s). The proposed dimensionality reduction method systematically honors the underlying variogram and is demonstrated to achieve better performance than the Karhunen–Loève expansion. We illustrate our inversion approach using synthetic (error-corrupted) data from a tracer experiment in a fairly heterogeneous 10,000-dimensional 2-D conductivity field. A 40-fold reduction of the size of the parameter space did not prevent the posterior simulations from appropriately fitting the measurement data, nor the posterior parameter distributions from including the true geostatistical parameter values. Overall, the posterior field realizations covered a wide range of geostatistical models, questioning the common practice of assuming a fixed variogram prior to inference of the hydraulic conductivity values. Our method is shown to be more efficient than sequential Gibbs sampling (SGS) for the considered case study, particularly when implemented on a distributed computing cluster. It is also found to outperform the method of anchored distributions (MAD) for the same computational budget.
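Gaussian process generation via circulant embedding can be sketched in 1-D: the covariance vector is embedded in a circulant matrix whose eigenvalues are obtained by FFT, and fields are drawn by shaping complex white noise with the square-root eigenvalues. The exponential covariance and grid below are illustrative assumptions; the paper works in 2-D with the Matérn family, where the same FFT-based construction applies dimension by dimension.

```python
import numpy as np

rng = np.random.default_rng(4)

def circulant_embedding_1d(n, ell, sigma2=1.0):
    # Exponential covariance c(h) = sigma2 * exp(-h/ell) on a regular grid.
    h = np.arange(n)
    c = sigma2 * np.exp(-h / ell)
    # Embed the covariance in a circulant vector of length m = 2(n-1):
    # [c_0, ..., c_{n-1}, c_{n-2}, ..., c_1].
    c_circ = np.concatenate([c, c[-2:0:-1]])
    lam = np.fft.fft(c_circ).real        # circulant eigenvalues (real, >= 0)
    lam = np.maximum(lam, 0.0)           # guard against tiny round-off
    m = len(c_circ)
    # Shape complex white noise by sqrt eigenvalues; the real part of the
    # transform, restricted to the first n points, has covariance c.
    eps = rng.normal(size=m) + 1j * rng.normal(size=m)
    y = np.fft.fft(np.sqrt(lam) * eps) / np.sqrt(m)
    return y.real[:n]

field = circulant_embedding_1d(64, 10.0)
```

Because generation reduces to FFTs, the cost is O(n log n) per realization, and the variogram parameters enter only through the covariance vector, which is what lets them be varied freely during MCMC.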
Global probabilistic inversion within the latent space learned by a generative adversarial network (GAN) has recently been demonstrated. Compared to inversion on the original model space, using the latent space of a trained GAN can offer the following benefits: (1) the generated model proposals are geostatistically consistent with the prescribed prior training image (TI), and (2) the parameter space is reduced by orders of magnitude compared to the original model space. Nevertheless, exploring the learned latent space by state-of-the-art Markov chain Monte Carlo (MCMC) methods may still require a large computational effort. As an alternative, parameters in this latent space could possibly be optimized with much less computationally expensive gradient-based methods. This study shows that due to the typically highly nonlinear relationship between the latent space and the associated output space of a GAN, gradient-based deterministic inversion may fail even when considering a linear forward physical model. We tested two deterministic inversion approaches: gradient descent using the Adam algorithm and a Gauss-Newton (GN) method that makes use of the Jacobian matrix calculated by finite differencing. For a channelized binary TI and a synthetic linear crosshole ground penetrating radar (GPR) tomography problem involving 576 measurements with low noise, we observe that when allowing for a total of 10,000 iterations only 13% of the gradient descent trials locate a solution that has the required data misfit. The tested GN inversion was unable to recover a solution with the appropriate data misfit. Our results suggest that deterministic inversion performance strongly depends on the inversion approach, starting model, true reference model, number of iterations, and noise realization. In contrast, computationally expensive probabilistic global optimization based on differential evolution always finds an appropriate solution.
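The setting discussed above, gradient descent through a nonlinear generator under linear physics, can be sketched with stand-ins: a random linear operator in place of the GPR forward model and a tanh map in place of the GAN generator, with Adam updating the latent parameters. All dimensions and hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy setup: linear operator G applied to a nonlinear "generator" tanh(zW),
# mimicking linear tomography physics acting on a GAN-decoded model.
d, npix, nobs = 6, 64, 20
W = rng.normal(size=(d, npix)) / np.sqrt(d)
G = rng.normal(size=(nobs, npix)) / np.sqrt(npix)
z_true = rng.normal(size=d)
data = G @ np.tanh(z_true @ W)            # noise-free synthetic data

def misfit(z):
    r = G @ np.tanh(z @ W) - data
    return 0.5 * r @ r

def grad(z):
    # Chain rule through the tanh nonlinearity of the generator.
    m = np.tanh(z @ W)
    r = G @ m - data
    return ((G.T @ r) * (1.0 - m ** 2)) @ W.T

# Adam (Kingma & Ba) on the latent parameters from a random start.
z = rng.normal(size=d)
z0 = z.copy()
m1, m2 = np.zeros(d), np.zeros(d)
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
for t in range(1, 2001):
    g = grad(z)
    m1 = b1 * m1 + (1 - b1) * g
    m2 = b2 * m2 + (1 - b2) * g ** 2
    mhat = m1 / (1 - b1 ** t)
    vhat = m2 / (1 - b2 ** t)
    z = z - lr * mhat / (np.sqrt(vhat) + eps)
```

Even though `G` is linear, the objective is non-convex in `z` because of the tanh map, so the descent may stall in a local minimum depending on the starting point, which is the failure mode the study documents at larger scale.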
Electrical resistance tomography (ERT) can be used for the noninvasive characterization of soil moisture and soil structural heterogeneity. Any attempt to relate electrical resistivity measurements to soil moisture content or soil bulk density, however, must rely on a “pedo‐electrical” function, i.e., a conductivity model for soils. This study aimed to test five pedo‐electrical models for their ability to reproduce electrical resistivity as measured by ERT in a silt loam soil sample across a range of moisture and bulk density values. The Waxman and Smits model, the Revil model, the volume‐averaging (VA) model, the Rhoades model, and the Mojid model were inverted within a Bayesian framework, thereby identifying not only the optimal parameter set but also parameter uncertainty and its effect on model prediction. The VA model outperformed the other models in terms of both fit and parameter consistency with respect to independent estimates of surface conductivity obtained with published pedotransfer functions. Sensitivity of the electrical resistivity was then studied by means of the calibrated VA model, revealing an approximately 1.5 times higher sensitivity to soil moisture content than to soil bulk density. In addition, the sensitivity of electrical resistivity to soil moisture and soil bulk density was found to increase as soil moisture and bulk density decreased. The VA model calibrated on the basis of resistivity measurements appeared to simulate relatively well the measured soil moisture content for electrical resistivity values <100 Ω m. In contrast to water content, the soil porosity was poorly approximated by the model. It appears therefore that ERT is more suitable for detecting heterogeneity in soil water content than differences in soil bulk density.
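Bayesian calibration of a pedo-electrical function can be sketched with a deliberately simplified two-parameter conductivity model: a power law in water content plus a surface-conduction term. This toy model, its parameter values, and the synthetic data are illustrative assumptions, not one of the five models actually tested; only the Metropolis calibration machinery mirrors the study's framework.

```python
import numpy as np

rng = np.random.default_rng(6)

def sigma_bulk(theta, sw=0.05, n=2.0, sigma_s=0.002):
    # Hypothetical simplified pedo-electrical model: bulk conductivity as a
    # power law in water content theta plus a surface-conduction term.
    return sw * theta ** n + sigma_s

# Synthetic observations with 2% multiplicative noise.
theta_obs = np.linspace(0.1, 0.4, 15)
y_obs = sigma_bulk(theta_obs) * (1 + 0.02 * rng.normal(size=15))

def log_post(p):
    n, sigma_s = p
    if not (0.5 < n < 4.0 and 0.0 < sigma_s < 0.01):  # uniform priors
        return -np.inf
    r = (sigma_bulk(theta_obs, n=n, sigma_s=sigma_s) - y_obs) / (0.02 * y_obs)
    return -0.5 * np.sum(r ** 2)

# Random-walk Metropolis over the two unknown parameters (n, sigma_s).
p = np.array([1.5, 0.005])
lp = log_post(p)
chain = []
for _ in range(20000):
    pp = p + rng.normal(size=2) * [0.05, 2e-4]
    lpp = log_post(pp)
    if np.log(rng.random()) < lpp - lp:
        p, lp = pp, lpp
    chain.append(p.copy())
chain = np.array(chain[5000:])   # discard burn-in
```

The retained chain gives not just a best-fit parameter set but the full posterior spread, which is what allows the study to propagate parameter uncertainty into predicted resistivity.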