Gaussian process (GP) emulation is a data-driven method that substitutes a slow simulator with a stochastic approximation. The emulator is typically orders of magnitude faster than the simulator, at the cost of introducing interpolation errors. Our approach, the mechanism-based GP emulator, uses knowledge of the simulator mechanisms in addition to the information gained from previous simulator runs, the so-called design data. In this study, we investigate how the degree to which mechanisms are incorporated into the design of the GP emulator influences emulation accuracy. As in previous results, we obtain a significant gain in accuracy already when using the simplest approximation of the mechanisms, a single linear reservoir. In this case, however, we again considerably improve emulation accuracy when using the next two approximations. This allows us to decrease the number of design data required to achieve an accuracy similar to that of a non-mechanistic emulator.
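To make the idea of interpolating between design data concrete, the following is a minimal sketch of a plain (non-mechanistic) GP emulator in pure Python. It is not the authors' implementation: the stand-in function `slow_simulator`, the squared-exponential kernel, the length scale of 0.2 and the eight equally spaced design points are all assumptions chosen for illustration.

```python
import math

def sq_exp_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def slow_simulator(theta):
    """Hypothetical stand-in for an expensive simulator run."""
    return math.sin(3.0 * theta) + 0.5 * theta

def fit_emulator(design_inputs, design_outputs, jitter=1e-10):
    """Condition a zero-mean GP on the design data; return its posterior mean."""
    K = [[sq_exp_kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(design_inputs)]
         for i, a in enumerate(design_inputs)]
    weights = solve(K, design_outputs)

    def emulate(theta):
        return sum(w * sq_exp_kernel(theta, x)
                   for w, x in zip(weights, design_inputs))

    return emulate

# Design data: eight simulator runs at equally spaced parameter values.
design_inputs = [i / 7 for i in range(8)]
design_outputs = [slow_simulator(t) for t in design_inputs]
emulate = fit_emulator(design_inputs, design_outputs)
```

Calling `emulate(0.5)` then returns a cheap approximation of `slow_simulator(0.5)`. The emulator reproduces the design outputs at the design inputs and interpolates in between, which is where the interpolation errors mentioned above arise.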
Many simulation-intensive tasks in the applied sciences, such as sensitivity analysis, parameter inference or real-time control, are hampered by slow simulators. Emulators provide the opportunity of speeding up simulations at the cost of introducing some inaccuracy. An emulator is a fast approximation to a simulator that interpolates between design input-output pairs of the simulator. Increasing the number of design data sets is a computationally demanding way of improving the accuracy of emulation. We investigate the complementary approach of increasing emulation accuracy by including knowledge about the mechanisms of the simulator in the formulation of the emulator. To approximately reproduce the output of dynamic simulators, we consider emulators that are based on a system of linear, ordinary or partial stochastic differential equations with a noise term formulated as a Gaussian process of the parameters to be emulated. This stochastic model is then conditioned on the design data so that it mimics the behavior of the nonlinear simulator as a function of the parameters. The drift terms of the linear model are designed to provide a simplified description of the simulator as a function of its key parameters so that the required corrections by the conditioned Gaussian process noise are as small as possible. The goal of this paper is to compare the gains in accuracy obtained by enlarging the design data set and by varying the degree of simplification of the linear model. We apply this framework to a simulator for the shallow water equations in a channel and compare emulation accuracy for emulators based on different spatial discretization levels of the channel and for a standard non-mechanistic emulator. Our results indicate a large gain in accuracy already when using the simplest mechanistic description, a single linear reservoir, to formulate the drift term of the linear model. Adding a few more reservoirs does not lead to a significant improvement in accuracy. However, the transition to a spatially continuous linear model again leads to a gain in accuracy that is similarly large as that of the transition from the non-mechanistic emulator to the emulator based on one reservoir.
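As a rough illustration of the kind of simplified linear model that can serve as a drift term, the sketch below simulates a cascade of linear reservoirs, each obeying dS/dt = inflow - k·S with outflow k·S. This covers only the deterministic part: the Gaussian process noise term and the conditioning on design data are omitted. The rate constant `k`, the time step `dt`, the explicit Euler scheme and the function name `simulate_reservoir_cascade` are assumptions made for this example and are not taken from the paper.

```python
def simulate_reservoir_cascade(inflow, k, n_reservoirs=1, dt=0.1):
    """Explicit Euler simulation of n identical linear reservoirs in series.

    inflow       -- list of inflow values, one per time step
    k            -- outflow rate constant (outflow = k * storage); needs k * dt < 1
    n_reservoirs -- number of reservoirs in the cascade
    Returns the outflow of the last reservoir at each time step.
    """
    storages = [0.0] * n_reservoirs
    outflow = []
    for q_in in inflow:
        upstream = q_in
        for i in range(n_reservoirs):
            q_out = k * storages[i]                   # linear storage-outflow relation
            storages[i] += dt * (upstream - q_out)    # mass balance for reservoir i
            upstream = q_out                          # feeds the next reservoir
        outflow.append(upstream)
    return outflow

# Step response to a constant unit inflow.
inflow = [1.0] * 2000
out_single = simulate_reservoir_cascade(inflow, k=0.5, n_reservoirs=1)
out_triple = simulate_reservoir_cascade(inflow, k=0.5, n_reservoirs=3)
```

Both responses approach the inflow rate at steady state, while the three-reservoir cascade responds with more delay. This mimics, very crudely, how a finer discretization changes the description of flow routing through a channel.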
As in many fields of dynamic modeling, the long runtime of hydrological models hinders Bayesian inference of model parameters from data. By replacing a model with an approximation of its output as a function of input and/or parameters, emulation allows us to complete this task by trading off accuracy for speed. We combine (i) the use of a mechanistic emulator, (ii) low-discrepancy sampling of the parameter space, and (iii) iterative refinement of the design data set, to perform Bayesian inference with a very small design data set constructed with 128 model runs in a parameter space of up to eight dimensions. In our didactic example we use a model implemented with the hydrological simulator SWMM that allows us to compare our inference results against those derived with the full model. This comparison demonstrates that iterative improvements lead to reasonable results with a very small design data set.
Key points: • Mechanistic emulation • Design data point selection • Calibration of hydrological models
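The low-discrepancy sampling step can be sketched as follows. The abstract does not say which low-discrepancy sequence was used, so the Halton sequence here is an assumption, as are the unit-interval parameter bounds and the function names. The sketch generates 128 design points in eight dimensions, matching the design size quoted above.

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

# One prime base per dimension (supports up to eight dimensions).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def halton_design(n_points, bounds):
    """Return n_points Halton points scaled to the (low, high) bounds per dimension."""
    design = []
    for i in range(1, n_points + 1):
        point = [low + (high - low) * radical_inverse(i, PRIMES[d])
                 for d, (low, high) in enumerate(bounds)]
        design.append(point)
    return design

# Hypothetical parameter ranges: eight parameters, each scaled to [0, 1).
bounds = [(0.0, 1.0)] * 8
design = halton_design(128, bounds)
```

Each row of `design` is one parameter vector at which the full model would be run once. Compared with independent random draws, such sequences cover the parameter space more evenly, which matters when only a small number of model runs is affordable.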