Many computer models or simulators have probabilistic dependencies between their input variables, which if not accounted for during design selection may result in a large number of simulator runs being required for analysis. We propose a method which incorporates known dependencies between input variables into design selection for simulators and demonstrate the benefits of this approach via a simulator for atmospheric dispersion. We quantify the benefit of the new techniques over standard space-filling designs and Monte Carlo simulation. The proposed methods are adaptations of computer-generated spread and coverage space-filling designs, with "distance" between two input points redefined to include a weight function. This weight function reflects any known multivariate dependencies between input variables and prior information on the design region. The methods can include quantitative and qualitative variables, and different types of prior information. Novel graphical methods, adapted from fraction of design space plots, are used to assess and compare the designs.
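The idea of redefining "distance" with a weight function can be sketched as follows. This is a minimal illustration, not the paper's exact construction: it assumes a greedy maximin (spread) selection in which the Euclidean distance between two candidate points is scaled by a hypothetical weight function standing in for the known joint density of the inputs, so that low-probability regions appear "closer" and receive fewer design points.

```python
import numpy as np

rng = np.random.default_rng(1)

def weight(x):
    # Hypothetical weight function: an (unnormalised) standard bivariate
    # normal density, standing in for the known joint input distribution.
    return np.exp(-0.5 * np.sum(x**2, axis=-1))

def weighted_maximin(candidates, n):
    # Greedy maximin spread design: repeatedly add the candidate whose
    # weighted distance to the current design is largest. The distance
    # between two points is scaled by the weights of both endpoints.
    design = [candidates[np.argmax(weight(candidates))]]
    for _ in range(n - 1):
        dists = np.min(
            [np.linalg.norm(candidates - p, axis=1) * weight(candidates) * weight(p)
             for p in design],
            axis=0,
        )
        design.append(candidates[np.argmax(dists)])
    return np.array(design)

candidates = rng.uniform(-3, 3, size=(500, 2))
design = weighted_maximin(candidates, 20)
print(design.shape)  # (20, 2)
```

With a constant weight this reduces to an ordinary maximin space-filling design; the weight is what concentrates design points where the dependent inputs are jointly probable.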
Scientific advice to the UK government throughout the COVID-19 pandemic has been informed by ensembles of epidemiological models provided by members of the Scientific Pandemic Influenza group on Modelling. Among other applications, the model ensembles have been used to forecast daily incidence, deaths and hospitalizations. The models differ in approach (e.g. deterministic or agent-based) and in assumptions made about the disease and population. These differences capture genuine uncertainty in the understanding of disease dynamics and in the choice of simplifying assumptions underpinning the model. Although analyses of multi-model ensembles can be logistically challenging when time-frames are short, accounting for structural uncertainty can improve accuracy and reduce the risk of over-confidence in predictions. In this study, we compare the performance of various ensemble methods to combine short-term (14-day) COVID-19 forecasts within the context of the pandemic response. We address practical issues around the availability of model predictions and make some initial proposals to address the shortcomings of standard methods in this challenging situation.
The effective reproduction number R was widely accepted as a key indicator during the early stages of the COVID-19 pandemic. In the UK, the R value published on the UK Government Dashboard has been generated as a combined value from an ensemble of fourteen epidemiological models via a collaborative initiative between academia and government. In this paper we outline this collaborative modelling approach and illustrate how, by using an established combination method, a combined R estimate can be generated from an ensemble of epidemiological models. We show that this R is robust to different model weighting methods and ensemble size and that using heterogeneous data sources for validation increases its robustness and reduces the biases and limitations associated with a single source of data. We discuss how R can be generated from different data sources and is therefore a good summary indicator of the current dynamics in an epidemic.
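One simple established combination method of the kind described is an equal-weight linear opinion pool: each model contributes samples from its own distribution for R, and the pooled samples are summarised by a median and interval. The sketch below uses simulated per-model samples purely for illustration; the real ensemble, its fourteen models, and its weighting schemes are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-model estimates: each model supplies samples from its
# own distribution for R (simulated here as normals with model-specific
# means and standard deviations).
model_samples = [rng.normal(mu, sd, size=1000)
                 for mu, sd in [(0.9, 0.05), (1.0, 0.10), (0.95, 0.08)]]

# Equal-weight linear opinion pool: concatenate the samples (equal weights
# correspond to equal sample counts), then summarise the mixture.
pooled = np.concatenate(model_samples)
lo, med, hi = np.quantile(pooled, [0.05, 0.5, 0.95])
print(f"combined R: {med:.2f} (90% interval {lo:.2f} to {hi:.2f})")
```

Unequal model weights would simply change the number (or weighting) of samples each model contributes to the pool.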
It is often desirable to build a statistical emulator of a complex computer simulator in order to perform analysis which would otherwise be computationally infeasible. We propose methodology to model multivariate output from a computer simulator taking into account output structure in the responses. The utility of this approach is demonstrated by applying it to a chemical and biological hazard prediction model. Predicting the hazard area which results from an accidental or deliberate chemical or biological release is imperative in civil and military planning and also in emergency response. The hazard area resulting from such a release is highly structured in space, and we therefore propose the use of a thin-plate spline to capture the spatial structure and fit a Gaussian process emulator to the coefficients of the resultant basis functions. We compare and contrast four different techniques for emulating multivariate output: dimension reduction using (i) a fully Bayesian approach with a principal component basis, (ii) a fully Bayesian approach with a thin-plate spline basis, assuming that the basis coefficients are independent, and (iii) a "plug-in" Bayesian approach with a thin-plate spline basis and a separable covariance structure; and (iv) a functional data modeling approach using a tensor-product (separable) Gaussian process. We develop methodology for the two thin-plate spline emulators and demonstrate that these emulators significantly outperform the principal component emulator. Further, the separable thin-plate spline emulator, which accounts for the dependence between basis coefficients, provides substantially more realistic quantification of uncertainty, and is also computationally more tractable, allowing fast emulation. For high resolution output data, it also offers substantial predictive and computational advantages over the tensor-product Gaussian process emulator.
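The basis-coefficient emulation strategy common to approaches (i)-(iii) can be sketched in miniature: project the multivariate output onto a low-dimensional basis, then emulate the coefficients with Gaussian processes. This toy example uses a principal-component basis via SVD (standing in for the thin-plate spline basis), independent GPs with a fixed squared-exponential kernel, and an invented one-input simulator producing a spatial field; none of these stand for the paper's actual model or hyperparameter choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "simulator": scalar input -> spatial field on a 50-point grid.
grid = np.linspace(0.0, 1.0, 50)
def simulator(x):
    return np.sin(2 * np.pi * grid * (1 + x)) * x

X = rng.uniform(0.2, 1.0, size=(30, 1))          # training inputs
Y = np.array([simulator(x[0]) for x in X])       # (30, 50) multivariate output

# Step 1: dimension reduction onto a k-dimensional basis (PCA via SVD here,
# in place of the thin-plate spline basis).
Y_mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
k = 3
basis = Vt[:k]                                    # (k, 50) basis functions
coeffs = (Y - Y_mean) @ basis.T                   # (30, k) coefficients

# Step 2: independent GP per coefficient, squared-exponential kernel with
# fixed (untuned) hyperparameters, small jitter for numerical stability.
def rbf(A, B, ell=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

K = rbf(X, X) + 1e-8 * np.eye(len(X))
alpha = np.linalg.solve(K, coeffs)                # (30, k) GP weights

def emulate(x_new):
    k_star = rbf(np.atleast_2d(x_new), X)         # (1, 30)
    c = k_star @ alpha                            # predicted coefficients
    return Y_mean + c @ basis                     # reconstructed field, (1, 50)

pred = emulate(np.array([0.5]))
truth = simulator(0.5)
print(np.max(np.abs(pred - truth)))
```

Treating the coefficients as independent corresponds to approach (ii); the paper's separable-covariance emulator (iii) instead models the dependence between coefficients jointly.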