Nonlinear stochastic dynamical systems are widely used to model systems across the sciences and engineering. Such models are natural to formulate and can be analyzed mathematically and numerically. However, difficulties associated with inference from time-series data about unknown parameters in these models have been a constraint on their application. We present a new method that makes maximum likelihood estimation feasible for partially observed nonlinear stochastic dynamical systems (also known as state-space models) where this was not previously the case. The method is based on a sequence of filtering operations which are shown to converge to a maximum likelihood parameter estimate. We make use of recent advances in nonlinear filtering in the implementation of the algorithm. We apply the method to the study of cholera in Bangladesh. We construct confidence intervals, perform residual analysis, and apply other diagnostics. Our analysis, based upon a model capturing the intrinsic nonlinear dynamics of the system, reveals some effects overlooked by previous studies.
maximum likelihood | cholera | time series
State space models have applications in many areas, including signal processing (1), economics (2), cell biology (3), meteorology (4), ecology (5), neuroscience (6), and various others (7-9). Formally, a state space model is a partially observed Markov process. Real-world phenomena are often well modeled as Markov processes, constructed according to physical, chemical, or economic principles, about which one can make only noisy or incomplete observations. It has been noted repeatedly (1, 10) that estimating parameters for state space models is simplest if the parameters are time-varying random variables that can be included in the state space. Estimation of parameters then becomes a matter of reconstructing unobserved random variables, and inference may proceed by using standard techniques for filtering and smoothing.
This approach is of limited value if the true parameters are thought not to vary with time, or to vary as a function of measured covariates rather than as random variables. A major motivation for this work has been the observation that the particle filter (9-13) is a conceptually simple, flexible, and effective filtering technique for which the only major drawback was the lack of a readily applicable technique for likelihood maximization in the case of time-constant parameters. The contribution of this work is to show how time-varying parameter algorithms may be harnessed for use in inference in the fixed-parameter case. The key result, Theorem 1, shows that an appropriate limit of time-varying parameter models can be used to locate a maximum of the fixed-parameter likelihood. This result is then used as the basis for a procedure for finding maximum likelihood estimates for previously intractable models. We use the method to further our understanding of the mechanisms of cholera transmission. Cholera is a disease endemic to India and Bangladesh that has recently become reestablished in Africa, s...
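The idea of using time-varying parameter filters to estimate fixed parameters can be illustrated with a small sketch. The code below is not the update rule of Theorem 1, which the excerpt does not reproduce. It is a simpler perturbed-parameter bootstrap particle filter with a shrinking perturbation scale, applied to a hypothetical toy model (an AR(1) state with coefficient `a`, unit process noise, and unit observation noise). All function names, the toy model, and the tuning values (`sigma0`, `cooling`, particle counts) are assumptions made for illustration.

```python
import numpy as np

def simulate(a, n, rng):
    """Simulate the hypothetical toy model: AR(1) state seen through Gaussian noise."""
    x = 0.0
    ys = np.empty(n)
    for t in range(n):
        x = a * x + rng.normal()      # latent state update
        ys[t] = x + rng.normal()      # noisy observation
    return ys

def perturbed_filter(ys, a_particles, sigma, rng):
    """One bootstrap particle filter pass in which every particle carries its own
    copy of the parameter `a`, perturbed by a random walk of scale `sigma`."""
    n_particles = a_particles.size
    x = np.zeros(n_particles)
    a = a_particles.copy()
    loglik = 0.0
    for y in ys:
        a = a + sigma * rng.normal(size=n_particles)   # time-varying parameter
        x = a * x + rng.normal(size=n_particles)       # propagate the state
        logw = -0.5 * (y - x) ** 2 - 0.5 * np.log(2 * np.pi)
        m = logw.max()
        w = np.exp(logw - m)                           # stabilised weights
        loglik += m + np.log(w.mean())                 # likelihood of the perturbed model
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x, a = x[idx], a[idx]                          # resample state and parameter together
    return a, loglik

def iterate(ys, a_init, n_particles=500, n_iter=20, sigma0=0.1, cooling=0.8, seed=1):
    """Repeat the filter while shrinking the perturbation scale, so the
    time-varying parameter model approaches the fixed-parameter model."""
    rng = np.random.default_rng(seed)
    a = np.full(n_particles, a_init, dtype=float)
    for k in range(n_iter):
        a, _ = perturbed_filter(ys, a, sigma0 * cooling ** k, rng)
    return a.mean()
```

A call such as `iterate(simulate(0.8, 200, np.random.default_rng(0)), a_init=0.2)` returns a point estimate of `a`. The result is itself a Monte Carlo quantity, so it varies with the seed and the tuning values.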
The purpose of time series analysis via mechanistic models is to reconcile the known or hypothesized structure of a dynamical system with observations collected over time. We develop a framework for constructing nonlinear mechanistic models and carrying out inference. Our framework permits the consideration of implicit dynamic models, meaning statistical models for stochastic dynamical systems which are specified by a simulation algorithm to generate sample paths. Inference procedures that operate on implicit models are said to have the plug-and-play property. Our work builds on recently developed plug-and-play inference methodology for partially observed Markov models. We introduce a class of implicitly specified Markov chains with stochastic transition rates, and we demonstrate its applicability to open problems in statistical inference for biological systems. As one example, these models are shown to give a fresh perspective on measles transmission dynamics. As a second example, we present a mechanistic analysis of cholera incidence data, involving interaction between two competing strains of the pathogen Vibrio cholerae.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS201 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
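An implicit model in this sense needs only a simulator. As a minimal sketch of a Markov chain with a stochastic transition rate, the function below draws the number of individuals leaving one compartment during a short time step when the per-capita rate is multiplied by gamma-distributed noise. The gamma-increment discretization, the function name, and all parameter names are assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def noisy_rate_step(n_individuals, rate, sigma, dt, rng):
    """Draw the number of individuals leaving a compartment over a step of length dt.

    Assumed noise model: the elapsed time dt is replaced by a gamma increment
    with mean dt and variance sigma**2 * dt, which makes the rate stochastic.
    """
    d_gamma = rng.gamma(shape=dt / sigma**2, scale=sigma**2)
    p_leave = 1.0 - np.exp(-rate * d_gamma)      # exit probability given the noisy rate
    return rng.binomial(n_individuals, p_leave)  # individuals exit independently

# Usage: simulate a sample path of a pure death process (hypothetical settings).
rng = np.random.default_rng(42)
population = 1000
path = [population]
for _ in range(50):
    population -= noisy_rate_step(population, rate=1.0, sigma=0.5, dt=0.1, rng=rng)
    path.append(population)
```

Because only `noisy_rate_step` is needed to generate sample paths, a plug-and-play method can work with this model without an explicit expression for its transition density.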
We propose an infinitesimal dispersion index for Markov counting processes. We show that, under standard moment existence conditions, a process is infinitesimally (over-) equi-dispersed if, and only if, it is simple (compound), i.e. it increases in jumps of one (or more) unit(s), even though infinitesimally equi-dispersed processes might be under-, equi- or over-dispersed using previously studied indices. Compound processes arise, for example, when introducing continuous-time white noise to the rates of simple processes resulting in Lévy-driven SDEs. We construct multivariate infinitesimally over-dispersed compartment models and queuing networks, suitable for applications where moment constraints inherent to simple processes do not hold.
... the data typically having additional variance and therefore being termed over-dispersed [26]. The same issues arise in integer-valued stochastic processes [6] and, as a result, there is a considerable literature devoted to extending otherwise appealing models which are unable to reproduce observed variability. Typically, over-dispersion has been studied via defining stochastic processes in which some parameters are themselves modeled as stochastic in order to produce additional variability. This idea has been widely applied since the pioneering work of Greenwood and Yule [14], which derived the over-dispersed negative binomial distribution as a mixture of the Poisson distribution with a gamma-distributed parameter. Another early contribution is the Cox process [7], also known as doubly-stochastic Poisson process [8,30,9]. Some recent work has considered stochastic parameters for continuous-time Markov chains [10] and for non-Markovian processes [32]. Marion and Renshaw [24] and Varughese and Fatti [33] studied over-dispersion generated by standard birth-death processes with diffusion-driven rates, focusing on population dynamics applications.
Both [24] and [33] proposed a mean-reverting Ornstein-Uhlenbeck process for the driving random environment. Compound counting processes have been studied in the literature on batch processes [28], but we are not aware of a previous investigation of infinitesimal dispersion in this context. To our knowledge, the first general class of infinitesimally over-dispersed MCPs was proposed by Bretó et al. [5]. They achieved over-dispersion by introducing white noise to rates of a multivariate process constructed via simple death processes, which was shown to result in the possibility of simultaneous events. The main goal of this paper is to generalize the model of [5] by presenting a systematic investigation of over-dispersed models via compound MCPs. In particular, we consider those defined by Lévy-driven stochastic differential equations [2] resulting from introducing continuous-time white noise in the rate of simple MCPs via Kolmogorov's differential equations. The applications of MCPs are too diverse to cover systematically here. One concrete example, which has been a motivation for our work [5], is the study of infectious disease dynamics. Discrete-state Mark...
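The Greenwood and Yule construction mentioned above can be checked numerically. The sketch below draws Poisson counts whose mean is itself gamma-distributed and compares the classical variance-to-mean ratio with that of a plain Poisson sample. This ratio is the familiar dispersion index for counts, not the infinitesimal dispersion index proposed in the paper; the function names and the parameter values are assumptions for illustration.

```python
import numpy as np

def poisson_gamma_mixture(mean, shape, size, rng):
    """Poisson counts with a gamma-distributed rate (negative binomial marginal)."""
    lam = rng.gamma(shape=shape, scale=mean / shape, size=size)  # E[lam] = mean
    return rng.poisson(lam)

def dispersion_index(counts):
    """Classical variance-to-mean ratio: 1 for Poisson, above 1 if over-dispersed."""
    return counts.var() / counts.mean()

rng = np.random.default_rng(0)
mixed = poisson_gamma_mixture(mean=5.0, shape=2.0, size=100_000, rng=rng)
plain = rng.poisson(5.0, size=100_000)

# Theory for the mixture: variance = mean + mean**2 / shape = 5 + 12.5 = 17.5,
# so the ratio should be near 3.5, while the plain Poisson ratio is near 1.
print(dispersion_index(mixed), dispersion_index(plain))
```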
Monte Carlo methods to evaluate and maximize the likelihood function enable the construction of confidence intervals and hypothesis tests, facilitating scientific investigation using models for which the likelihood function is intractable. When Monte Carlo error can be made small, by sufficiently exhaustive computation, then the standard theory and practice of likelihood-based inference applies. As datasets become larger, and models more complex, situations arise where no reasonable amount of computation can render Monte Carlo error negligible. We develop profile likelihood methodology to provide frequentist inferences that take into account Monte Carlo uncertainty. We investigate the role of this methodology in facilitating inference for computationally challenging dynamic latent variable models. We present examples arising in the study of infectious disease transmission, demonstrating our methodology for inference on nonlinear dynamic models using genetic sequence data and panel time-series data. We also discuss applicability to nonlinear time-series and spatio-temporal data.
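To make the setting concrete, the following sketch fits a quadratic to noisy profile log-likelihood evaluations and reads off a point estimate and an interval. It uses the standard cutoff of 1.92 (half the 0.95 quantile of a chi-square distribution with one degree of freedom) and does not implement the adjustment for Monte Carlo uncertainty that the paper develops. The function name, the simulated profile, and the noise level are assumptions for illustration only.

```python
import numpy as np

def quadratic_profile_ci(theta, loglik, cutoff=1.92):
    """Fit loglik ~ a*theta**2 + b*theta + c and return (mle, lower, upper).

    The interval contains the values of theta at which the fitted quadratic
    lies within `cutoff` log-likelihood units of its maximum.
    """
    a, b, c = np.polyfit(theta, loglik, deg=2)
    if a >= 0:
        raise ValueError("fitted profile is not concave")
    mle = -b / (2 * a)                    # vertex of the parabola
    half_width = np.sqrt(cutoff / -a)     # solves a * (theta - mle)**2 = -cutoff
    return mle, mle - half_width, mle + half_width

# Hypothetical noisy profile: true curve -2 * (theta - 2)**2 - 100 plus Monte Carlo error.
rng = np.random.default_rng(0)
theta = np.linspace(1.0, 3.0, 41)
loglik = -2.0 * (theta - 2.0) ** 2 - 100.0 + 0.3 * rng.normal(size=theta.size)

mle, lower, upper = quadratic_profile_ci(theta, loglik)
```

Smoothing across many noisy evaluations is what lets the profile be used at all when a single evaluation is too noisy; widening the cutoff to reflect the remaining Monte Carlo error is the part this sketch leaves out.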
Likelihood-based statistical inference has been considered in most scientific fields involving stochastic modeling. This includes infectious disease dynamics, where scientific understanding can help capture biological processes in so-called mechanistic models and their likelihood functions. However, when the likelihood of such mechanistic models lacks a closed-form expression, computational burdens are substantial. In this context, algorithmic advances have facilitated likelihood maximization, promoting the study of novel data-motivated mechanistic models over the last decade. Reviewing these models is the focus of this paper. In particular, we highlight statistical aspects of these models like overdispersion, which is key in the interface between nonlinear infectious disease modeling and data analysis. We also point out potential directions for further model exploration.
Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model specification, we are motivated to develop a framework for inference on panel data permitting the consideration of arbitrary nonlinear, partially observed panel models. We build on iterated filtering techniques that provide likelihood-based inference on nonlinear partially observed Markov process models for time series data. Our methodology depends on the latent Markov process only through simulation; this plug-and-play property ensures applicability to a large class of models. We demonstrate our methodology on a toy example and two epidemiological case studies. We address inferential and computational issues arising due to the combination of model complexity and dataset size.
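Because the units are modeled as independent, the panel log-likelihood is the sum of the unit log-likelihoods, with shared parameters entering every term and unit-specific parameters entering only their own. The sketch below shows that structure with a deliberately simple stand-in: an exact Gaussian log-likelihood per unit in place of the per-unit filter a partially observed model would need. The function names, the choice of a shared mean with unit-specific standard deviations, and the data are assumptions for illustration.

```python
import numpy as np

def unit_loglik(y, shared_mean, unit_sd):
    """Gaussian log-likelihood of one unit's series (stand-in for a per-unit filter)."""
    resid = (np.asarray(y, dtype=float) - shared_mean) / unit_sd
    return float(np.sum(-0.5 * resid**2 - np.log(unit_sd) - 0.5 * np.log(2 * np.pi)))

def panel_loglik(panel, shared_mean, unit_sds):
    """Sum of unit log-likelihoods: shared_mean is common, unit_sds[u] is unit-specific."""
    return sum(unit_loglik(y, shared_mean, unit_sds[u]) for u, y in panel.items())

# Hypothetical panel with two units of different lengths.
panel = {"unit_a": [0.1, -0.3, 0.2], "unit_b": [1.5, -2.0]}
unit_sds = {"unit_a": 0.5, "unit_b": 2.0}
total = panel_loglik(panel, shared_mean=0.0, unit_sds=unit_sds)
```

The additive structure also means the per-unit terms can be computed separately, which is one way the computational load of a large panel can be spread out.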
We model leverage as stochastic but independent of return shocks and of volatility and perform likelihood-based inference via the recently developed iterated filtering algorithm using S&P500 data, contributing new evidence to the still slim empirical support for random leverage variation.
Disease dynamics are modeled at a population level in order to create a conceptual framework to think about the spread and prevention of disease, to make forecasts and policy decisions, and to ask and answer scientific questions concerning disease mechanisms such as discovering relevant covariates. Population models draw on scientific understanding of component processes, such as immunity, duration of infection, and mechanisms of transmission, and investigate how this understanding