This paper deals with the problem of quantifying the impact of model misspecification when computing general expected values of interest. The methodology that we propose is applicable in great generality; in particular, we provide examples involving path-dependent expectations of stochastic processes. Our approach consists of computing bounds for the expectation of interest that hold regardless of the probability measure used, as long as the measure lies within a prescribed tolerance, measured in terms of a flexible class of distances from a suitable baseline model. These distances, based on optimal transportation between probability measures, include the Wasserstein distances as particular cases. The proposed methodology is well suited for risk analysis, as we demonstrate with a number of applications. We also discuss how to estimate the tolerance region nonparametrically using Skorokhod-type embeddings in some of these applications.
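A minimal numerical illustration of the kind of bound involved: for the 1-Wasserstein distance and a 1-Lipschitz function f, Kantorovich–Rubinstein duality gives |E_Q f − E_P f| ≤ W₁(P, Q), so the baseline expectation plus the tolerance radius bounds the expectation under any model in the ball. This sketch (with made-up Gaussian samples) only illustrates that one inequality, not the paper's methodology, which handles general transport costs and path-dependent expectations.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
f = np.abs                               # a 1-Lipschitz loss, for illustration

p0 = rng.normal(0.0, 1.0, size=2000)     # samples from a baseline model
q = rng.normal(0.3, 1.0, size=2000)      # samples from an alternative model

delta = wasserstein_distance(p0, q)      # W1 between the two empirical laws

# Kantorovich-Rubinstein: |E_Q f - E_P0 f| <= Lip(f) * W1(P0, Q), so every
# model within W1-distance delta of the baseline obeys this upper bound:
upper = f(p0).mean() + 1.0 * delta
assert f(q).mean() <= upper + 1e-12
```

The inequality holds exactly between the two empirical distributions, which is why the assertion is deterministic rather than statistical.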
Let (X_n : n ≥ 1) be a sequence of i.i.d. random variables with negative mean. Set S_0 = 0 and define S_n = X_1 + · · · + X_n. We propose an importance sampling algorithm to estimate the tail of M = max{S_n : n ≥ 0} that is strongly efficient for both light- and heavy-tailed increment distributions. Moreover, in the case of heavy-tailed increments and under additional technical assumptions, our estimator can be shown to have asymptotically vanishing relative variance, in the sense that its coefficient of variation vanishes as the tail parameter increases. A key feature of our algorithm is that it is state-dependent. In the presence of light tails, our procedure reduces to Siegmund's (1979) algorithm. The rigorous analysis of efficiency requires new Lyapunov-type inequalities that can be useful in the study of more general importance sampling algorithms.

We say that an unbiased simulation estimator R(b) of P(M > b) is strongly efficient if ER(b)² = O(P(M > b)²) as b → ∞. Strong efficiency implies that the number of simulation runs required to estimate P(M > b) to a given relative accuracy is bounded in b. A weaker criterion is logarithmic efficiency, which implies that the number of replications required to estimate P(M > b) with a given relative accuracy grows at rate o(|log P(M > b)|); see Asmussen and Glynn (2007), Juneja and Shahabuddin (2006), or Bucklew (2004), Section 5.2, for a discussion of efficiency in rare-event simulation. A strongly efficient estimator is said to exhibit asymptotically vanishing relative error when ER(b)² ∼ P(M > b)² as b → ∞ (or, equivalently, when the coefficient of variation vanishes as b → ∞).

In this paper we develop an implementable state-dependent importance sampling algorithm that can be rigorously proved to possess asymptotically vanishing relative error. By "state-dependent," we mean that the importance sampling algorithm generates the next increment of the random walk from a distribution that depends on the walk's current state (i.e., location).
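For light-tailed increments, the exponentially tilted estimator in the spirit of Siegmund (1979) mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's state-dependent algorithm: it assumes Gaussian increments N(μ, σ²) with μ < 0, for which the conjugate root θ* of E[exp(θX)] = 1 is available in closed form (θ* = −2μ/σ²).

```python
import numpy as np

def siegmund_estimate(b, mu=-1.0, sigma=1.0, n_rep=10_000, seed=0):
    """Estimate P(M > b), M = max of a random walk with N(mu, sigma^2)
    increments (mu < 0), by simulating under the exponentially tilted
    measure with positive drift until level b is crossed."""
    rng = np.random.default_rng(seed)
    theta = -2.0 * mu / sigma**2          # root of E[exp(theta * X)] = 1
    est = np.empty(n_rep)
    for k in range(n_rep):
        s = 0.0
        while s <= b:                     # tilted walk has drift -mu > 0,
            s += rng.normal(-mu, sigma)   # so level b is hit w.p. 1
        est[k] = np.exp(-theta * s)       # likelihood ratio at crossing
    return est.mean()

p_hat = siegmund_estimate(b=5.0)
```

Each replicate's likelihood ratio is at most exp(−θ*b), which is why the relative error of this estimator stays bounded in b in the light-tailed case.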
This is the first strongly efficient algorithm that has been developed for estimating the tail of M in the presence of general heavy-tailed increment distributions. Prior efficient algorithms require the increment distribution to be of M/G/1 type with regularly varying or Weibull-type right tails.

A key idea is that our importance distribution is state-dependent. There is a long history of applications of state-dependent importance sampling to simulation problems. Perhaps the first related contributions are those by Hammersley and Morton (1954) and Rosenbluth and Rosenbluth (1955) in the context of molecular simulation; see also the text by Liu (2001) for applications of sequential importance sampling in various scientific contexts. However, a general framework for the rigorous analysis of these types of algorithms is still under development. In a sequence of recent papers, Paul Dupuis and Hui Wang [see, e.g., Dupuis and Wang (2004)] have proposed a general methodology that can be applied in the presence of large deviations theory for light-tailed systems. Our paper contributes to this general literature by developing Lyapunov-type techniques for the rigorous analysis of state-dependent importance sampling in the heavy-tailed setting.
We show that several machine learning estimators, including the square-root LASSO (Least Absolute Shrinkage and Selection Operator) and regularized logistic regression, can be represented as solutions to distributionally robust optimization (DRO) problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as a result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (Robust Wasserstein Profile Inference), a novel inference methodology which extends the use of methods inspired by empirical likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of the uncertainty regions and, as a consequence, we are able to choose the regularization parameters for these machine learning estimators without the use of cross-validation. Numerical experiments are also given to validate our theoretical findings.

(A1) Management Science and Engineering, Stanford University
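As a concrete reference point, the square-root LASSO objective whose DRO representation is discussed above can be written down directly. The sketch below, with made-up data and a hand-picked λ, minimizes √(mean squared residual) + λ‖β‖₁ numerically; it only illustrates the objective being regularized, not the RWPI procedure for selecting λ.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, d = 50, 3
X = rng.normal(size=(n, d))
beta_true = np.array([1.5, 0.0, -2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam = 0.1  # regularization strength (illustrative; RWPI would select this)

def sqrt_lasso(beta):
    # square root of the empirical mean squared error plus an L1 penalty
    return np.sqrt(np.mean((y - X @ beta) ** 2)) + lam * np.abs(beta).sum()

res = minimize(sqrt_lasso, np.zeros(d), method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 5000})
beta_hat = res.x
```

Unlike the plain LASSO, the square root on the loss makes the optimal λ scale-free in the noise level, which is one reason this estimator admits a clean DRO interpretation.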
We propose a novel framework for analyzing convergence rates of stochastic optimization algorithms with adaptive step sizes. This framework is based on analyzing properties of an underlying generic stochastic process, in particular by deriving a bound on the expected stopping time of this process. We utilize this framework to analyze the bounds on expected global convergence rates of a stochastic variant of a traditional trust region method, introduced in [8]. While traditional trust region methods rely on exact computations of the gradient, Hessian and values of the objective function, this method assumes that these values are available up to some dynamically adjusted accuracy. Moreover, this accuracy is assumed to hold only with some sufficiently large, but fixed, probability, without any additional restrictions on the variance of the errors. This setting applies, for example, to standard stochastic optimization and machine learning formulations. Improving upon the analysis in [8], we show that the stochastic process defined by the algorithm satisfies the assumptions of our proposed general framework, with the stopping time defined as reaching accuracy ‖∇f(x)‖ ≤ ε. The resulting bound for this stopping time is O(ε⁻²), under the assumption of sufficiently accurate stochastic gradients, and is the first global complexity bound for a stochastic trust-region method. Finally, we apply the same framework to derive a second-order complexity bound under some additional assumptions.
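A bare-bones version of the dynamic-accuracy idea can be illustrated as follows. This is a toy sketch, not the algorithm of [8]: the objective is a known quadratic, the "stochastic" gradient is the true gradient plus Gaussian noise, and acceptance uses exact function values, whereas the analyzed method only assumes probabilistically accurate models.

```python
import numpy as np

def f(x):
    return float(x @ x)                 # toy smooth objective, minimum at 0

def noisy_grad(x, rng, sigma=0.01):
    return 2.0 * x + sigma * rng.normal(size=x.shape)

def stochastic_trust_region(x0, eps=1e-3, delta=1.0, max_iter=10_000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = noisy_grad(x, rng)
        if np.linalg.norm(g) <= eps:            # stopping time: ||g|| <= eps
            break
        step = -delta * g / np.linalg.norm(g)   # Cauchy-type step
        pred = delta * np.linalg.norm(g)        # model-predicted decrease
        if f(x) - f(x + step) >= 0.1 * pred:    # sufficient actual decrease?
            x = x + step
            delta = min(2.0 * delta, 10.0)      # successful: expand radius
        else:
            delta *= 0.5                        # unsuccessful: shrink radius
    return x

x_final = stochastic_trust_region(np.array([2.0, 1.0]))
```

The trust-region radius plays the role of the adaptive step size in the framework above: it expands on successful iterations and contracts otherwise, and the analysis bounds the expected number of iterations before the stopping criterion fires.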
Our focus is on the design and analysis of efficient Monte Carlo methods for computing tail probabilities for the suprema of Gaussian random fields, along with conditional expectations of functionals of the fields given the existence of excursions above high levels b. Naïve Monte Carlo takes an exponential (in b) computational cost to estimate these probabilities and conditional expectations to a prescribed relative accuracy. In contrast, our Monte Carlo procedures achieve, at worst, polynomial complexity in b, assuming only that the mean and covariance functions are Hölder continuous. We also explain how to fine-tune the construction of our procedures in the presence of additional regularity, such as homogeneity and smoothness, in order to further improve the efficiency.

Published in the Annals of Applied Probability (http://dx.doi.org/10.1214/11-AAP792) by the Institute of Mathematical Statistics (http://www.imstat.org/aap/).
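To see why naïve Monte Carlo is expensive here, consider the simplest one-dimensional example: estimating P(sup_{0≤t≤1} B(t) > b) for standard Brownian motion by discretizing the path on a grid. The sketch below is that naïve baseline, which the paper's procedures are designed to beat, not the proposed algorithm; the grid size and replication count are arbitrary illustrative choices.

```python
import numpy as np

def naive_mc_sup(b, n_steps=200, n_rep=20_000, seed=0):
    """Naive estimate of P(sup_{[0,1]} B(t) > b) for Brownian motion B,
    using the running maximum over an equally spaced time grid."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # simulate n_rep paths at once: cumulative sums of N(0, dt) increments
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_rep, n_steps))
    paths = np.cumsum(incr, axis=1)
    return float(np.mean(paths.max(axis=1) > b))

p_hat = naive_mc_sup(b=1.0)
# By the reflection principle the exact value is 2 * P(N(0,1) > 1) ~ 0.317;
# the grid maximum slightly undershoots the continuous supremum.
```

For a fixed relative accuracy the number of replications of this plain estimator must grow like 1/P(sup > b), which for Gaussian tails is exponential in b²; that is the exponential cost the abstract refers to.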
Assortment planning is an important problem that arises in many industries such as retailing and airlines. One of the key challenges in an assortment planning problem is to identify the “right” model for the substitution behavior of customers from the data. Error in model selection can lead to highly suboptimal decisions. In this paper, we consider a Markov chain based choice model and show that it provides a simultaneous approximation for all random utility based discrete choice models, including the multinomial logit (MNL), the probit, the nested logit, and mixtures of multinomial logit models. In the Markov chain model, substitution from one product to another is modeled as a state transition in the Markov chain. We show that the choice probabilities computed by the Markov chain based model are a good approximation to the true choice probabilities for any random utility based choice model under mild conditions. Moreover, they are exact if the underlying model is a generalized attraction model (GAM), of which the MNL model is a special case. We also show that the assortment optimization problem for our choice model can be solved efficiently in polynomial time. In addition to the theoretical bounds, we also conduct numerical experiments and observe that the average maximum relative error of our model's choice probabilities with respect to the true probabilities is less than 3%, where the average is taken over different offer sets. Therefore, our model provides a tractable approach to choice modeling and assortment optimization that is robust to model selection errors. Moreover, the state-transition primitive for substitution provides interesting insights for modeling substitution behavior in many real-world applications.
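The choice probabilities of a Markov chain choice model of this kind can be computed by solving a small linear system: products in the offer set (and the no-purchase option) are absorbing states, and a customer arriving at an unavailable product keeps transitioning until absorbed. The sketch below uses made-up arrival and transition parameters for three products plus a no-purchase state; it illustrates the state-transition primitive, not the paper's approximation guarantees.

```python
import numpy as np

# Hypothetical instance: products 0, 1, 2 plus a no-purchase state 3.
lam = np.array([0.4, 0.3, 0.2, 0.1])    # first-choice (arrival) probabilities
# rho[i, j]: probability of substituting product j when product i is
# unavailable (each product row sums to 1, including no-purchase).
rho = np.array([
    [0.0, 0.5, 0.3, 0.2],
    [0.4, 0.0, 0.4, 0.2],
    [0.3, 0.5, 0.0, 0.2],
    [0.0, 0.0, 0.0, 1.0],               # no-purchase is always absorbing
])

def choice_probs(offer_set):
    """Absorption probabilities of the chain started from lam, with states
    in offer_set (and state 3) absorbing and the remaining states transient."""
    absorbing = sorted(set(offer_set) | {3})
    transient = [i for i in range(4) if i not in absorbing]
    P_tt = rho[np.ix_(transient, transient)]
    P_ta = rho[np.ix_(transient, absorbing)]
    # mass reaching each absorbing state: lam_A + lam_T (I - P_TT)^{-1} P_TA
    reach = lam[absorbing] + lam[transient] @ np.linalg.solve(
        np.eye(len(transient)) - P_tt, P_ta)
    return dict(zip(absorbing, reach))

probs = choice_probs({0, 2})
```

For the offer set {0, 2} above, product 1's arrival mass (0.3) is redistributed according to its substitution row, giving choice probabilities 0.52 and 0.32 for products 0 and 2 and 0.16 for no-purchase, which sum to one.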
This paper develops the first class of algorithms that enable unbiased estimation of steady-state expectations for multidimensional reflected Brownian motion. In order to explain our ideas, we first consider the case of compound Poisson (possibly Markov-modulated) input. In this case, we analyze the complexity of our procedure as the dimension of the network increases and show that, under certain assumptions, the algorithm has polynomial expected termination time. Our methodology includes procedures that are of interest beyond steady-state simulation and reflected processes. For instance, we use wavelets to construct a piecewise linear function that is guaranteed to be within a deterministic ε distance, in the uniform norm, of Brownian motion on any compact time interval.

[Fragment from the body of the paper; the display equations referenced below do not survive in this extract.] Here J(t) is a column vector with ith coordinate J_i(t), so that equation (1) can be rewritten in terms of a column vector L(t). As mentioned earlier, Y = (Y(t) : t ≥ 0) is a Markov process. We assume that Qⁿ → 0 as n → ∞. This assumption is synonymous with the network being open: for each i such that λ_i > 0, there exists a path along which work eventually leaves the network. In addition, under this assumption the matrix R⁻¹ exists and has nonnegative coordinates. To ensure stability, we assume that R⁻¹EX(1) < 0 (inequalities involving vectors are understood coordinatewise throughout the paper). It then follows from Theorem 2.4 of Kella and Ramasubramanian (2012) that Y(t) converges in distribution to a steady-state limit Y(∞). The first contribution of this paper is that we develop an exact sampling algorithm (i.e., simulation without bias) for Y(∞). This algorithm is developed in Section 2 of this paper under the assumption that W(k) has a finite moment-generating function.
In addition, we analyze the order of computational complexity (measured in terms of expected random numbers generated) of our algorithm as d increases, and we show that it is polynomially bounded. Moreover, we extend our exact sampling algorithm to the case in which there is an independent Markov chain driving the arrival rates, the service rates, and the distribution of job sizes at the time of arrivals. This extension is discussed in Section 3.

The workload process (Y(t) : t ≥ 0) is a particular case of a reflected (or constrained) stochastic network. Although the models introduced in the previous paragraphs are interesting in their own right, our main interest is the steady-state simulation techniques for reflected Brownian motion. These techniques are obtained by abstracting the construction formulated in (2). This abstraction is presented in terms of a Skorokhod problem, which we describe as follows. Let X = (X(t) : t ≥ 0) with X(0) ≥ 0, and let R be an M-matrix such that the inverse R⁻¹ exists and has nonnegative coordinates. Solving the Skorokhod problem requires finding a pair of processes (Y, L) satisfying equation (2), subject to: (i) Y(t) ≥ 0 for each t; (ii) L_i(·) is nondecreasing with L_i(0) = 0 for each i ∈ {1, . . . , d}; (iii) ∫₀ᵗ Y_i(s) dL_i(s) = 0 for each t.
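In one dimension (d = 1, R = 1) the Skorokhod problem has the explicit solution L(t) = max(0, −min_{s≤t} X(s)), and its discrete-time analogue is the reflection map below. The sketch verifies conditions (i)-(iii) numerically on a simulated random-walk path; it is an illustration of the reflection map only, not the paper's exact-sampling algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
dx = rng.normal(-0.2, 1.0, size=1000)        # negative-drift increments of X
x = np.concatenate(([0.0], np.cumsum(dx)))   # free process X, with X(0) = 0

# Reflection map: Y = X + L with L(t) = max(0, -min_{s<=t} X(s))
l = np.maximum(0.0, -np.minimum.accumulate(x))
y = x + l

assert np.all(y >= 0)                  # (i)   Y stays nonnegative
assert l[0] == 0 and np.all(np.diff(l) >= 0)   # (ii)  L nondecreasing, L(0) = 0
# (iii) complementarity: L increases only at times where Y = 0
assert np.all(y[1:] * np.diff(l) <= 1e-12)
```

The pushing process L increases exactly when X reaches a new running minimum, at which instants Y sits at the boundary 0; this is the one-dimensional picture that the multidimensional M-matrix formulation generalizes.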