Under a Bayesian framework, we formulate the fully sequential sampling and selection decision in statistical ranking and selection as a stochastic control problem and derive the associated Bellman equation. Using value function approximation, we derive an approximately optimal allocation policy. We show that this policy is not only computationally efficient but also achieves both one-step-ahead and asymptotic optimality for independent normal sampling distributions. Moreover, the proposed allocation policy generalizes readily within the approximate dynamic programming paradigm.

Index Terms: simulation, ranking and selection, stochastic control, Bayesian, dynamic sampling and selection
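A minimal sketch of the one-step-ahead idea behind such an allocation policy, assuming independent normal posteriors with known sampling variance `tau2` (the function names and the Monte Carlo approximation are illustrative, not the paper's exact policy): each candidate design is scored by the expected value of the final selection after one hypothetical additional sample, and the design with the highest score is sampled next.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_step_value(mu, sigma2, a, tau2, n_mc=2000):
    # Expected posterior maximum mean after one hypothetical observation of
    # design a, averaged over the predictive distribution of that observation.
    pred_sd = np.sqrt(sigma2[a] + tau2)
    y = mu[a] + pred_sd * rng.standard_normal(n_mc)
    new_var = 1.0 / (1.0 / sigma2[a] + 1.0 / tau2)   # conjugate normal update
    new_mu = new_var * (mu[a] / sigma2[a] + y / tau2)
    best_other = np.max(np.delete(np.asarray(mu), a))
    return np.maximum(new_mu, best_other).mean()

def allocate(mu, sigma2, tau2):
    # One-step-ahead allocation: sample the design whose hypothetical next
    # observation most improves the expected value of the final choice.
    scores = [one_step_value(mu, sigma2, a, tau2) for a in range(len(mu))]
    return int(np.argmax(scores))
```

With two designs whose posterior means are close but whose posterior variances differ sharply, the rule allocates the next sample to the uncertain design, since sampling the nearly-resolved one cannot change the final decision.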
In this paper, we propose a new unbiased stochastic derivative estimator in a framework that can handle discontinuous sample performances with structural parameters. This work extends the three most popular unbiased stochastic derivative estimators, (1) infinitesimal perturbation analysis (IPA), (2) the likelihood ratio (LR) method, and (3) the weak derivative method, to a setting where they did not previously apply. Examples involving probability constraints, control charts, and financial derivatives demonstrate the broad applicability of the proposed framework. The new estimator preserves the single-run efficiency of the classic IPA-LR estimators in applications, which is substantiated by numerical experiments.
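A toy illustration of why discontinuous sample performances are problematic for pathwise (IPA) estimation, and how a score-function (LR) estimator sidesteps the issue (this is a standard textbook example, not the paper's proposed estimator): for the performance 1{X <= c} with X ~ N(theta, 1), the pathwise derivative is zero almost surely, so naive IPA is biased, while the LR estimator multiplies the performance by the score (x - theta) and recovers the true derivative -phi(c - theta).

```python
import math
import numpy as np

rng = np.random.default_rng(1)
theta, c, n = 0.0, 1.0, 200_000

x = theta + rng.standard_normal(n)        # X ~ N(theta, 1)
perf = (x <= c).astype(float)             # discontinuous performance 1{X <= c}

ipa = 0.0                                 # pathwise derivative of 1{X <= c} is 0 a.s. -> biased
score = x - theta                         # d/dtheta of log N(theta,1) density at x
lr = np.mean(perf * score)                # LR estimate of dE[1{X <= c}]/dtheta

# True derivative: d/dtheta P(theta + Z <= c) = -phi(c - theta)
true_grad = -math.exp(-(c - theta) ** 2 / 2) / math.sqrt(2 * math.pi)
```

Running this, `lr` lands near `true_grad` (about -0.242) while the pathwise term is identically zero, which is the gap the paper's framework is designed to close.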
We formulate the statistical selection problem in a general dynamic framework comprising fully sequential sampling allocation and optimal design selection. Because the traditional probability of correct selection measure is not sufficient to capture both aspects in this more general framework, we introduce the integrated probability of correct selection to better characterize the objective. As a result, the usual selection policy of choosing the design with the largest sample mean as the estimate of the best is no longer necessarily optimal. Rather, the optimal selection policy is to choose the design that maximizes the posterior integrated probability of correct selection, which is a function of the posterior mean and the correlation structure induced by the posterior variance. Because determining the optimal selection policy is generally intractable, we also devise an approximation scheme to efficiently approximate it. For the allocation policy, we study an asymptotic policy called general Bayesian budget allocation, which comprises a sampling statistic and a sequential rule. The optimal computing budget allocation algorithm can be interpreted as a special case of the asymptotic sampling statistic. Numerical examples are provided to illustrate the potential performance improvements, especially in small-sample behavior.
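The contrast between the two selection rules can be sketched as follows, assuming independent normal posteriors for simplicity (the paper's integrated measure also accounts for the correlation structure induced by the posterior variance, which this toy version omits; function names are illustrative): the naive rule picks the largest posterior mean and ignores posterior uncertainty, while the posterior-based rule picks the design most likely to be the true best under the posterior, here estimated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(2)

def select_by_mean(mu, sigma2):
    # Naive rule: report the design with the largest posterior mean,
    # ignoring the posterior variance entirely.
    return int(np.argmax(mu))

def select_by_posterior_pcs(mu, sigma2, n_mc=5000):
    # Posterior-based rule: report the design with the highest posterior
    # probability of being the true best, estimated by sampling parameter
    # vectors from the (independent normal) posterior.
    draws = mu + np.sqrt(sigma2) * rng.standard_normal((n_mc, len(mu)))
    best = np.argmax(draws, axis=1)
    counts = np.bincount(best, minlength=len(mu))
    return int(np.argmax(counts))
```

With well-separated means the two rules agree; they can diverge when means are close and posterior variances differ, which is the regime where the paper's integrated criterion matters.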