Consider the problem of sequential sampling from m statistical populations to maximize the expected sum of outcomes in the long run. Under suitable assumptions on the unknown parameters θ ∈ Θ, it is shown that there exists a class C_R of adaptive policies with the following properties: (i) the expected n-horizon reward under any policy in C_R grows at the best achievable asymptotic rate, and (ii) policies in C_R are asymptotically optimal within the larger class C_UF of uniformly fast convergent policies. Policies in C_R are specified via easily computable indices, defined as unique solutions to dual problems that arise naturally from the functional form of M. In addition, the assumptions are verified for populations specified by nonparametric discrete univariate distributions with finite support. In the case of normal populations with unknown means and variances, we leave as an open problem the verification of one assumption.
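As a hedged sketch of the kind of easily computable index such policies use (the notation below is ours; the paper's exact dual formulation is not reproduced here), one standard form is an upper confidence value obtained by inflating the empirical estimate of population j within a Kullback–Leibler neighborhood:

$$ u_j(n) \;=\; \sup\Big\{ \mu(\theta) \;:\; \theta \in \Theta_j,\; I\big(\hat\theta_j(n), \theta\big) \le \frac{\log n}{T_j(n)} \Big\}, $$

where \(\hat\theta_j(n)\) is the empirical estimate after \(T_j(n)\) samples from population j, \(I\) is a Kullback–Leibler divergence, and \(\mu(\theta)\) is the population mean; at each step one samples a population with the largest current index.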
The multi-armed bandit problem arises in sequentially allocating effort to one of N projects and sequentially assigning patients to one of N treatments in clinical trials. Gittins and Jones (1974) have shown that one optimal policy for the N-project problem, an N-dimensional discounted Markov decision chain, is determined by the following largest-index rule. There is an index for each state of each given project that depends only on the data of that project. In each period one allocates effort to a project with largest current index. The purpose of this paper is to give a short proof of this result and a new characterization of the index of a project in state i, viz., as the maximum expected present value in state i for the restart-in-i problem in which, in each state and period, one either continues allocating effort to the project or immediately restarts the project in state i. Moreover, it is shown that an approximate largest-index rule yields an approximately optimal policy, and that the indices can be computed by exploiting sparse transition matrices in larger state spaces than have been suggested heretofore. By using a suitable implementation of successive approximations, a policy whose expected present value is within 100ε% of the maximum possible range of values of the indices can be found on-line with at most (N + T − 1)TM operations, where M is the number of operations required to calculate one approximation, T is the least integer majorizing the ratio ln ε / ln α, and 0 < α < 1 is the discount factor.
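As a concrete illustration of the restart-in-i characterization, the following Python sketch computes, by successive approximations (value iteration), the maximum expected present value in state i of the restart-in-i problem for a single project. The function name, the toy transition matrix, the rewards, and the tolerance are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def restart_in_i_value(P, r, alpha, i, tol=1e-10, max_iter=100_000):
    """Value iteration for the restart-in-i problem described above.

    P     : (n, n) transition matrix of the single project (a Markov chain)
    r     : (n,) one-period rewards
    alpha : discount factor, 0 < alpha < 1
    i     : the state whose index we want

    In every state j one either continues (reward r[j], transition per P[j])
    or immediately restarts the project in state i (reward r[i], transition per P[i]).
    The returned V[i] is the maximum expected present value in state i,
    i.e. the index of state i in the characterization above.
    """
    n = len(r)
    V = np.zeros(n)
    for _ in range(max_iter):
        continue_val = r + alpha * P @ V        # keep working on the project
        restart_val = r[i] + alpha * P[i] @ V   # restart the project in state i
        V_new = np.maximum(continue_val, restart_val)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new[i]
        V = V_new
    return V[i]

# Toy 3-state project (illustrative numbers only).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.2, 0.8]])
r = np.array([1.0, 0.5, 0.1])
alpha = 0.9
indices = [restart_in_i_value(P, r, alpha, i) for i in range(3)]
print(indices)  # largest-index rule: allocate effort to a project whose current state has the largest index
```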
In this paper we consider the problem of adaptive control for Markov decision processes. We give the explicit form of a class of adaptive policies that possess optimal rate-of-increase properties for the total expected finite-horizon reward, under the assumptions of finite state-action spaces and irreducibility of the transition law. A main feature of the proposed policies is that the choice of actions, at each state and time period, is based on indices that are inflations of the right-hand side of the estimated average-reward optimality equations.
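As a hedged sketch (our notation, not the paper's), an index of this type for state-action pair (x, a) at time t can be written as the estimated right-hand side of the average-reward optimality equation plus a nonnegative inflation term u_t(x, a):

$$ L_t(x,a) \;=\; \hat r_t(x,a) \;+\; \sum_{y} \hat p_t(y \mid x, a)\, \hat h_t(y) \;+\; u_t(x,a), $$

where \(\hat r_t\) and \(\hat p_t\) are empirical reward and transition estimates, \(\hat h_t\) is the relative value (bias) function solving the estimated average-reward optimality equations, and in each state the policy chooses an action with the largest index \(L_t(x,a)\).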
We consider the problem of sampling sequentially from two or more populations in such a way as to maximize the expected sum of outcomes in the long run.
A class of Markov chains we call successively lumpable is specified for which it is shown that the stationary probabilities can be obtained by successively computing the stationary probabilities of a propitiously constructed sequence of Markov chains. Each of the latter chains has a (typically much) smaller state space, and this yields significant computational improvements. We discuss how the results for discrete-time Markov chains extend to semi-Markov processes and continuous-time Markov processes. Finally, we study applications of successively lumpable Markov chains to classical reliability and queueing models.
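The per-chain step in such a scheme is an ordinary stationary-distribution computation. The following minimal Python sketch shows only that building block, solving πP = π with Σπ = 1 for a small discrete-time chain; it is not the successive lumping construction itself, and the function name and example matrix are ours.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary probabilities pi of a finite ergodic Markov chain with
    transition matrix P, obtained by solving pi P = pi, sum(pi) = 1.
    This is the computation repeated on each of the smaller chains in the
    successive construction; the construction itself is not reproduced here."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # stationarity plus normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

P = np.array([[0.9, 0.1, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.2, 0.8]])
print(stationary_distribution(P))
```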
This paper defines and studies the down entrance state (DES) and the restart entrance state (RES) classes of quasi-skip-free (QSF) processes, specified in terms of the nonzero structure of the elements of their transition rate matrix Q. A QSF process is a Markov chain whose states can be specified by tuples of the form (m, i), where m ∈ Z represents the "current" level of the state and i ∈ Z_+ its current phase, and whose transition rate matrix Q does not permit one-step transitions to states that are two or more levels away from the current state in one direction of the level variable m. A QSF process is a DES process if and only if one-step "down" transitions from a level m can only reach a single state in level m − 1, for all m. A QSF process is a RES process if and only if one-step "up" transitions from a level m can only reach a single set of states in the highest level M (the largest of all levels m). We derive explicit solutions and simple truncation bounds for the steady-state probabilities of both DES and RES processes when, in addition, Q ensures ergodicity. DES and RES processes have applications in many areas of applied probability, including computer science, queueing theory, inventory theory, reliability, and the theory of branching processes. To motivate their applicability, we present explicit solutions for the well-known open problem of the M/Er/n queue with batch arrivals, an inventory model, and a reliability model.
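To make the DES condition concrete, here is a minimal Python sketch (interface and names are ours, not from the paper) that checks, for a finite truncation of a QSF process with a given level labeling, whether all one-step "down" transitions out of each level enter a single state of the level below.

```python
import numpy as np

def is_des(Q, levels, tol=1e-12):
    """Check the DES property stated above: for every level m, all one-step
    'down' transitions out of level m enter a single state of level m - 1.

    Q      : (n, n) transition rate matrix (off-diagonal entries >= 0)
    levels : length-n sequence giving the integer level of each state
    """
    n = Q.shape[0]
    for m in sorted(set(levels)):
        targets = set()
        for s in range(n):
            if levels[s] != m:
                continue
            for t in range(n):
                if s != t and Q[s, t] > tol and levels[t] == m - 1:
                    targets.add(t)
        if len(targets) > 1:
            return False  # down transitions from level m reach more than one state
    return True
```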