Proceedings of the 22nd International Conference on Machine Learning (ICML '05), 2005
DOI: 10.1145/1102351.1102422

The cross entropy method for classification

Abstract: We consider support vector machines for binary classification. As opposed to most approaches we use the number of support vectors (the "L0 norm") as a regularizing term instead of the L1 or L2 norms. In order to solve the optimization problem we use the cross entropy method to search over the possible sets of support vectors. The algorithm consists of solving a sequence of efficient linear programs. We report experiments where our method produces generalization errors that are similar to support vector mach…
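As a rough illustration of the approach the abstract outlines, the sketch below searches over candidate support-vector subsets with the cross-entropy method, penalising subset size (the L0 term). It is a minimal sketch under stated assumptions, not the paper's algorithm: the inner linear program is replaced by a simple ridge-regularised least-squares fit on the selected points, and all names and parameters (fit_on_subset, lambda_reg, elite_frac, the smoothing constant) are illustrative.

# Hedged sketch: cross-entropy search over support-vector subsets with an
# L0 (subset-size) penalty. The inner solver is a least-squares stand-in
# for the paper's linear program; names and constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def fit_on_subset(X, y, mask, ridge=1e-3):
    # Fit a linear-kernel expansion using only the points selected by `mask`
    # as support vectors (stand-in for the paper's LP step).
    S = np.flatnonzero(mask)
    if S.size == 0:
        return None
    K_SS = X[S] @ X[S].T                              # |S| x |S| Gram matrix
    alpha = np.linalg.solve(K_SS + ridge * np.eye(S.size), y[S])
    return S, alpha

def cost(X, y, mask, lambda_reg=0.05):
    # Training error plus an L0 penalty on the number of support vectors.
    model = fit_on_subset(X, y, mask)
    if model is None:
        return np.inf
    S, alpha = model
    pred = np.sign((X @ X[S].T) @ alpha)
    return np.mean(pred != y) + lambda_reg * S.size / len(y)

def ce_select_support_vectors(X, y, n_iter=30, pop=200, elite_frac=0.1, smooth=0.7):
    n = len(y)
    p = np.full(n, 0.5)                               # Bernoulli inclusion probabilities
    n_elite = max(1, int(elite_frac * pop))
    best_mask, best_cost = None, np.inf
    for _ in range(n_iter):
        masks = rng.random((pop, n)) < p              # sample candidate subsets
        costs = np.array([cost(X, y, m) for m in masks])
        elite = masks[np.argsort(costs)[:n_elite]]    # keep the lowest-cost subsets
        p = smooth * elite.mean(axis=0) + (1 - smooth) * p   # smoothed CE update
        if costs.min() < best_cost:
            best_cost, best_mask = costs.min(), masks[np.argmin(costs)]
    return best_mask, best_cost

# Toy usage: two Gaussian blobs labelled -1 / +1.
X = np.vstack([rng.normal(-1.0, 1.0, (40, 2)), rng.normal(1.0, 1.0, (40, 2))])
y = np.concatenate([-np.ones(40), np.ones(40)])
mask, c = ce_select_support_vectors(X, y)
print("support vectors kept:", int(mask.sum()), "cost:", round(float(c), 3))

Here the Bernoulli parameter vector plays the role of the CE sampling distribution over subsets; smoothing the update keeps the inclusion probabilities from collapsing to 0 or 1 too early.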

Cited by 307 publications (224 citation statements)
References 16 publications
“…The CE method has been successfully applied to a diverse range of estimation and optimization problems, including buffer allocation [1], queueing models of telecommunication systems [14,16], optimal control of HIV/AIDS spread [48,49], signal detection [30], combinatorial auctions [9], DNA sequence alignment [24,38], scheduling and vehicle routing [3,8,11,20,23,53], neural and reinforcement learning [31,32,34,52,54], project management [12], rare-event simulation with light- and heavy-tailed distributions [2,10,21,28], and clustering analysis [4,5,29]. Applications to classical combinatorial optimization problems including the max-cut, traveling salesman, and Hamiltonian cycle…”
Section: Introduction
confidence: 99%
“…Finally, Kalyanakrishnan and Stone [28] compare Sarsa and the cross-entropy method [41,71], another approach to policy search, in a simple navigation task. They study how the relative performance of these methods changes with respect to several domain characteristics, including sensor and effector noise.…”
Section: Related Work
confidence: 99%
“…Other evolutionary methods such as CoSyNE [21], EANT [29], and HyperNEAT [17], an extension to NEAT based on indirect encodings, also deserve closer empirical study. Beyond evolutionary methods, other policy search approaches such as the cross-entropy method [41,71] or policy gradient approaches [3,7,34,70] could be usefully compared with TD methods. Similarly, recent developments in making value function approximation more robust, e.g., least-squares policy iteration [36], fitted Q-iteration [54] and evolutionary function approximation [79], need to be thoroughly compared to the traditional function approximation approach used in this paper.…”
Section: Related Work
confidence: 99%
“…This search distribution is used to generate a population of individuals, which are evaluated by their corresponding fitness values. Subsequently, a new search distribution is computed by either gradient-based updates [19], expectation-maximisation-based updates [7,12], evolutionary strategies [10], the cross-entropy method [14], or information-theoretic policy updates [1], such that individuals with higher fitness have a higher selection probability. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is one of the most popular stochastic search algorithms [10].…”
Section: Introduction
confidence: 99%
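To make the loop this excerpt describes concrete, here is a minimal cross-entropy-style sketch: sample a population from a Gaussian search distribution, evaluate fitness, and refit the distribution to the elite individuals. The toy fitness function, population size, and elite fraction are illustrative choices, not taken from any of the cited works.

# Minimal sketch of the sample / evaluate / refit loop (cross-entropy update).
import numpy as np

def cross_entropy_search(fitness, dim, n_iter=50, pop=100, elite_frac=0.2, seed=0):
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)            # Gaussian search distribution
    n_elite = max(2, int(elite_frac * pop))
    for _ in range(n_iter):
        samples = rng.normal(mu, sigma, size=(pop, dim))    # generate a population
        scores = np.apply_along_axis(fitness, 1, samples)   # evaluate fitness
        elite = samples[np.argsort(scores)[-n_elite:]]      # select the fittest
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6  # refit distribution
    return mu

# Toy usage: maximise a fitness whose optimum sits at (1, -2, 3).
target = np.array([1.0, -2.0, 3.0])
best = cross_entropy_search(lambda x: -np.sum((x - target) ** 2), dim=3)
print(np.round(best, 2))    # should end up close to [ 1. -2.  3.]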
“…However, a unique mathematical framework that explains all these update rules is still missing. In contrast, expectation-maximisation-based algorithms [7,12,14] (Section 2.2) optimize a clearly defined objective, i.e., the maximization of a lower bound. Maximising this lower bound in each iteration is equivalent to weighted maximum likelihood estimation (MLE) of the distribution.…”
Section: Introduction
confidence: 99%
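For a diagonal Gaussian search distribution, the weighted MLE step this excerpt refers to can be sketched as below. The exponential fitness weighting (with temperature beta) is an assumed, common choice; individual algorithms differ in how the weights are constructed, and with hard 0/1 elite weights the update reduces to the cross-entropy update above.

# Sketch of the weighted maximum-likelihood update for a diagonal Gaussian.
# The softmax-style weighting is an assumption, not any specific cited scheme.
import numpy as np

def weighted_mle_update(samples, fitness_values, beta=5.0):
    # Return the mean/std that maximise the fitness-weighted log-likelihood.
    f = np.asarray(fitness_values, dtype=float)
    w = np.exp(beta * (f - f.max()))           # numerically stable exponential weights
    w /= w.sum()
    mu = w @ samples                           # weighted mean
    var = w @ (samples - mu) ** 2              # weighted (diagonal) variance
    return mu, np.sqrt(var) + 1e-6

# Toy usage: samples drawn around 0, fitness peaked at 2 pulls the mean toward 2.
rng = np.random.default_rng(1)
xs = rng.normal(0.0, 2.0, size=(500, 1))
fit = -(xs[:, 0] - 2.0) ** 2
mu, sigma = weighted_mle_update(xs, fit)
print(np.round(mu, 2), np.round(sigma, 2))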