2008
DOI: 10.1007/978-3-540-87700-4_55

Preventing Premature Convergence in a Simple EDA Via Global Step Size Setting

Abstract: When a simple real-valued estimation of distribution algorithm (EDA) with a Gaussian model and maximum-likelihood estimation of parameters is used, it converges prematurely even on the slope of the fitness function. The simplest way of preventing premature convergence, multiplying the variance estimate by a constant factor k each generation, is studied. Recent works have shown that when increasing the dimensionality of the search space, such an algorithm very quickly becomes unable to traverse the slope…
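The abstract describes a simple Gaussian EDA whose maximum-likelihood variance estimate is multiplied by a constant factor k every generation. A minimal sketch of that idea (illustrative only: the fitness function, population size, truncation ratio and value of k below are placeholders, not the paper's settings) could look like:

```python
import numpy as np

def simple_gaussian_eda(fitness, dim, k=1.1, pop_size=100, truncation=0.3,
                        generations=200, seed=0):
    """Simple real-valued EDA with a diagonal Gaussian model.

    Each generation the mean and variance are re-estimated by maximum
    likelihood from the selected individuals; the variance estimate is
    then multiplied by a constant factor k to counteract the variance
    loss caused by selection (the global step-size setting the paper
    studies). All numeric settings here are illustrative.
    """
    rng = np.random.default_rng(seed)
    mean = rng.uniform(-5.0, 5.0, size=dim)   # initial model
    var = np.full(dim, 10.0)

    n_sel = max(2, int(truncation * pop_size))
    for _ in range(generations):
        pop = rng.normal(mean, np.sqrt(var), size=(pop_size, dim))
        fit = np.apply_along_axis(fitness, 1, pop)
        selected = pop[np.argsort(fit)[:n_sel]]   # minimization

        # Maximum-likelihood estimates from the selected points ...
        mean = selected.mean(axis=0)
        var = selected.var(axis=0)
        # ... followed by the constant variance amplification.
        var *= k
    return mean

# Example: sphere function (illustrative only)
best = simple_gaussian_eda(lambda x: np.sum(x**2), dim=10)
```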

Cited by 23 publications (13 citation statements)
References 15 publications (23 reference statements)
“…UMDA [14], Compact Genetic Algorithm [10], Population-Based Incremental Learning [1], Relative Entropy [13], CrossEntropy [5] and Estimation of Multivariate Normal Algorithms (EMNA) [11] (our main inspiration), which combine (i) the current distribution (possibly), (ii) statistical properties of selected points, into a new distribution. We show in this paper that forgetting the old estimate and only using the new points is a good idea in the case of λ large; in particular, premature convergence as pointed out in [20,7,12,15] does not occur if λ >> 1 points are distributed on the search space with non-degenerated variance, and troubles around variance estimates for small sample size as in [6] are by definition not relevant for us. Its advantages are as follows for λ large: (i) it's very simple and parameter free; the reduced number of parameters is an advantage of mutative self adaptation in front of cumulative step-size adaptation, but we show here that yet fewer parameters (0!)…”
Section: Statistical (mentioning)
confidence: 90%
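The passage above argues for re-estimating the search distribution from the selected points alone (EMNA-style) when λ is large, discarding the previous estimate. A hypothetical sketch of that "forget the old estimate" update, assuming a full multivariate-normal model:

```python
import numpy as np

def emna_style_update(selected):
    """Re-estimate the multivariate-normal search distribution from the
    selected points only, discarding the previous estimate.

    With lambda >> 1 sampled points, the empirical mean and covariance of
    the selected set are well conditioned, which is the regime the quoted
    passage argues avoids premature convergence.
    """
    mean = selected.mean(axis=0)
    cov = np.cov(selected, rowvar=False, bias=True)  # ML covariance estimate
    return mean, cov

def sample_offspring(rng, mean, cov, lam):
    """Draw lambda offspring from the freshly re-estimated distribution."""
    return rng.multivariate_normal(mean, cov, size=lam)
```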
“…This means that it is more prone to producing values that fall far from its mean, thus, giving Cauchy more chance of sampling further down its tail than the Gaussian. This gives Cauchy a higher chance of escaping premature convergence than the Gaussian [6], [10].…”
Section: Differences Between Multivariate Gaussian and Multivariate Cauchy (mentioning)
confidence: 99%
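A small numerical illustration of the heavier Cauchy tails the statement refers to; the thresholds and sample size are arbitrary, and the comparison uses standard univariate distributions rather than the cited multivariate setting:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
gauss = rng.standard_normal(n)
cauchy = rng.standard_cauchy(n)

for thresh in (3.0, 5.0, 10.0):
    p_gauss = np.mean(np.abs(gauss) > thresh)
    p_cauchy = np.mean(np.abs(cauchy) > thresh)
    # The Cauchy tail probability decays like 1/thresh, the Gaussian tail
    # like exp(-thresh**2 / 2), so the gap widens rapidly with thresh.
    print(f"|x| > {thresh}: Gaussian {p_gauss:.2e}, Cauchy {p_cauchy:.2e}")
```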
“…The premature convergence of classical Gaussian EDA attracted many efforts geared towards solving this problem [6], [10]. This paper presents the usage of Multivariate Cauchy distribution, an extension of [10] with a full matrix valued parameter that encodes dependencies between the search variable as an alternative search operator in EDA.…”
Section: Introduction (mentioning)
confidence: 99%
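The cited approach replaces the Gaussian search operator with a multivariate Cauchy whose full matrix-valued parameter encodes dependencies between the search variables. One standard way to sample such a distribution (as a multivariate t with one degree of freedom; the function name and settings below are hypothetical) is:

```python
import numpy as np

def sample_multivariate_cauchy(rng, loc, scale_matrix, size):
    """Sample from a multivariate Cauchy (multivariate t with df = 1).

    A zero-mean Gaussian draw with the given full scale matrix is divided
    by the square root of an independent chi-square(1) variate, producing
    heavy-tailed Cauchy samples while preserving the dependency structure
    encoded in the scale matrix.
    """
    dim = loc.shape[0]
    gauss = rng.multivariate_normal(np.zeros(dim), scale_matrix, size=size)
    chi2 = rng.chisquare(df=1, size=(size, 1))
    return loc + gauss / np.sqrt(chi2)

# Illustrative use: two correlated search variables
rng = np.random.default_rng(0)
loc = np.zeros(2)
scale = np.array([[1.0, 0.8], [0.8, 1.0]])
samples = sample_multivariate_cauchy(rng, loc, scale, size=1000)
```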