We present an approach to genetic programming difficulty based on a statistical study of program fitness landscapes. The fitness distance correlation is used as an indicator of problem hardness and we empirically show that such a statistic is adequate in nearly all cases studied here. However, fitness distance correlation has some known problems and these are investigated by constructing an artificial landscape for which the correlation gives contradictory indications. Although our results confirm the usefulness of fitness distance correlation, we point out its shortcomings and give some hints for improvement in assessing problem hardness in genetic programming.
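The fitness distance correlation mentioned above can be sketched in a few lines. The abstract does not give a formula, so this sketch assumes the usual definition of the statistic: the Pearson correlation between the fitness of sampled individuals and their distance to the nearest global optimum. The toy landscape (bit strings scored by their number of ones) and every name in the block are illustrative assumptions, not the tree-based setup or distance measure studied in the paper.

```python
import math
import random


def fitness_distance_correlation(fitnesses, distances):
    """Pearson correlation between fitness and distance to the nearest global optimum."""
    n = len(fitnesses)
    if n != len(distances) or n < 2:
        raise ValueError("need two equally long samples with at least two points")
    mean_f = sum(fitnesses) / n
    mean_d = sum(distances) / n
    cov = sum((f - mean_f) * (d - mean_d) for f, d in zip(fitnesses, distances)) / n
    std_f = math.sqrt(sum((f - mean_f) ** 2 for f in fitnesses) / n)
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in distances) / n)
    if std_f == 0 or std_d == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (std_f * std_d)


# Toy landscape (assumption for illustration only): maximise the number of ones
# in a bit string, so the single global optimum is the all-ones string.
rng = random.Random(0)
LENGTH = 16
sample = [[rng.randint(0, 1) for _ in range(LENGTH)] for _ in range(200)]
fitnesses = [sum(bits) for bits in sample]
# Hamming distance to the all-ones optimum is the number of zeros.
distances = [LENGTH - sum(bits) for bits in sample]
fdc = fitness_distance_correlation(fitnesses, distances)
print(f"FDC on the toy landscape: {fdc:.3f}")
```

On this toy problem, fitness rises exactly as distance to the optimum falls, so the value comes out very close to -1. Note that the distances can only be computed when the global optimum is known in advance.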
This paper presents an investigation of genetic programming fitness landscapes. We propose a new indicator of problem hardness for tree-based genetic programming, called the negative slope coefficient, based on the concept of the fitness cloud. The negative slope coefficient is a predictive measure, i.e., it can be calculated without prior knowledge of the global optima. The fitness cloud is generated from a sample of individuals obtained with the Metropolis-Hastings method. The reliability of the negative slope coefficient is tested on a set of well-known and representative genetic programming benchmarks, comprising the binomial-3 problem, the even parity problem and the artificial ant on the Santa Fe trail.
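A minimal sketch of how a negative slope coefficient could be computed from a fitness cloud follows. The abstract does not spell out the procedure, so the block assumes the commonly described one: plot each sampled individual's fitness against the fitness of one of its neighbours, split the fitness axis into bins, join the centroids of adjacent bins with line segments, and sum the negative slopes. Equal-width bins, the function name and the `n_bins` parameter are assumptions made for illustration, and the cloud is taken as given rather than sampled with Metropolis-Hastings.

```python
def negative_slope_coefficient(cloud, n_bins):
    """Sum of the negative slopes between centroids of adjacent fitness bins.

    cloud: list of (fitness, neighbour_fitness) pairs.
    n_bins: number of equal-width bins along the fitness axis (assumption).
    """
    xs = [x for x, _ in cloud]
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("all sampled fitness values are identical")
    width = (hi - lo) / n_bins

    # Assign every point to a bin; the maximum value goes into the last bin.
    bins = [[] for _ in range(n_bins)]
    for x, y in cloud:
        index = min(int((x - lo) / width), n_bins - 1)
        bins[index].append((x, y))

    # Centroid (mean fitness, mean neighbour fitness) of each non-empty bin.
    centroids = [
        (sum(x for x, _ in b) / len(b), sum(y for _, y in b) / len(b))
        for b in bins
        if b
    ]

    # Slopes of the segments joining consecutive centroids.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(centroids, centroids[1:])
        if x2 > x1
    ]
    return sum(min(0.0, s) for s in slopes)


# Two synthetic clouds (illustrative data, not experimental results).
improving_cloud = [(x, x + 1) for x in range(10)]       # neighbours keep improving
declining_cloud = [(x, 10 - 2 * x) for x in range(10)]  # neighbours get worse

nsc_improving = negative_slope_coefficient(improving_cloud, n_bins=5)
nsc_declining = negative_slope_coefficient(declining_cloud, n_bins=5)
```

Under this reading, a value of zero means no segment slopes downward, while a more negative value means neighbours of fitter individuals tend to be worse. Unlike a distance-to-optimum statistic, nothing in the computation needs the global optimum.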
The negative slope coefficient was recently introduced and empirically shown to be a suitable hardness indicator for some well-known genetic programming benchmarks, such as the even parity problem, the binomial-3 problem and the artificial ant on the Santa Fe trail. Nevertheless, the original definition of this measure has several limitations. This paper points out some of those limitations, presents a new and more relevant definition of the negative slope coefficient and empirically shows the suitability of this new definition as a hardness measure for some genetic programming benchmarks, including the multiplexer, the intertwined spirals problem and the royal trees.
In this paper we study cellular automata (CAs) that perform the computational Majority task. This task is a good example of the phenomenon of emergence in complex systems. We are interested in the reasons that make this particular fitness landscape a difficult one. The first goal is to study the landscape as such, and thus it is ideally independent of the actual heuristics used to search the space. However, a second goal is to understand the features a good search technique for this particular problem space should possess. We statistically quantify in various ways the degree of difficulty of searching this landscape. Due to neutrality, investigations based on sampling techniques on the whole landscape are difficult to conduct. We therefore explore the landscape from the top. Although it has been proved that no CA can perform the task perfectly, several efficient CAs for this task have been found. Exploiting similarities between these CAs and symmetries in the landscape, we define the Olympus landscape, which is regarded as the "heavenly home" of the best local optima known (blok). Then we measure several properties of this subspace. Although it is easier to find relevant CAs in this subspace than in the overall landscape, there are structural reasons that prevent a searcher from finding overfitted CAs in the Olympus. Finally, we study the dynamics and performance of genetic algorithms on the Olympus in order to confirm our analysis and to find efficient CAs for the Majority problem with low computational cost.
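To make the Majority task concrete, the sketch below shows one way the fitness of a single point in such a landscape might be estimated. The abstract does not give the experimental setup, so the block assumes a one-dimensional binary CA on a ring with neighbourhood radius 3, where a rule is a lookup table of 128 output bits and a run counts as correct if the lattice ends in all ones when ones were initially in the majority, and in all zeros otherwise. The lattice size, step budget, sample count and all function names are illustrative assumptions.

```python
import random

RADIUS = 3                           # assumed neighbourhood radius
TABLE_SIZE = 2 ** (2 * RADIUS + 1)   # 128 entries for radius 3


def step(cells, rule):
    """Apply the rule table once to every cell of a ring-shaped lattice."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        index = 0
        # Read the neighbourhood left to right as a binary number.
        for offset in range(-RADIUS, RADIUS + 1):
            index = (index << 1) | cells[(i + offset) % n]
        new_cells.append(rule[index])
    return new_cells


def classifies_correctly(rule, cells, max_steps):
    """True if the lattice ends up uniformly in the initial majority state."""
    target = 1 if 2 * sum(cells) > len(cells) else 0
    for _ in range(max_steps):
        cells = step(cells, rule)
    return all(c == target for c in cells)


def estimate_performance(rule, n_cells, max_steps, n_samples, rng):
    """Fraction of random initial configurations classified correctly.

    Use an odd n_cells so that the initial majority is never tied.
    """
    correct = 0
    for _ in range(n_samples):
        cells = [rng.randint(0, 1) for _ in range(n_cells)]
        correct += classifies_correctly(rule, cells, max_steps)
    return correct / n_samples


# Two simple candidate rules, chosen only to exercise the code.
all_zeros_rule = [0] * TABLE_SIZE
local_majority_rule = [1 if bin(i).count("1") > RADIUS else 0 for i in range(TABLE_SIZE)]

rng = random.Random(1)
performance = estimate_performance(
    local_majority_rule, n_cells=21, max_steps=20, n_samples=30, rng=rng
)
```

Each rule table is one point in the search space, and an estimate of this kind is a sampled fitness value rather than an exact one, since it depends on which initial configurations are drawn.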