Publication information: Genetic Programming and Evolvable Machines. Abstract: Grammar formalisms are one of the key representation structures in Computer Science, so it is not surprising that they have also become important as a method for formalizing constraints in Genetic Programming (GP). Practical grammar-based GP systems first appeared in the mid 1990s, and have subsequently become an important strand in GP research and applications. We trace their subsequent rise, surveying the various grammar-based formalisms that have been used in GP and discussing the contributions they have made to the progress of GP. We illustrate these contributions with a range of applications of grammar-based GP, showing how grammar formalisms contributed to the solutions of these problems. We briefly discuss the likely future development of grammar-based GP systems, and conclude with a brief summary of the field.
Software effort estimation (SEE) is a core activity in all software processes and development lifecycles. A range of increasingly complex methods has been considered in the past 30 years for the prediction of effort, often with mixed and contradictory results. The comparative assessment of effort prediction methods has therefore become a common approach when considering how best to predict effort over a range of project types. Unfortunately, these assessments use a variety of sampling methods and error measurements, making comparison with other work difficult. This article proposes an automatically transformed linear model (ATLM) as a suitable baseline model for comparison against SEE methods. ATLM is simple yet performs well over a range of different project types. In addition, ATLM may be used with mixed numeric and categorical data and requires no parameter tuning. It is also deterministic, meaning that results obtained are amenable to replication. These and other arguments for using ATLM as a baseline model are presented, and a reference implementation is described and made available. We suggest that ATLM should be used as a baseline of effort prediction quality for all future model comparisons in SEE. 2015. A baseline model for software effort estimation.
Many engineering problems may be described as a search for one near-optimal description amongst many possibilities, given certain constraints. Search techniques, such as genetic programming, seem appropriate to represent many problems. This paper describes a grammatically based learning technique, based upon the genetic programming paradigm, that allows declarative biasing and modifies the bias as the evolution proceeds. The use of bias allows complex problems to be represented and searched efficiently.
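As a rough illustration of a grammar carrying a declarative bias, the sketch below generates expressions from a toy context-free grammar whose productions have selection weights, and shows one way a weight could be adjusted between generations. The grammar, the weights, the depth limit, and the names `GRAMMAR`, `derive`, and `reinforce` are all assumptions made for this example; they are not taken from the paper, which may represent and update its bias quite differently.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of
# (production, weight) pairs. The weights stand in for a declarative bias.
GRAMMAR = {
    "expr": [(["expr", "op", "expr"], 1.0), (["var"], 2.0)],
    "op": [(["+"], 1.0), (["*"], 1.0)],
    "var": [(["x"], 1.0), (["y"], 1.0)],
}


def derive(symbol, rng, depth=0, max_depth=4):
    """Expand `symbol` into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminals are returned unchanged
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        # At the depth limit, prefer productions that do not recurse
        # directly on `symbol`, so the derivation terminates.
        non_recursive = [(p, w) for p, w in options if symbol not in p]
        options = non_recursive or options
    productions = [p for p, _ in options]
    weights = [w for _, w in options]
    chosen = rng.choices(productions, weights=weights, k=1)[0]
    tokens = []
    for child in chosen:
        tokens.extend(derive(child, rng, depth + 1, max_depth))
    return tokens


def reinforce(symbol, production, factor=1.5):
    """Scale the weight of one production, shifting the bias toward it."""
    GRAMMAR[symbol] = [
        (p, w * factor if p == production else w) for p, w in GRAMMAR[symbol]
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    print(" ".join(derive("expr", rng)))
    # Pretend that individuals using `expr -> var` scored well this generation.
    reinforce("expr", ["var"])
    print(GRAMMAR["expr"])
```

Every generated string is a legal sentence of the grammar, so the grammar acts as a hard constraint on the search space while the weights act as a soft preference that can change as evolution proceeds.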
1. An ecological model was developed using genetic programming (GP) to predict the time-series dynamics of the diatom Stephanodiscus hantzschii for the lower Nakdong River, South Korea. Eight years of weekly data showed the river to be hypertrophic (chl. a, 45.1 ± 4.19 µg L⁻¹, mean ± SE, n = 427), and S. hantzschii annually formed blooms during the winter to spring flow period (late November to March).
2. A simple non-linear equation was created to produce a 3-day sequential forecast of the species biovolume, by means of time series optimization genetic programming (TSOGP). Training data were used in conjunction with a GP algorithm utilizing 7 years of limnological variables (1995-2001). The model was validated by comparing its output with measurements for a specific year with severe blooms (1994). The model accurately predicted timing of the blooms although it slightly underestimated biovolume (training r² = 0.70, test r² = 0.78). The model consisted of the following variables: dam discharge and storage, water temperature, Secchi transparency, dissolved oxygen (DO), pH, evaporation and silica concentration.
3. The application of a five-way cross-validation test suggested that GP was capable of developing models whose input variables were similar, although the data are randomly used for training. The similarity of input variable selection was approximately 51% between the best model and the top 20 candidate models out of 150 in total (based on both Root Mean Squared Error and the determination coefficients for the test data).
4. Genetic programming was able to determine the ecological importance of different environmental variables affecting the diatoms. A series of sensitivity analyses showed that water temperature was the most sensitive parameter. In addition, the optimal equation was sensitive to DO, Secchi transparency, dam discharge and silica concentration. The analyses thus identified likely causes of the proliferation of diatoms in 'river-reservoir hybrids' (i.e. rivers which have the characteristics of a reservoir during the dry season). This result provides specific information about the bloom of S. hantzschii in river systems, as well as the applicability of inductive methods, such as evolutionary computation, to river-reservoir hybrid systems.
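The two test-data measures mentioned in point 3, Root Mean Squared Error and the determination coefficient, can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, and r² is computed here as 1 - SS_res/SS_tot, which is one common definition; the study may instead have used the squared correlation between observed and predicted values.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rank_by_rmse(observed, candidates):
    """Sort hypothetical candidate models (name -> predictions) by test RMSE."""
    return sorted(candidates, key=lambda name: rmse(observed, candidates[name]))


if __name__ == "__main__":
    # Made-up numbers purely to exercise the functions; not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    candidates = {
        "model_a": [1.1, 1.9, 3.2, 3.8],
        "model_b": [2.5, 2.5, 2.5, 2.5],
    }
    for name in rank_by_rmse(observed, candidates):
        preds = candidates[name]
        print(name, round(rmse(observed, preds), 3), round(r_squared(observed, preds), 3))
```

A lower RMSE and a higher r² both indicate a closer fit, so ranking candidates on the two measures together, as the abstract describes, guards against a model that looks good on only one of them.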