Learning a Deterministic Finite Automaton (DFA) from a training set of labeled strings is a hard task that has been much studied within the machine learning community. It is equivalent to learning a regular language by example and has applications in language modeling. In this paper, we describe a novel evolutionary method for learning DFA that evolves only the transition matrix and uses a simple deterministic procedure to optimally assign state labels. We compare its performance with the Evidence Driven State Merging (EDSM) algorithm, one of the most powerful known DFA learning algorithms. We present results on random DFA induction problems of varying target size and training set density. We also studythe effects of noisy training data on the evolutionary approach and on EDSM. On noise-free data, we find that our evolutionary method outperforms EDSM on small sparse data sets. In the case of noisy training data, we find that our evolutionary method consistently outperforms EDSM, as well as other significant methods submitted to two recent competitions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.