Two classical non-deterministic automata recognize the language denoted by a regular expression: the position automaton, which is deduced from the position sets defined by Glushkov and McNaughton–Yamada, and the equation automaton, which can be computed via Mirkin's prebases or Antimirov's partial derivatives. Let |E| be the size of the expression and ‖E‖ its alphabetic width, i.e. the number of symbol occurrences. The number of states of the equation automaton is less than or equal to the number of states of the position automaton, which is equal to ‖E‖+1. On the other hand, the worst-case time complexity of Antimirov's algorithm is O(‖E‖³·|E|²), while it is only O(‖E‖·|E|) for the most efficient implementations yielding the position automaton (Brüggemann-Klein, Chang and Paige, Champarnaud et al.). We present an O(|E|²) space and time algorithm to compute the equation automaton. It is based on the notion of canonical derivative, which makes it possible to handle sets of word derivatives efficiently. Canonical derivatives also lead to a new O(|E|²) space and time algorithm to construct the position automaton.
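
To make the state-count comparison concrete, the following is a minimal sketch of the textbook partial-derivative construction of the equation automaton. It is the naive construction, not the O(|E|²) canonical-derivative algorithm presented here. The tuple encoding of expressions and all function names (`pderiv`, `equation_automaton`, `width`, and so on) are assumptions made for illustration only.

```python
# Expressions are nested tuples (hypothetical encoding):
# ("eps",), ("sym", a), ("alt", l, r), ("cat", l, r), ("star", e)
EPS = ("eps",)

def sym(a): return ("sym", a)
def alt(l, r): return ("alt", l, r)
def cat(l, r): return ("cat", l, r)
def star(e): return ("star", e)

def nullable(e):
    """True if the language of e contains the empty word."""
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])  # cat

def concat(e, f):
    """Concatenation with the simplification eps.f = f."""
    return f if e == EPS else cat(e, f)

def pderiv(e, a):
    """Set of partial derivatives of e with respect to the symbol a."""
    tag = e[0]
    if tag == "eps":
        return set()
    if tag == "sym":
        return {EPS} if e[1] == a else set()
    if tag == "alt":
        return pderiv(e[1], a) | pderiv(e[2], a)
    if tag == "cat":
        out = {concat(d, e[2]) for d in pderiv(e[1], a)}
        if nullable(e[1]):
            out |= pderiv(e[2], a)
        return out
    return {concat(d, e) for d in pderiv(e[1], a)}  # star

def width(e):
    """Alphabetic width: number of symbol occurrences."""
    if e[0] == "sym":
        return 1
    if e[0] == "eps":
        return 0
    return sum(width(child) for child in e[1:])

def equation_automaton(e, alphabet):
    """States are the partial derivatives reachable from e."""
    states, transitions, todo = {e}, set(), [e]
    while todo:
        q = todo.pop()
        for a in alphabet:
            for d in pderiv(q, a):
                transitions.add((q, a, d))
                if d not in states:
                    states.add(d)
                    todo.append(d)
    finals = {q for q in states if nullable(q)}
    return states, transitions, finals

def accepts(e, word):
    """Run the automaton on the fly over the given word."""
    current = {e}
    for a in word:
        current = {d for q in current for d in pderiv(q, a)}
    return any(nullable(q) for q in current)

# E = (a+b)* . a . (a+b), alphabetic width 5
E = cat(star(alt(sym("a"), sym("b"))), cat(sym("a"), alt(sym("a"), sym("b"))))
states, transitions, finals = equation_automaton(E, "ab")
print(len(states), width(E) + 1)  # 3 states versus 6 for the position automaton
```

For this expression the reachable partial derivatives are E itself, a+b and ε, so the equation automaton has 3 states while the position automaton has ‖E‖+1 = 6.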