Ordering the collection of states of a given automaton starting from an order of the underlying alphabet is a natural move towards a computational treatment of the language accepted by the automaton. Along this path, Wheeler graphs have recently been introduced as an extension/adaptation of the Burrows-Wheeler Transform (the now famous BWT, originally defined on strings) to graphs. These graphs constitute an important data structure for languages, since they allow a very efficient storage mechanism for the transition function of an automaton while providing fast support for all sorts of substring queries. This is possible as a consequence of a property of Wheeler graphs, the so-called path coherence, which consists in an ordering on nodes that "propagates" to (collections of) strings. By looking at a Wheeler graph as an automaton, the ordering on strings corresponds to the co-lexicographic order of the words entering each state. This leads naturally to considering the class of regular languages accepted by Wheeler automata, i.e. the Wheeler languages. It has been shown that, as opposed to the general case, the classic determinization by powerset construction is polynomial on Wheeler languages. As a consequence, most of the classical problems turn out to be "easy" (that is, solvable in polynomial time) on Wheeler languages. Moreover, deciding whether a DFA is Wheeler and deciding whether a DFA accepts a Wheeler language are both polynomial. Our contribution here is to put an upper bound on the easy problems. For instance, whenever we generalize by switching to general NFAs or by not fixing an order of the underlying alphabet, the above-mentioned problems become "hard" (that is, NP-complete or even PSPACE-complete).
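The ordering on nodes mentioned above can be made concrete with a small check. The sketch below assumes the usual definition of a Wheeler order from the Wheeler graph literature, which this abstract does not spell out: nodes without incoming edges come first, and for any two edges (u, v, a) and (u', v', a'), a < a' implies v < v', while a = a' together with u < u' implies v ≤ v'. The function name `is_wheeler_order` and the example graph `EXAMPLE_EDGES` are hypothetical. It only verifies a candidate order; it does not search for one.

```python
from itertools import permutations


def is_wheeler_order(order, edges, alphabet):
    """Check whether `order` is a Wheeler order for the labeled graph `edges`.

    order    -- list of nodes, smallest first (the candidate order)
    edges    -- iterable of (source, target, label) triples
    alphabet -- list of labels, smallest first
    """
    pos = {node: i for i, node in enumerate(order)}
    rank = {label: i for i, label in enumerate(alphabet)}
    edges = list(edges)

    # Nodes with no incoming edge must precede every node that has one.
    has_incoming = {target for _, target, _ in edges}
    seen_incoming = False
    for node in order:
        if node in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    # Compare every ordered pair of distinct edges against the two axioms.
    for (u1, v1, a1), (u2, v2, a2) in permutations(edges, 2):
        # Smaller label must lead to a strictly smaller target.
        if rank[a1] < rank[a2] and not pos[v1] < pos[v2]:
            return False
        # Same label: the order of the sources carries over to the targets.
        if a1 == a2 and pos[u1] < pos[u2] and not pos[v1] <= pos[v2]:
            return False
    return True


# Hypothetical three-state example over the alphabet a < b.
EXAMPLE_EDGES = [(0, 1, "a"), (0, 2, "b"), (1, 2, "b")]

print(is_wheeler_order([0, 1, 2], EXAMPLE_EDGES, ["a", "b"]))  # True
print(is_wheeler_order([0, 2, 1], EXAMPLE_EDGES, ["a", "b"]))  # False
```

The pairwise loop is quadratic in the number of edges, which keeps the sketch close to the definition rather than to an efficient implementation.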
Given an order of the underlying alphabet we can lift it to the states of a deterministic finite automaton: to compare states we use the order of the strings reaching them. When the order on strings is the co-lexicographic one and this order turns out to be total, the DFA is called Wheeler. This recently introduced class of automata, the Wheeler automata, constitutes an important data structure for languages, since it allows the design and implementation of a very efficient tool set of storage mechanisms for the transition function, supporting a large variety of substring queries. In this context it is natural to consider the class of regular languages accepted by Wheeler automata, i.e. the Wheeler languages. An inspiring result in this area is the following: it has been shown that, as opposed to the general case, the classic determinization by powerset construction is polynomial on Wheeler automata. As a consequence, most classical problems, when considered on this class of automata, turn out to be "easy" (that is, solvable in polynomial time). In this paper we consider computational problems related to Wheelerness, but starting from non-deterministic automata. We also consider the case of reduced non-deterministic automata, a class of NFAs where recognizing Wheelerness is still polynomial, as for DFAs. Our collection of results shows that moving towards non-determinism is, in most cases, a dangerous path leading quickly to intractability. Moreover, we start a study of "state complexity" related to Wheeler DFAs and languages, proving that the classic construction for the intersection of languages turns out to be computationally simpler on Wheeler DFAs than in the general case. We also provide a construction for the minimum Wheeler DFA recognizing a given Wheeler language.
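As a small illustration of the co-lexicographic order used to compare the strings reaching each state, the sketch below sorts words by reading them right to left. The helper name `colex_key` and the sample words are hypothetical and chosen only for this example; nothing here is taken from the paper's constructions.

```python
def colex_key(word, alphabet):
    """Sort key for co-lexicographic order: compare words from their last symbol.

    word     -- a string over `alphabet`
    alphabet -- list of symbols, smallest first
    """
    rank = {symbol: i for i, symbol in enumerate(alphabet)}
    # Reversing the word turns co-lexicographic order into ordinary
    # lexicographic order on the lists of symbol ranks.
    return [rank[symbol] for symbol in reversed(word)]


# Hypothetical sample over the alphabet a < b.
SAMPLE_WORDS = ["ab", "ba", "a", "b"]
COLEX_SORTED = sorted(SAMPLE_WORDS, key=lambda w: colex_key(w, ["a", "b"]))

print(COLEX_SORTED)  # ['a', 'ba', 'b', 'ab']
```

Words ending in a smaller symbol come first, so "ba" precedes "b" even though it is longer: the comparison starts at the last symbol and moves left.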