A sufficient condition to polynomially compute a minimum separating DFA

Parga, Manuel Vázquez de; Garcia, Pedro Castillo; López, Damián

doi:10.1016/j.ins.2016.07.053

Cited by 1 publication

(1 citation statement)

References 27 publications


“…We therefore turned to algorithms for inferring FSMs from sets of positive and negative examples. This is a well-studied NP-hard problem [3,1] (see, e.g., [4,11] for recent discussions) for which there are several practical heuristics. We implemented a version of Evidence-Driven State Merging (EDSM) [10], modified such that it tries to merge only states at the same distance from the initial state, so the resulting FSM should be easier to understand.…”

Section: Finite-state Machines (mentioning)

confidence: 99%

An Experiment in Learning the Language of Sequence Motifs: Sequence Logos vs. Finite-State Machines

Francisco, Gagie, Kempa, et al. 2017

Preprint


Abstract. Position weight matrices (PWMs) are the standard way to model binding site affinities in bioinformatics. However, they assume that symbol occurrences are position independent and, hence, they do not take into account symbol co-occurrence at different sequence positions. To address this problem, we propose to construct finite-state machines (FSMs) instead. A modified version of the Evidence-Driven State Merging (EDSM) heuristic is used to reduce the number of states, because FSMs otherwise grow too quickly, as a function of the number of sequences, to reveal any useful structure. We tested our approach on sequence data for the transcription factor HNF4 and found that the constructed FSMs provide small representations and an intuitive visualization. Furthermore, the FSM was better than PWMs at discriminating the positive and negative sequences in our data set.
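State-merging heuristics such as EDSM typically start from a prefix tree acceptor built from the positive and negative examples, then merge compatible states. The following is a minimal sketch of that starting structure only, not the citing authors' implementation. The names `build_pta`, `classify`, and `states_by_depth` are hypothetical, and no merging is performed. The depth grouping illustrates the restriction quoted above, where only states at the same distance from the initial state are merge candidates.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Transitions = Dict[Tuple[int, str], int]
Labels = Dict[int, Optional[bool]]


def build_pta(positive: List[str], negative: List[str]) -> Tuple[Transitions, Labels]:
    """Build a prefix tree acceptor; state 0 is the initial state.

    labels[state] is True (accepting), False (rejecting) or None (unknown).
    """
    transitions: Transitions = {}
    labels: Labels = {0: None}
    for words, label in ((positive, True), (negative, False)):
        for word in words:
            state = 0
            for symbol in word:
                key = (state, symbol)
                if key not in transitions:
                    new_state = len(labels)  # states are numbered in creation order
                    transitions[key] = new_state
                    labels[new_state] = None
                state = transitions[key]
            if labels[state] is not None and labels[state] != label:
                raise ValueError(f"{word!r} is labelled both positive and negative")
            labels[state] = label
    return transitions, labels


def classify(transitions: Transitions, labels: Labels, word: str) -> Optional[bool]:
    """Return the label reached by `word`, or None if it leaves the tree."""
    state: Optional[int] = 0
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return None
    return labels[state]


def states_by_depth(transitions: Transitions) -> Dict[int, List[int]]:
    """Group states by their distance from the initial state."""
    depth = {0: 0}
    # A parent is always created before its child, so visiting targets in
    # increasing order guarantees the parent's depth is already known.
    for (src, _), dst in sorted(transitions.items(), key=lambda kv: kv[1]):
        depth[dst] = depth[src] + 1
    groups: Dict[int, List[int]] = defaultdict(list)
    for state, d in sorted(depth.items()):
        groups[d].append(state)
    return dict(groups)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    t, l = build_pta(["ACG", "ACT"], ["AGG"])
    print(classify(t, l, "ACG"), classify(t, l, "AGG"), classify(t, l, "AC"))
    print(states_by_depth(t))
```

Unlike a PWM, which scores each position independently, the tree keeps track of which symbols occur together along a path. That is also why it grows quickly with the number of sequences and needs merging before it becomes readable.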

show abstract