patternMinor
Smallest DFA that accepts given strings and rejects other given strings
Viewed 0 times
smallestdfaacceptsthatrejectsandotherstringsgiven
Problem
Given two sets $A,B$ of strings over alphabet $\Sigma$, can we compute the smallest deterministic finite-state automaton (DFA) $M$ such that $A \subseteq L(M)$ and $L(M) \subseteq \Sigma^*\setminus B$?
In other words, $A$ represents a set of positive examples. Every string in $A$ needs to be accepted by the DFA. $B$ represents a set of negative examples. No string in $B$ should be accepted by the DFA.
Is there a way to solve this, perhaps using DFA minimization techniques? I could imagine creating a DFA-like automaton that has three kinds of states: accept states, reject states, and "don't-care" states (any input that ends in a "don't-care" state can be either accepted or rejected). But can we then find a way to minimize this to an ordinary DFA?
You could think of this as the problem of learning a DFA, given positive and negative examples.
This is inspired by Is regex golf NP-Complete?, which asks a similar questions for regexps instead of DFAs.
In other words, $A$ represents a set of positive examples. Every string in $A$ needs to be accepted by the DFA. $B$ represents a set of negative examples. No string in $B$ should be accepted by the DFA.
Is there a way to solve this, perhaps using DFA minimization techniques? I could imagine creating a DFA-like automaton that has three kinds of states: accept states, reject states, and "don't-care" states (any input that ends in a "don't-care" state can be either accepted or rejected). But can we then find a way to minimize this to an ordinary DFA?
You could think of this as the problem of learning a DFA, given positive and negative examples.
This is inspired by Is regex golf NP-Complete?, which asks a similar questions for regexps instead of DFAs.
Solution
There is a lot of literature on learning DFAs given positive and negative samples. If $A$ and $B$ are finite I don't see how the problem would ever be undecidable though. If $A \cap B = \emptyset$ then obviously the DFA that accepts only the strings in $A$ satisfies your requirement and one can simply enumerate all smaller DFAs. If $A \cap B \neq \emptyset$ then clearly no such DFA exists.
Finding the minimum DFA consistent with a given set of strings is NP-complete. This result appears as Theorem 1 in Angluin's paper On the complexity of minimum inference of regular sets. So clearly your problem is also NP-complete.
For lots of good links and discussion on learning regular languages check out the CSTheory blogpost On Learning Regular Languages.
Finding the minimum DFA consistent with a given set of strings is NP-complete. This result appears as Theorem 1 in Angluin's paper On the complexity of minimum inference of regular sets. So clearly your problem is also NP-complete.
For lots of good links and discussion on learning regular languages check out the CSTheory blogpost On Learning Regular Languages.
Context
StackExchange Computer Science Q#19687, answer score: 8
Revisions (0)
No revisions yet.