Brzozowski's algorithm for DFA minimization
Problem
Brzozowski's DFA minimization algorithm builds a minimal DFA for DFA $G$ by:
- reversing all the edges in $G$, making the initial state an accept state, and the accept states initial, to get an NFA $N'$ for the reverse language,
- using powerset construction to get $G'$ for the reverse language,
- reversing the edges (and initial-accept swap) in $G'$ to get an NFA $N''$ for the original language, and
- doing powerset construction to get $G_{\min}$.
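The four steps above can be sketched in a few lines; this is a minimal illustration assuming a dict-based encoding (DFA transitions as a `(state, symbol) -> state` map, subset-states as frozensets), not any particular library's API:

```python
def reverse_to_nfa(delta, start, accepts):
    """Steps 1/3: reverse every edge; old accept states become initial,
    the old initial state becomes the single accept state."""
    rdelta = {}
    for (q, a), r in delta.items():
        rdelta.setdefault((r, a), set()).add(q)
    return rdelta, set(accepts), {start}

def determinize(alphabet, ndelta, starts, accepts):
    """Steps 2/4: subset construction, exploring only reachable subsets."""
    start = frozenset(starts)
    dfa_delta, dfa_accepts = {}, set()
    seen, todo = {start}, [start]
    while todo:
        S = todo.pop()
        if S & accepts:          # a subset is accepting iff it meets the NFA accepts
            dfa_accepts.add(S)
        for a in alphabet:
            T = frozenset(q for s in S for q in ndelta.get((s, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return seen, dfa_delta, start, dfa_accepts

def brzozowski(alphabet, delta, start, accepts):
    """Minimal DFA via reverse, determinize, reverse, determinize."""
    nd, st, ac = reverse_to_nfa(delta, start, accepts)
    Q1, d1, s1, a1 = determinize(alphabet, nd, st, ac)
    nd2, st2, ac2 = reverse_to_nfa(d1, s1, a1)   # d1 is deterministic, so this works unchanged
    return determinize(alphabet, nd2, st2, ac2)
```

For example, a three-state DFA for "strings over $\{a,b\}$ ending in $a$" (where two accepting states are equivalent) comes out with the two states of the minimal DFA.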
Of course, since some DFAs have an exponentially large reverse DFA, this algorithm runs in exponential time in the worst case in terms of the size of the input, so let's keep track of the size of the reverse DFA.
If $N$ is the size of the input DFA, $n$ is the size of the minimal DFA, and $m$ the size of the minimal reverse DFA, then what is the running time of Brzozowski's algorithm in terms of $N$, $n$, and $m$?
In particular, under what relationship between $n$ and $m$ does Brzozowski's algorithm outperform Hopcroft's or Moore's algorithms?
I have heard that on typical examples in practice/application, Brzozowski's algorithm outperforms the others. Informally, what are these typical examples like?
Solution
Most of the below is from Parsing Theory by Sippu and Soisalon-Soininen.
Let $Q$ be the set of states of the DFA. Let $T$ be the input alphabet. Let $|M|=O(|T| \cdot |Q|)$ be the size of the machine. Exercise 3.40 gives an $O(|T|\cdot|Q|^2)$ algorithm for state minimization. As Wikipedia describes, Hopcroft's algorithm has a running time of $O(|T| \cdot |Q| \cdot \log|Q|)$ and Moore's algorithm has a running time of $O(|T| \cdot |Q|^2)$.
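For concreteness, Moore's refinement loop fits in a few lines; this is a sketch assuming a complete DFA given as a `(state, symbol) -> state` dict. Each round costs $O(|T| \cdot |Q|)$ and there are at most $|Q|$ rounds, which is where the quadratic bound comes from:

```python
def moore_classes(states, alphabet, delta, accepts):
    """Moore's minimization by partition refinement: start from the
    accepting/rejecting split, then refine until no class splits."""
    cls = {q: int(q in accepts) for q in states}
    n = len(set(cls.values()))
    while True:
        # signature: own class plus the classes of all one-symbol successors
        sig = {q: (cls[q],) + tuple(cls[delta[(q, a)]] for a in sorted(alphabet))
               for q in states}
        ids = {}                 # renumber distinct signatures with fresh class ids
        cls = {q: ids.setdefault(sig[q], len(ids)) for q in sorted(states, key=repr)}
        if len(ids) == n:        # no class was split: the partition is stable
            return cls           # equivalent states share a class id
        n = len(ids)
```

On the three-state DFA for "strings ending in $a$" with two equivalent accepting states, the two accepting states end up in one class, so the minimal DFA has two states.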
Theorem 3.30 states that the subset construction can be done in $O(2^{|Q|+\log|T|+\log|Q|})$ time, yielding an automaton of size $O(2^{|Q|+\log|T|})$ (more precisely, if the resulting automaton has $|Q'|$ states, the running time is $O((|Q'| + |T|\cdot|M|) \cdot |Q|)$). The two reversals and the second determinisation are therefore inconsequential to the running time, so the asymptotic running time of Brzozowski's algorithm is the same as that of the subset construction.
This means that in the worst case, Brzozowski's algorithm is exponentially slower than the other three algorithms. Note that the worst case really does occur: the classic NFA for the language $(a|b)^* a (a|b)^{k-1}$ (the $k$-th symbol from the end is an $a$) has $k+1$ states while its corresponding minimal DFA has $2^k$ states, and the reverse of this NFA is deterministic, so running Brzozowski's algorithm on this reversed NFA triggers the worst-case behavior.
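The blowup is easy to reproduce: determinizing the $(k+1)$-state NFA that nondeterministically guesses the $a$ occurring $k$ symbols before the end reaches exactly $2^k$ subset states. A self-contained sketch (the dict-of-sets NFA encoding is just for illustration):

```python
def count_subset_states(alphabet, ndelta, starts):
    """Run the subset construction and count the reachable subset-states."""
    start = frozenset(starts)
    seen, todo = {start}, [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(q for s in S for q in ndelta.get((s, a), ()))
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return len(seen)

k = 8
# (k+1)-state NFA: loop on state 0, nondeterministically guess the 'a'
# that is exactly k symbols from the end, then count up to state k
ndelta = {(0, 'a'): {0, 1}, (0, 'b'): {0}}
for i in range(1, k):
    ndelta[(i, 'a')] = {i + 1}
    ndelta[(i, 'b')] = {i + 1}
print(count_subset_states('ab', ndelta, {0}))  # prints 256 == 2**k
```

Each reachable subset always contains state 0, and contains state $i$ exactly when the symbol $i$ positions back was an $a$, so the reachable subsets are in bijection with the $2^k$ possible suffixes of length $k$.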
However, if the subset construction yields an automaton with $|Q'|=O(|Q|)$ states, then its running time is $O(|T|^2 \cdot |Q|^2)$, which is often the case on real-life inputs. Furthermore, if proper care is taken when computing the closure of a state, then this can be done much faster in most cases (that is, cases where the closure is small), saving a factor of $|T|$ in practice (for essentially the same reason that transitive closures can be computed quite quickly on real-world examples). Finally, if the input and intermediate automata are sparse, meaning that states have few transitions, then another factor of $|Q|$ is saved, giving an $O(|T| \cdot |Q|)$ running time on 'good' inputs.
Unfortunately, I'm not familiar enough with Hopcroft's or Moore's algorithms to analyze their running times in typical cases. Wikipedia mentions an $O(|Q| \log \log |Q|)$ running time in some cases, which would make the three algorithms comparable.
Context
StackExchange Computer Science Q#1872, answer score: 6