
Equivalence of NFAs, understanding the solution


Problem

This solution comes from https://people.eecs.berkeley.edu/~luca/cs172-07/solutions/practicefinal-sol.pdf


Consider the language


$$ EQ_{NFA} = \{ \langle N, N^\prime \rangle \mid N, N^\prime \text{ are NFAs with the same alphabet and } L(N) = L(N^\prime) \} $$

Show that $EQ_{NFA} \in \textbf{PSPACE}$.


(Hint: Can you convert this to an appropriate reachability problem?)


Solution Outline: Suppose $N$ and $N^\prime$ both have at most $n$ states. We can then convert them into DFAs $D_N$ and
$D_{N^\prime}$ with at most $m = 2^n$ states each, using space
polynomial in $n$. Finally, we can construct a DFA $S$, the
product of $D_N$ and $D_{N^\prime}$ (with at most $m^2 = 2^{2n}$ states),
which accepts $L(D_N)\,\Delta\, L(D_{N^\prime})$ (the strings that are in
exactly one of the languages). Now, $L(N) = L(N^\prime)$ iff $L(S) = \emptyset$, i.e., none of the final states are reachable from the start
state in $S$.


Since this is a reachability problem, it can be decided
nondeterministically using space logarithmic in the size of the graph
(because $PATH \in NL$). Thus, this problem can be decided in
$NSPACE(\log{(m^2)}) = NSPACE(n) \subseteq SPACE(n^2) \subseteq \textbf{PSPACE}$, using Savitch's theorem for the middle inclusion.
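For tiny NFAs, the outline can be made concrete with a direct, deliberately exponential-space implementation: determinize both NFAs, walk the product, and look for a reachable state that is accepting in exactly one component. This is an illustrative sketch, not the space-efficient algorithm the problem asks for; the function names and NFA encoding (a tuple of states, alphabet, transition dict, start state, and accepting set) are my own.

```python
def subset_construction(states, alphabet, delta, start, accepting):
    """Determinize an NFA; DFA states are frozensets of NFA states.
    `delta` maps (state, symbol) -> set of successor states."""
    d_start = frozenset([start])
    d_delta, seen, todo = {}, {d_start}, [d_start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(t for s in S for t in delta.get((s, a), ()))
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    d_accept = {S for S in seen if S & accepting}
    return d_delta, d_start, d_accept

def equivalent(nfa1, nfa2, alphabet):
    """L(N) = L(N') iff no reachable product state is accepting in
    exactly one of the two DFAs, i.e. the symmetric difference is empty."""
    d1, s1, A1 = subset_construction(*nfa1)
    d2, s2, A2 = subset_construction(*nfa2)
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        u, v = todo.pop()
        if (u in A1) != (v in A2):   # a string in exactly one language
            return False
        for a in alphabet:
            nxt = (d1[(u, a)], d2[(v, a)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True
```

For example, two different NFAs for "odd number of a's" compare equal, while changing the accepting set breaks equivalence. Note that `seen` can genuinely grow to $2^n \cdot 2^n$ entries here, which is exactly the issue the question below raises.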

I see that it works, and I see that the last paragraph tries to convince me that the problem is indeed in $\textbf{PSPACE}$. But I have a doubt:

After all, the solution constructs two exponentially large automata. How is that possible?

My intuition is this:
We can compute the DFA for an NFA using $O(\log n)$ memory on the working tape. Obviously, we have to use an exponential amount of memory, but only on the (write-only) output tape.

Now, once we have constructed it, we can use it in the reachability problem, which is in $NL$. We know that the composition of logspace functions is logspace. Is my understanding correct? And does the same (I mean, composition of functions) apply when it comes to $NL$?

Solution

Your confusion is understandable. The solution outline you've quoted misses an important point and suggests the opposite of that point.

The DFAs are, as you say, exponentially big. We can construct them, but not in polynomial space. However, the key point is that we don't need to construct and store the whole automaton. The reachability algorithm doesn't need that. All it needs is the ability to know what state it's in at the moment and construct the list of states it could move to next. Both of these can be done easily, on the fly, from the description of the NFAs. Likewise, we don't need to literally construct the product automaton: we can do that implicitly as well, by just tracking our position in each automaton separately.

Let's do a simplified example. Suppose that we just want to know if a particular NFA $N = (Q, \Sigma, \Delta, q_0, A)$ accepts at least one string, i.e., if an accepting state is reachable from the initial state. In this simplified scenario, we could just work directly with the NFA but we'll ignore that and convert to a DFA, $M$. If $N$ has $n$ states, then $M$ will have $2^n$ states so, if we were to write out its full transition graph, we'd need exponential space.

However, the reachability algorithm doesn't require us to write out the whole graph. As it runs, it says things like, "OK, I'm at vertex $v$. Where can I go next?" Normally, we imagine that it looks up that vertex in the graph's adjacency matrix to answer this question, but that's not the only way. It could calculate what vertices can be reached, using the rules of the subset construction that generates $M$ from $N$. Specifically, if we're at vertex $v$ then $v = \{s_1, \dots, s_\ell\}\subseteq Q$ and the vertices we can move to are the vertices of the form $\Delta(s_1,a)\cup\dots\cup\Delta(s_\ell,a)$ for each symbol $a\in\Sigma$. Each such vertex is a subset of $Q$, which we can write down in $n$ bits, and there are at most $|\Sigma|$ different possibilities, so you can write all of them down in $n|\Sigma|$ bits, which is polynomial in the size of $N$. Once the reachability algorithm has decided what to do next, it can forget this list of vertices and re-use the space. If it needs to access the list of $v$'s neighbours again, it can recalculate it.
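The "where can I go next?" step can be sketched directly (the encoding is my own assumption, as above: `delta` maps a (state, symbol) pair to a set of NFA states). The point is that the answer is computed from the NFA's transition relation alone, in space linear in $n$, without ever materializing the $2^n$-vertex DFA graph:

```python
def successors(subset, alphabet, delta):
    """Neighbours of the DFA vertex `subset` (a frozenset of NFA states),
    computed on the fly from the NFA transition relation `delta`.
    The output fits in n*|alphabet| bits and can be discarded and
    recomputed at will; the full DFA is never built."""
    return {a: frozenset(t for s in subset for t in delta.get((s, a), ()))
            for a in alphabet}
```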

The reachability algorithm never needs the whole adjacency matrix of the graph to be written down at the same time. Pretend, for a moment, that we did have the graph written down. Since the graph has $2^n$ vertices, we would need working space $O(\log 2^n) = O(n)$ to run the algorithm. Now, instead of writing the adjacency matrix down, we'll calculate it on the fly. That will require additional working space $O(n)$ for the calculations of what the adjacency matrix is. So the total space used is still $O(n)$, even though the graph can't be stored in that little space.
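To see why $O(n)$ working space suffices for the equivalence question itself, note that the nondeterministic machine only ever holds the current product vertex (two subsets of NFA states, $n$ bits each) plus a step counter. A hypothetical verifier for one guessed witness string might look like this (same illustrative NFA encoding as before; names are my own):

```python
def check_witness(word, nfa1, nfa2):
    """Check one nondeterministic guess: run `word` through both subset
    constructions on the fly, keeping only the two current subsets
    (O(n) bits each), never the exponential graph. Returns True iff
    `word` is accepted by exactly one NFA, witnessing L(N) != L(N')."""
    _, _, d1, s1, A1 = nfa1
    _, _, d2, s2, A2 = nfa2
    u, v = frozenset([s1]), frozenset([s2])
    for a in word:
        u = frozenset(t for s in u for t in d1.get((s, a), ()))
        v = frozenset(t for s in v for t in d2.get((s, a), ()))
    return bool(u & A1) != bool(v & A2)
```

The nondeterministic algorithm guesses `word` one symbol at a time, so it never stores the whole string either; only the pair of subsets and a counter bounded by $m^2$ live on the work tape.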

By the way, this trick of calculating things on the fly is a common trick in logarithmic space computations. Another prominent example is showing that logspace reductions compose (i.e., if there are logspace reductions $A\to B\to C$ then there's a logspace reduction $A\to C$). What you'd like to do is just do the first reduction, store the result on a tape and then apply the second reduction to that; but you can't, because the result of the first reduction might be too big to store. So, instead, you start with the second reduction. Every time you need to know some character of the output of the first reduction, you recompute it from scratch.
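The composition trick can be sketched with toy reductions (everything here is illustrative: `f` doubles each character, `g_stream` reverses its input, which a logspace machine can do with counters). The second reduction reads its input only through a callback that recomputes the needed character of $f(x)$ on demand, so $f(x)$ is never stored:

```python
def compose(f, g_stream, x):
    """Compute g(f(x)) without ever storing f(x): g_stream sees its
    input only through char_at, which recomputes characters on demand."""
    def char_at(i):
        # A real logspace simulation would rerun f's machine and count
        # output characters up to position i; recomputing f(x) wholesale
        # keeps the sketch simple. The result is discarded immediately.
        y = f(x)
        return y[i] if i < len(y) else None
    return g_stream(char_at)

def f(x):
    # Toy first reduction: double every character.
    return ''.join(c + c for c in x)

def g_stream(char_at):
    # Toy second reduction, written logspace-style: reverse the input
    # using only a length counter and an index, never an input copy.
    n = 0
    while char_at(n) is not None:
        n += 1
    return ''.join(char_at(i) for i in range(n - 1, -1, -1))
```

For example, `compose(f, g_stream, 'ab')` returns `'bbaa'`: time is traded for space, exactly as in the logspace-composition argument.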

Context

StackExchange Computer Science Q#80149, answer score: 8
