HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Counting the number of words accepted by an acyclic NFA

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
numberthecountingacyclicnfawordsaccepted

Problem

Let $M$ be an acyclic NFA.

Since $M$ is acyclic, $L(M)$ is finite.

Can we compute $|L(M)|$ in polynomial time?

If not, can we approximate it?

Note that the number of words is not the same as the number of accepting paths in $M$, which is easily computable.

Let me mention one obvious approach that doesn't work: convert the NFA to a DFA (which will also be acyclic), then count the number of accepting paths in the DFA. This doesn't result in a polynomial-time algorithm, since the conversion can cause an exponential blowup in the size of the DFA.

Solution

Here's one approach that I expect should give you a multiplicative-factor approximation, with polynomial time running time.

Let $L$ be a regular language that is a subset of $\{0,1\}^n$, e.g., $L=L(M) \cap \{0,1\}^n$. We will try to compute the approximate size of $L$.

At a high level, our approach to approximate $|L|$ will look something like this:

-
Pick a fraction $p$, where $0

-
Choose a regular language $R$ such that, roughly speaking, $R$ is a random subset of $\{0,1\}^n$ of size approximately $p2^n$ (i.e., $|R|\approx p2^n$).

-
Check whether $L \cap R$ is non-empty. Note that this check can be done in polynomial time.

Repeatedly perform steps 1-3 for various values of $p$. This gives you some information that will let you approximate $|L|$.

In particular, if $|L|=m$, then we'd expect

$$\Pr[L \cap R = \emptyset] = (1-p)^m \approx e^{-pm}.$$

So, if you happen to choose $p=1/m$ and repeat steps 1-3 a bunch of times, you should expect to see an empty intersection about 37% of the time. If you see an empty intersection significantly more often then that, then increase $p$ and try again. If you see an empty intersection significantly less often then that, you might decrease $p$ and try again.

In this way, using something like binary search, you should be able to approximate $|L|$ to within a multiplicative approximation factor.

You'll still need to pick some way to choose $R$ so that it is regular but also behaves like a random subset. There are many possibilities, but one might good way might be to choose a random 2-universal hash $h:\{0,1\}^m \to \{0,1,2,\dots,k-1\}$, pick $y \in \{0,1,\dots,k-1\}$ randomly, and let $R=\{x \in \{0,1\}^n : h(x)=y\}$. Choosing $k=\lceil 1/p \rceil$ gives you a random set $R$ of approximately the right size, and because $h$ is 2-universal, all of the mathematics above should work out properly.

This should solve your problem for the case where all strings in the NFA have the same length, say $n$. If they have varying lengths, then you can handle each possible length separately. Since $M$ is acyclic, the maximum lengths of any string in $L(M){}$ is at most the number of states in $M$, so this does not increase the runtime too much.

(This construction might remind you of the Valiant-Vazirani theorem about unambiguous SAT.)

Context

StackExchange Computer Science Q#28484, answer score: 3

Revisions (0)

No revisions yet.