snippetMinor

Efficiently convert an NFA with multiple $\varepsilon$ edges and accepting states into a regular expression

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

regular-expressions automata cs stackoverflow finite-automata regular-languages

expressionefficientlyedgesnfaconvertwithstatesintoregularaccepting

Problem

Given an NFA with alphabet $\Sigma = \{a, b, c\}$ defined in the diagram, is there a way to efficiently convert it into a regular expression?

The way I solved this problem is to first convert the NFA into a DFA using equivalent classes, and then proceed with the method described here. I feel in this case, that method is very inconvenient, as there are many loops and multiple accepting states. I wrote down all the partial regexps regarding to each accepting state alone, then unioned them together, then eliminated the redundant parts. My answer is $a(a^+ \cup b^a \cup c^)^+$.

I also tried to eliminate the $\varepsilon$ edges first then start from the "sanitized" NFA, but there were way too many edges in this case, and was very confusing during the process.

Edit: as @DavidRicherby has commented below, converting the NFA to DFA first is not necessary and makes the problem more complex.

Solution

State elimination method is a good technique, if used properly.

Solving $f_2$ can be done directly from the given automaton.

$$ f_2 = ( s \rightarrow q_1 \rightarrow f_2 + s \rightarrow q_2 \rightarrow f_2 + s \rightarrow f_1 \rightarrow f_2)^+ $$
( Overall Kleene + due to $(f_2,\epsilon) \rightarrow s$)
$$ f_2 = (a\cdot a^\cdot a + a\cdot b^\cdot a + (a\cdot c^*)^+\cdot c)^+ $$
This can be rearranged as
$$ f_2 = (a\cdot a^+ + a\cdot b^\cdot a + (a\cdot c^)\cdot (a\cdot c^)^\cdot c )^+$$
$$ = ( a \cdot ( a^+ + b^\cdot a + c^\cdot (a\cdot c^)^\cdot c) )^+ $$

Similarly, $f_1$ can be solved directly from the given automaton as follows
$$ f_1 = ( s \rightarrow q_1 \rightarrow f_2 \rightarrow s + s \rightarrow q_2 \rightarrow f_2 \rightarrow s)^* \cdot s \rightarrow f_1 \cdot (\epsilon + c \cdot \epsilon \cdot f_1) $$
$$ = ( a \cdot a^+ + a\cdot b^ \cdot a )^ \cdot (a\cdot c^*)^+ \cdot ( \epsilon + c \cdot \epsilon \cdot f_1) $$

$( \epsilon + c \cdot \epsilon \cdot f_1)$ represent the two possible transitions from $f_1$. If the input has been completed scanned, stop at $f_1$. If not, move to $f_2$ using $c$, then move to $s$ using $\epsilon$. Then start scanning for $f_1$ again.

This recursive relationship can be solved by taking
$$ r = ( a \cdot a^+ + a\cdot b^ \cdot a )^ \cdot (a\cdot c^*)^+$$
Then $f_1$ becomes,

$$ f_1 = r \cdot ( \epsilon + c \cdot f_1)$$
$$ = r \cdot ( \epsilon + c \cdot ( r \cdot ( \epsilon + c \cdot f_1) ) )$$
Unravelling this recursion gives us,
$$ f_1 = r + r \cdot c \cdot r + r \cdot c \cdot r \cdot c \cdot r + \space... $$
$$ = r \cdot ( c \cdot r )^*$$

Context

StackExchange Computer Science Q#97327, answer score: 4

Revisions (0)

No revisions yet.