patternMinor

Regular expression for strings that begin with 0 and contain an equal number of 01 and 10 substrings

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

regular-expressions automata cs stackoverflow finite-automata regular-languages

expressionsubstringsnumberwithcontainequalregularthatforand

Problem

I'm trying to write a regular expression for the language $L\subseteq\{0,1\}^*$ of strings that begin with $0$ and contain an equal number of occurrences of the substrings $01$ and $10$. Substrings can overlap, so $010$ is in the language but, e.g., $01101$ is not.

I've worked on a DFA where I can move on to create the RE. Here is my DFA:

I'm trying to use Arden's theorem to get the regular expression but I'm stuck. This is how I'm thinking:

\begin{align*}
a_1 &= 0a_2 + e\\
a_2 &= 0a_3 + 1a_5\\
a_3 &= 0a_3 + 1a_4\\
a_4 &= 0a_3 + 1a_4\\
a_5 &= 0a_6 + 1a_5\\
a_6 &= 0a_6 + 1a_5\,.
\end{align*}
I dont describe $a_7$, the "sink state" because it only results in the empty set.

My problem here is that I will never be able to get describe the language because of how $a_3$, $a_4$ ,$a_5$, $a_6$ are? Where did I go wrong?

After some work, I came to this solution, is it correct?

$$0 + 00(11^0 + 0)^ + 01(1 + 00^1)^00^*$$

Solution

Part of your confusion stems from the fact that there are two different formulations of Arden's Lemma, which I'll call LR and RL. To illustrate them, let's take a simple FA with three states, $P,Q,R$, where $P$ is the start state and $R$ is the only final state. The transitions are
$$\begin{array}{c|cc}
& 0 & 1\\ \hline
P & Q & \varnothing\\
Q & R & Q\\
R & R & R
\end{array}$$
It's clear with a moment's thought that the language accepted by this FA is denoted by the regular expression $01^0(0+1)^$, so let's see what the two versions of Arden's Lemma produce.

LR: In this version we generate the regular expression from left to right. If we have a state $X$ with a transition on $a$ to state $Y$ we will include the equation $X=aY$, along with any other terms that arise from transitions from $X$ to other states. Along with this, we add a "$+\epsilon$" to the expression for $X$ if $X$ happens to be a final state. Having made the collection of equations, we may apply the simplification rule that says that whenever we have $X=rX + s$ for regular expressions $r, s$ we can replace the equation by its solution $X=r^*s$. Note that here the goal is to product a regular expression for the start state.

In the example above, we'll have the equations
$$\begin{align}
P&=0Q\\
Q&=0R+1Q\\
R&=(0+1)R+\epsilon & \text{since R is final}
\end{align}$$
Substituting, we see that $R=(0+1)^\epsilon = (0+1)$ and $Q=1^(0R)=1^0(0+1)^$, and finally $P=0Q=01^0(0+1)^*$, as we expected.

RL: In this version we construct the regular expression from right to left. The difference is that if we have a state $X$ with transition on $a$ to state $Y$ we include the equation $Y=Xa$, along with any similar terms. In this case, we add the "$+\epsilon$" only to the start state's equation and the substitution in this case says that if we have $X=Xr+s$ we have the solution $X=sr^*$.

Returning to our example, we'll have
$$\begin{align}
P&=\epsilon & \text{since $P$ is the start state and has no incoming transitions}\\
Q&=P0+Q1\\
R&=Q0+R(0+1)
\end{align}$$
Now our goal is just opposite of what we had in the LR version: we want an expression for the final state, $R$. Working from the top down we have $Q=Q1+P0=Q1+0$ so $Q=01^$ and $R=Q0(0+1)^=01^0(0+1)^$: the same result generated in a different order.

Assuming you want to use the LR version, your equations should be
$$\begin{align}
a_1&=0a_2\\
a_2&=0a_3+1a_5+\epsilon\\
a_3&=0a_3+1a_4+\epsilon\\
a_4&=0a_3+1a_4\\
a_5&=0a_6+1a_5\\
a_6&=0a_6+1a_5+\epsilon
\end{align}$$
As for your problem with $a_5, a_6$, for example, we could do this:
$$\begin{align}
a_5 &= 1^(0a_6)=1^0a_6 & \text{Arden}\\
a_6 &= 0^*(1a_5+\epsilon) &\text{Arden again, now substitute $a_6$}\\
a_5 &= 1^0[0^(1a_5+\epsilon)]\\
&= (1^00^1)a_5+1^00^ &\text{so we have}\\
a_5 &= (1^00^1)^1^00^*
\end{align}$$

Context

StackExchange Computer Science Q#57342, answer score: 6

Revisions (0)

No revisions yet.