HiveBrain v1.2.0
Get Started
← Back to all entries
snippetModerate

How to determine minimum word length of regular language

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
lengthregularminimumlanguageworddeterminehow

Problem

Given a regular language $L$ and a regular expression $r$ with $L=L(r)$. Is it possible to determine the minimum length of words of $L(r)$ by the structure of $r$?

A straightforward example:

Let's say we have a regular expression $r=aac^aa$, then $L(R) = \{aaaa, aacaa, aaccaa, \dots, aac^naa\}$. To determine the minimal length I would erase everything that is postfixed with $$, leaving $r'=aaaa$. Now I would count the concatenations and add 1, which would yield in this example, not unsurprisingly, a minimum length of 4.

Is there a general approach to do this for more complex expressions?

Sidenote: I need to achieve this without the help of automata.

Solution

First, notice that you can easily eliminate $\emptyset$ for all regular expressions other than a regular expression describing the empty language. To do this, you use the following rewriting rules, which define an operator $E$ on regular expressions:

  • $E[\sigma] = \sigma$, $E[\epsilon] = \epsilon$, $E[\emptyset] = \emptyset$.



  • $E[r_1 r_2]$ is $\emptyset$ if one of $E[r_1],E[r_2]$ is $\emptyset$, and $E[r_1]E[r_2]$ otherwise.



  • $E[r_1 + r_2]$ is $E[r_1]$ if $E[r_2] = \emptyset$, $E[r_2]$ if $E[r_1] = \emptyset$, and $E[r_1] + E[r_2]$ otherwise.



  • $E[r^] = \epsilon$ if $E[r] = \emptyset$, and $E[r]^$ otherwise.



You can prove inductively that $E[r]$ either doesn't contain $\emptyset$, or is equal to $\emptyset$.

Applying these rewriting rules, we have either determined that the denotation of the regular expression is empty, or are given a regular expression without $\emptyset$. Now we define an operator $m$ which determines the length of the minimal word in an $\emptyset$-free regular expression:

  • $m(\sigma) = 1$, $m(\epsilon) = 0$.



  • $m(r_1r_2) = m(r_1) + m(r_2)$.



  • $m(r_1 + r_2) = \min(m(r_1),m(r_2))$.



  • $m(r^*) = 0$.



You can also implement both operators at once, by allowing $m$ to output $\infty$ (meaning that the language defined by the regular expression is empty):

  • $m(\sigma) = 1$, $m(\epsilon) = 0$, $m(\emptyset) = \infty$.



  • $m(r_1r_2) = m(r_1)+m(r_2)$, where $\infty + \ell = \ell + \infty = \infty$.



  • $m(r_1+r_2) = \min(m(r_1),m(r_2))$, where $\min(\infty,\ell) = \min(\ell,\infty) = \ell$.



  • $m(r^*) = 0$.

Context

StackExchange Computer Science Q#117091, answer score: 15

Revisions (0)

No revisions yet.