snippetModerate
How to determine minimum word length of regular language
Viewed 0 times
lengthregularminimumlanguageworddeterminehow
Problem
Given a regular language $L$ and a regular expression $r$ with $L=L(r)$. Is it possible to determine the minimum length of words of $L(r)$ by the structure of $r$?
A straightforward example:
Let's say we have a regular expression $r=aac^aa$, then $L(R) = \{aaaa, aacaa, aaccaa, \dots, aac^naa\}$. To determine the minimal length I would erase everything that is postfixed with $$, leaving $r'=aaaa$. Now I would count the concatenations and add 1, which would yield in this example, not unsurprisingly, a minimum length of 4.
Is there a general approach to do this for more complex expressions?
Sidenote: I need to achieve this without the help of automata.
A straightforward example:
Let's say we have a regular expression $r=aac^aa$, then $L(R) = \{aaaa, aacaa, aaccaa, \dots, aac^naa\}$. To determine the minimal length I would erase everything that is postfixed with $$, leaving $r'=aaaa$. Now I would count the concatenations and add 1, which would yield in this example, not unsurprisingly, a minimum length of 4.
Is there a general approach to do this for more complex expressions?
Sidenote: I need to achieve this without the help of automata.
Solution
First, notice that you can easily eliminate $\emptyset$ for all regular expressions other than a regular expression describing the empty language. To do this, you use the following rewriting rules, which define an operator $E$ on regular expressions:
You can prove inductively that $E[r]$ either doesn't contain $\emptyset$, or is equal to $\emptyset$.
Applying these rewriting rules, we have either determined that the denotation of the regular expression is empty, or are given a regular expression without $\emptyset$. Now we define an operator $m$ which determines the length of the minimal word in an $\emptyset$-free regular expression:
You can also implement both operators at once, by allowing $m$ to output $\infty$ (meaning that the language defined by the regular expression is empty):
- $E[\sigma] = \sigma$, $E[\epsilon] = \epsilon$, $E[\emptyset] = \emptyset$.
- $E[r_1 r_2]$ is $\emptyset$ if one of $E[r_1],E[r_2]$ is $\emptyset$, and $E[r_1]E[r_2]$ otherwise.
- $E[r_1 + r_2]$ is $E[r_1]$ if $E[r_2] = \emptyset$, $E[r_2]$ if $E[r_1] = \emptyset$, and $E[r_1] + E[r_2]$ otherwise.
- $E[r^] = \epsilon$ if $E[r] = \emptyset$, and $E[r]^$ otherwise.
You can prove inductively that $E[r]$ either doesn't contain $\emptyset$, or is equal to $\emptyset$.
Applying these rewriting rules, we have either determined that the denotation of the regular expression is empty, or are given a regular expression without $\emptyset$. Now we define an operator $m$ which determines the length of the minimal word in an $\emptyset$-free regular expression:
- $m(\sigma) = 1$, $m(\epsilon) = 0$.
- $m(r_1r_2) = m(r_1) + m(r_2)$.
- $m(r_1 + r_2) = \min(m(r_1),m(r_2))$.
- $m(r^*) = 0$.
You can also implement both operators at once, by allowing $m$ to output $\infty$ (meaning that the language defined by the regular expression is empty):
- $m(\sigma) = 1$, $m(\epsilon) = 0$, $m(\emptyset) = \infty$.
- $m(r_1r_2) = m(r_1)+m(r_2)$, where $\infty + \ell = \ell + \infty = \infty$.
- $m(r_1+r_2) = \min(m(r_1),m(r_2))$, where $\min(\infty,\ell) = \min(\ell,\infty) = \ell$.
- $m(r^*) = 0$.
Context
StackExchange Computer Science Q#117091, answer score: 15
Revisions (0)
No revisions yet.