patternMinor
$\{uuv\mid u\in\Sigma^+, v\in \Sigma^*\}$ and pumping lemma
Viewed 0 times
miduuvsigmapumpinglemmaand
Problem
As I am currently teaching regular languages and pumping lemma, I was searching for nice examples of languages, regular or not, for exercises.
Things became a bit complicated with $L_4 = \{uuv\mid u\in \Sigma^+, v\in \Sigma^*\}$: after struggling a bit to find a counter-example word to the pumping lemma, I proved $L_4$ to be non regular if $|\Sigma|\geqslant 2$, using the fact that $L_3 = L_4^R$ and that regular languages are closed under mirror image.
However, if $|\Sigma| = 2$, I found out that $L_4$ verifies the conclusion of the pumping lemma, with a pumping length of $5$! Indeed, since any word of length $\geqslant 4$ contains a square, any word $u\in L_4$ with $|u| \geqslant 5$ can be written $u = vwwu'$ with $|vww|\leqslant 5$ and $|v|,|w| > 0$. Therefore $u = xyz$, with $x = \varepsilon$, $y = v$ and $z = wwu'$ verifying:
I then tried to find a counter-example to the pumping lemma for $L_4$ when $|\Sigma|>2$, but without success. I found out that in order to not be able to apply the previous reasoning, in any counter-example, the first $n$ (pumping length) letters need to not contain a square, which is possible, but coul
- $L_1 = \{vv\mid v\in \Sigma^*\}$ is a classic example, as it can be proven to be non regular for $|\Sigma|\geqslant 2$ using the pumping lemma;
- $L_2 = \{ uvv\mid u,v\in\Sigma^\}$ looks like $L_1$, however it is a bit of a trap: since $L_2 = \Sigma^$, it is regular;
- $L_3 = \{uvv\mid u\in\Sigma^*, v\in\Sigma^+\}$ looks like $L_2$, however it is not regular if $|\Sigma| \geqslant 2$. This can be proven using the pumping lemma: suppose it is regular, and let $n$ be the pumping length. Let $u=a^nba^nb\in L_3$ and $u = xyz$ a decomposition such that $|xy|\leqslant n$ and $|y| > 0$. Then $xz = a^kba^nb$ with $k
Things became a bit complicated with $L_4 = \{uuv\mid u\in \Sigma^+, v\in \Sigma^*\}$: after struggling a bit to find a counter-example word to the pumping lemma, I proved $L_4$ to be non regular if $|\Sigma|\geqslant 2$, using the fact that $L_3 = L_4^R$ and that regular languages are closed under mirror image.
However, if $|\Sigma| = 2$, I found out that $L_4$ verifies the conclusion of the pumping lemma, with a pumping length of $5$! Indeed, since any word of length $\geqslant 4$ contains a square, any word $u\in L_4$ with $|u| \geqslant 5$ can be written $u = vwwu'$ with $|vww|\leqslant 5$ and $|v|,|w| > 0$. Therefore $u = xyz$, with $x = \varepsilon$, $y = v$ and $z = wwu'$ verifying:
- $|xy| = |v|
- $|y| = |v| > 0$;
- for all $k \geqslant 0$, $xy^kz\in L_4$. Indeed, $xz = z = wwu' \in L_4$ (because $|w| > 0$), $xyz = u \in L_4$ (hypothesis) and if $k>1$, $xy^kz = yyy^{k-2}z \in L_4$ (because $|y| > 0$).
I then tried to find a counter-example to the pumping lemma for $L_4$ when $|\Sigma|>2$, but without success. I found out that in order to not be able to apply the previous reasoning, in any counter-example, the first $n$ (pumping length) letters need to not contain a square, which is possible, but coul
Solution
When the alphabet is of size $4$ or more, it is easy to prove that $L_4$ is not regular using the pumping lemma. Given a pumping length $n$, let $s$ be a square-free word over $\{0,1,2\}$ of length $n$; such a word exists since there is an infinite square-free word over $\{0,1,2\}$. We choose $w = 3s33s3 \in L_4$. Suppose that we could write $v = xyz$ so that $|xy| \leq n$, $y \neq \epsilon$, and $xy^iz \in L_4$ for all $i \geq 0$. In particular, $xz \in L_4$. We consider two cases:
-
$x \neq \epsilon$. In this case, $xz = 3t 33s3$, where $t$ is a word over $\{0,1,2\}$ satisfying $0
-
$x = \epsilon$. In this case, $xz = t33s3$, where $t$ is a non-empty suffix of $s$. Since $xz$ contains only three $3$s, if $xz$ starts with a square $u^2$, then either $u$ contains no $3$ or it contains a single $3$. It $u$ contains no $3$ then $u^2$ is a prefix of $t$, which is impossible since $s$ is square-free. If $u$ contains a single $3$ then $u = t3$, which is impossible since $t$ doesn't start with a $3$ whereas the second occurrence of $u$ does. Hence this case is also impossible.
In the ternary case, it suffices to find a collection or arbitrarily long square-free ternary words $w$ such that for any decomposition $w = xyz$ with $y \neq \epsilon$, we have $xy^i z \notin L_4$ for some $i$.
I conjecture that prefixes of the Leech word satisfy this condition (even if we restrict $i$ to be either $0$ or $2$).
Suppose that arbitrarily long words exist satisfying the above constraints.
Given a pumping constant $n$, choose such a word $w_k$ of length at least $n+2$ which starts and ends with the same letter $\sigma$, and let $w = w_k \sigma w_k \sigma \in L_4$. (Note that since each word satisfying the constraints is square-free, every subword of length 5 contains all different letters; hence we can always truncate a long enough word to one which ends with the same letter with which it starts.)
Suppose that $w = xyz'$, where $y \neq \epsilon$ and $|xy| \leq n \leq |w_k| - 2$. Then we can write $z' = z\sigma w_k \sigma$, where $w_k = xyz$; note that $|z| \geq 2$ and $z$ ends with $\sigma$.
According to the conjecture above, we can find $i \neq 1$ such that $xy^i z$ does not have a square prefix. If $xy^i z'$ does have a square prefix $u^2$, then $u^2$ must reach the first $\sigma$; in particular, $|u| \geq 2$.
If the first copy of $u$ doesn't reach the first $\sigma$ then then $u$ doesn't contain $\sigma^2$, but then its second copy contains both $\sigma$ and either the last letter of $z$ or the first letter of $w_k$, and so does contain $\sigma^2$. Thus the first copy of $u$ does reach the first $\sigma$.
If the first copy of $u$ doesn't end at the first $\sigma$ then it contains $\sigma^3$. Since this is the unique copy of $\sigma^3$ in $w$, this is impossible. Hence $u = xy^iz\sigma$, and so $u$ is a prefix of $w_k \sigma$. In particular, $i = 0$, since otherwise $|u| > |w_k \sigma|$. But then $u$ is a proper prefix of $w_k \sigma$, implying that $w_k$ contains a copy of $\sigma^2$, which is false.
-
$x \neq \epsilon$. In this case, $xz = 3t 33s3$, where $t$ is a word over $\{0,1,2\}$ satisfying $0
-
$x = \epsilon$. In this case, $xz = t33s3$, where $t$ is a non-empty suffix of $s$. Since $xz$ contains only three $3$s, if $xz$ starts with a square $u^2$, then either $u$ contains no $3$ or it contains a single $3$. It $u$ contains no $3$ then $u^2$ is a prefix of $t$, which is impossible since $s$ is square-free. If $u$ contains a single $3$ then $u = t3$, which is impossible since $t$ doesn't start with a $3$ whereas the second occurrence of $u$ does. Hence this case is also impossible.
In the ternary case, it suffices to find a collection or arbitrarily long square-free ternary words $w$ such that for any decomposition $w = xyz$ with $y \neq \epsilon$, we have $xy^i z \notin L_4$ for some $i$.
I conjecture that prefixes of the Leech word satisfy this condition (even if we restrict $i$ to be either $0$ or $2$).
Suppose that arbitrarily long words exist satisfying the above constraints.
Given a pumping constant $n$, choose such a word $w_k$ of length at least $n+2$ which starts and ends with the same letter $\sigma$, and let $w = w_k \sigma w_k \sigma \in L_4$. (Note that since each word satisfying the constraints is square-free, every subword of length 5 contains all different letters; hence we can always truncate a long enough word to one which ends with the same letter with which it starts.)
Suppose that $w = xyz'$, where $y \neq \epsilon$ and $|xy| \leq n \leq |w_k| - 2$. Then we can write $z' = z\sigma w_k \sigma$, where $w_k = xyz$; note that $|z| \geq 2$ and $z$ ends with $\sigma$.
According to the conjecture above, we can find $i \neq 1$ such that $xy^i z$ does not have a square prefix. If $xy^i z'$ does have a square prefix $u^2$, then $u^2$ must reach the first $\sigma$; in particular, $|u| \geq 2$.
If the first copy of $u$ doesn't reach the first $\sigma$ then then $u$ doesn't contain $\sigma^2$, but then its second copy contains both $\sigma$ and either the last letter of $z$ or the first letter of $w_k$, and so does contain $\sigma^2$. Thus the first copy of $u$ does reach the first $\sigma$.
If the first copy of $u$ doesn't end at the first $\sigma$ then it contains $\sigma^3$. Since this is the unique copy of $\sigma^3$ in $w$, this is impossible. Hence $u = xy^iz\sigma$, and so $u$ is a prefix of $w_k \sigma$. In particular, $i = 0$, since otherwise $|u| > |w_k \sigma|$. But then $u$ is a proper prefix of $w_k \sigma$, implying that $w_k$ contains a copy of $\sigma^2$, which is false.
Context
StackExchange Computer Science Q#148221, answer score: 4
Revisions (0)
No revisions yet.