patternMinor

Proving that the scramble of a regular language is context-free

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

provingthefreeregularscramblelanguagethatcontext

Problem

For strings $w$ and $t$, if they have the same length and comprise the same characters (namely, they are two permutations of these characters), then $w\sim t$. For a string $w$, define an operator $\operatorname {SCRAMBLE}$ such that
$$\operatorname {SCRAMBLE}(w):=\{t\mid t\sim w\}, $$
and for a language $A$,
$$\operatorname {SCRAMBLE}(A):=\{t\mid t\sim w, \exists w\in A\}.$$
Prove that: given an alphabet $\Sigma:=\{0,1\}$, for any regular language $L$, $\operatorname {SCRAMBLE}(L)$ is a context-free language.

I have tried to use the pumping lemma of regular language to show that it satisfies the pumping lemma of context-free language after the $\operatorname {SCRAMBLE}$ operation, but I failed to find the pumping substring of $\operatorname {SCRAMBLE}(L)$.
I have also tried PDA construction but to no avail. Can anyone help me with it? Any answer or hint will be appreciated. Thanks in advance.

Solution

First, two remarks. When $L = (01)^$, we have $\newcommand{\scramble}{\mathrm{SCRAMBLE}}\scramble(L) \cap 0^1^* = \{ 0^n 1^n : n \geq 0 \}$, and this shows that we can't expect $\scramble(L)$ to be regular.

Second, when $L=(012)^$, we have $\scramble(L) \cap 0^1^2^ = \{ 0^n 1^n 2^n : n \geq 0 \}$. This shows that it's important that the alphabet have only two symbols.

Now for the proof. It is actually enough to assume that $L$ is context-free (and not necessarily regular). For a word $w$, let $p(w) = (\#_0(w),\#_1(w))$, and define $P(L) = \{ p(w) : w \in L \}$. Parikh's theorem states that $P(L)$ is a semilinear set, that is, a finite union of sets of the form
$$
x + \sum_{i=1}^m \mathbb{N} y_i,
$$
where $x,y_1,\ldots,y_m \in \mathbb{N}^2$. It is thus enough to show that for every linear set $S$, the language $L(S)$ of words $w$ such that $p(w)$ belongs to $S$ is context-free.

A pushdown automaton accepting $L(S)$ proceeds as follows. We will assume for simplicity that $x = (0,0)$; for general $x = (x_0,x_1)$, the PDA will ignore the first $x_0$ zeroes and $x_1$ ones, but will only accept if these actually appear.

When the PDA starts running, it guesses a mode $i \in \{1,\ldots,m\}$. When it reads $0$, it pushes it to the stack. Meanwhile, it keeps track of the number of $1$s it has read. Once it has read $(y_i)_1$ of them, it removes $(y_i)_0$ zeroes from the stack (pushing, if necessary, "negative" $0$s which will get cancelled later with actual $0$s). Then it guesses another mode $i \in \{1,\ldots,m\}$ (possibly the same one).

Upon reaching the end of input, the automaton verifies that it hasn't read any $1$s in its current phase, and that the stack is empty. This completes the construction. We leave it to the reader to prove that it actually works.

Context

StackExchange Computer Science Q#55507, answer score: 9

Revisions (0)

No revisions yet.