patternMinor
Show that some context free languages must contain more that one non-terminal
Viewed 0 times
showterminalmustfreelanguagescontainnonmoreonethat
Problem
Context free languages that has only one non-terminal is a proper subset of context free languages and they does not contains regular set. Since, CFL is more powerful than FSM and contains regular set, there must be some languages that requires more than one non-terminal. How can I show or proof this formally?
Solution
Consider the language $L=\{a^mb^{m+n}a^n\ |\ m,n\ge 1\}$. From the assumption that there is a single-nonterminal CFG $G$ for $L$, we can derive constraints which eventually lead to a contradiction.
$G$ is given by the (singleton) set $\{S\}$ of nonterminals, set $\{a,b\}$ of terminals, and some number of rules $S\to w_i$ for $w_i\in\{a,b,S\}^*$. There are some obvious conditions:
Note that some of $i,j,k$ might coincide, but this does not matter. Now suppose that there is some $l$ such that $w_l$ contains more than one $S$, i.e. $w_l=xSySz$ for some $x,y,z$. Then
$S\to xSySz\to xxSySzySz\to\dots\to xxw_jyw_izyw_jz$ which contains a $b\dots a\dots b$ subsequence, i.e. we get a word outside $L$ (once any additional $S$ which might occur inside $x,y,z$ are replaced with $w_k$).
So the rules in $G$ really can only be of the forms $S\to u_iSv_i$ or $S\to w_i$ with $u_i,v_i,w_i\in\{a,b\}^*$. Since we can only use one of the latter form when building a word, and we may need arbitrarily many $b$, we get
Suppose that there is also a $j$ such that $v_j$ contains $a$; then $S\to u_iSv_i\to u_ju_iSv_iv_j\to u_iu_ju_iSv_iv_jv_i$, which again contains a $b\dots a\dots b$ subsequence and takes us out of $L$. Therefore
But the words in $L$ all end with an $a$, so
But then $G$ is a regular grammar and cannot generate a non-regular language like $L$.
$G$ is given by the (singleton) set $\{S\}$ of nonterminals, set $\{a,b\}$ of terminals, and some number of rules $S\to w_i$ for $w_i\in\{a,b,S\}^*$. There are some obvious conditions:
- There must be some $i$ such that $w_i$ contains $a$.
- There must be some $j$ such that $w_j$ contains $b$.
- There must be some $k$ with $w_k\in\{a,b\}^*$.
Note that some of $i,j,k$ might coincide, but this does not matter. Now suppose that there is some $l$ such that $w_l$ contains more than one $S$, i.e. $w_l=xSySz$ for some $x,y,z$. Then
$S\to xSySz\to xxSySzySz\to\dots\to xxw_jyw_izyw_jz$ which contains a $b\dots a\dots b$ subsequence, i.e. we get a word outside $L$ (once any additional $S$ which might occur inside $x,y,z$ are replaced with $w_k$).
So the rules in $G$ really can only be of the forms $S\to u_iSv_i$ or $S\to w_i$ with $u_i,v_i,w_i\in\{a,b\}^*$. Since we can only use one of the latter form when building a word, and we may need arbitrarily many $b$, we get
- For some $i$, either $u_i$ or $v_i$ contains a $b$. We will assume wlog that it is $v_i$; the $u_i$ case is symmetric.
Suppose that there is also a $j$ such that $v_j$ contains $a$; then $S\to u_iSv_i\to u_ju_iSv_iv_j\to u_iu_ju_iSv_iv_jv_i$, which again contains a $b\dots a\dots b$ subsequence and takes us out of $L$. Therefore
- $v_i\in b^*$ for all $i$.
But the words in $L$ all end with an $a$, so
- $v_i=\epsilon$ for all $i$.
But then $G$ is a regular grammar and cannot generate a non-regular language like $L$.
Context
StackExchange Computer Science Q#51238, answer score: 4
Revisions (0)
No revisions yet.