HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Show that some context free languages must contain more that one non-terminal

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
showterminalmustfreelanguagescontainnonmoreonethat

Problem

Context free languages that has only one non-terminal is a proper subset of context free languages and they does not contains regular set. Since, CFL is more powerful than FSM and contains regular set, there must be some languages that requires more than one non-terminal. How can I show or proof this formally?

Solution

Consider the language $L=\{a^mb^{m+n}a^n\ |\ m,n\ge 1\}$. From the assumption that there is a single-nonterminal CFG $G$ for $L$, we can derive constraints which eventually lead to a contradiction.

$G$ is given by the (singleton) set $\{S\}$ of nonterminals, set $\{a,b\}$ of terminals, and some number of rules $S\to w_i$ for $w_i\in\{a,b,S\}^*$. There are some obvious conditions:

  • There must be some $i$ such that $w_i$ contains $a$.



  • There must be some $j$ such that $w_j$ contains $b$.



  • There must be some $k$ with $w_k\in\{a,b\}^*$.



Note that some of $i,j,k$ might coincide, but this does not matter. Now suppose that there is some $l$ such that $w_l$ contains more than one $S$, i.e. $w_l=xSySz$ for some $x,y,z$. Then
$S\to xSySz\to xxSySzySz\to\dots\to xxw_jyw_izyw_jz$ which contains a $b\dots a\dots b$ subsequence, i.e. we get a word outside $L$ (once any additional $S$ which might occur inside $x,y,z$ are replaced with $w_k$).

So the rules in $G$ really can only be of the forms $S\to u_iSv_i$ or $S\to w_i$ with $u_i,v_i,w_i\in\{a,b\}^*$. Since we can only use one of the latter form when building a word, and we may need arbitrarily many $b$, we get

  • For some $i$, either $u_i$ or $v_i$ contains a $b$. We will assume wlog that it is $v_i$; the $u_i$ case is symmetric.



Suppose that there is also a $j$ such that $v_j$ contains $a$; then $S\to u_iSv_i\to u_ju_iSv_iv_j\to u_iu_ju_iSv_iv_jv_i$, which again contains a $b\dots a\dots b$ subsequence and takes us out of $L$. Therefore

  • $v_i\in b^*$ for all $i$.



But the words in $L$ all end with an $a$, so

  • $v_i=\epsilon$ for all $i$.



But then $G$ is a regular grammar and cannot generate a non-regular language like $L$.

Context

StackExchange Computer Science Q#51238, answer score: 4

Revisions (0)

No revisions yet.