## The Pumping Lemma for Regular Languages

Pumping Lemma is a powerful tool to prove that a language is not regular. The proof technique used here is **Proof by Contradiction**. ie., initially, in the proof, the language is considered as a regular language. The following table shows how and where pumping lemma used exactly:

The finite automata is simply a (virtual) machine and has no additional memory. It does not keep track of how many input symbols read. It just checks for whether the input string empties when reaching its final states.

**Theorem:** Let L be a regular language. Then there exist a constant n (Whose value depends on L) such that for every string w belonging to the language L and |w|≥ n, w can be broken into three parts x,y, and z such that w=xyz, with following conditions –

1. |xy| ≤ n

2. y > 0 i.e. y *≠ ∈* and

3. for all i ≥ 0, the string xy^{i}z ∈ L

Illustration of pumping lemma as a GAME :

**Qus 1: Show that L = { a**^{n}b^{n} | n ≥ 1} is not regular.

^{n}b

^{n}| n ≥ 1} is not regular.

1) Let us consider L is regular. Thus it has a finite automata to accept it. Let us consider *k *be the number of states. Also, if the finite automata exists, then it has at least 1 state (i.e., k≥1).

2) Let us choose a string w = a^{k }b^{k}

3) The w is splitted into three parts in such a way that |xy| ≤ k and |y| > 0 i.e. y is not equal to *∈*

|string| ⇒ mod operator denotes the length of the string.

*xy *(prefix part) contains entirely some ‘a’s. (because it has a length lesser than or equal to k).

So, let us have some assumptions like this –

x = a^{t}, t<k

y = a^{p}, p < k and (t+p) ≤ k

z = a^{k}^{–}^{(}^{t+p}^{) }b^{k}

substituting these in w = xy^{i}z = (a^{t}) (a^{p})^{i }(a^{k-(t+}^{p}^{)}b^{k})

4) putting i = 0, we get w = a^{k-p }b^{k} actually this does not belongs to L. (there should be equal number of a’s and b’s). So, the L is not regular.

**Qus 2: Show that L = {a**^{n}|n is a prime number} is not Regular.

^{n}|n is a prime number} is not Regular.

Let L be a regular language and M be its corresponding Finite Automata with m number of states. Let w be a string which belongs to regular language L with |w|=n where n is a prime number such that n ≥ m. Hence, w can be decomposed as w=xyz such that |xy| ≤ n and |y| > 0.

Note: Since w ∈ L with |w|=n and w= xyz then we get |xyz| = n ( prime number). Means if the length of string which belongs to L is a prime no then it is a regular language^{ }

Since w = xyz ∈ L, then for L to be regular xy^{i}z should also belong to L for every value of i. So we can say that the length of xy^{i}z = |xy^{i}z| should be a prime number.

|xy^{i}z| = |xyz| + (i-1)*|y|

let |y| = k where k > 0 and i = n+1 then

|xy^{i}z| = |xyz| + (i-1)*|y| = n + ( n+1-1)*k = n+n*k = n*(1+k) => which is composite for i=p+1. Hence, xy^{i}z not belongs to L for all values of i. Therefore, L is not a regular language.

## The Pumping Lemma for Context-free Languages

Now, we will study the pumping lemma for context-free languages, which can be used to show that a given languages is not context-free. We will use the pumping lemma to prove that L is not context-free. Based on this result, we can also prove that the context-free languages are not closed under intersection, complementation or set-difference.

**Theorem:** For all context-free languages L, there is a positive integer n such that, for all string z ∈ L if |z| ≥ n, then z can be decomposed as z = uvwxy where u, v, w, x, y are strings such that,

(1) |vwx| ≤ n

(2) |vx|>0 and

(3) uv^{i}wx^{i}y ∈ L, for all i ≥ 0.

* Question:* Is language

**L = {o**≥ 1

^{n}1^{n}2^{n}|n**}**context-free? i.e., is there a grammar that generates

*L*

*?*

** Answer:** No. Intuitively, although it’s easy to keep the 0’s and 1’s matched, or to keep the 1’s and 2’s matched, or to keep the 0’s and 2’s matched, there is no way to keep all three symbols matched simultaneously.

### Qus 1: Show that L = {o^{n}1^{n}2^{n}|n ≥ 1 } is not Context-free.

let’s see how pumping lemma can be used to show that L = { 0^{n}1^{n}2^{n} | n ≥ 1 } is not context-free. Suppose that L is context-free. Thus there is a positive integer n with the property of the lemma. Let z = 0^{n}1^{n}2^{n}. Since z ∈ L and |z| = 3n ≥ n, we have u, v, w, x, y are strings such that z = uvwxy and

(1) |vwx| ≤ n

(2) |vx|>0 and

(3) uv^{i}wx^{i}y ∈ L, for all i ≥ 0.

Since 0^{n}1^{n}2^{n} = z = uvwxy, (1) tells us that vwx doesn’t contain both a 0 and a 2 because in that case |vwx|>n [ If vwx contain all the 3 symbols 0, 1, 2 then minimum string is 01^{n}2 for which |vwx|will become grater than n, but it should be ≤ n] . Thus, either vwx has no 0‘s, or vwx has no 2′s, so that there are two cases to be considered.

Case 1: Suppose vwx has no 0‘s. Then vx also has no 0‘s.

By (2), we have vx that contains atleast a 1 or a 2

Thus uwy has n 0‘s and either has less than n l’s or has less than n 2’s.

But (3) tells us that for i=0, uv^{i}wx^{i}y = uv^{0}wx^{0}y = uwy ∈ L. So if uwy ∈ L then it should contain equal number of 0‘s, l‘s and 2‘s, which is contradiction as stated above that uwy contain n 0‘s and either has less than n l’s or has less than n 2’s. hence L is not a CFL.

Case 2: This case where vwx has no 2‘s is similar to case1.

Since we obtained a contradiction in both cases, we have an overall contradiction. Thus L is not context-free.

**Qus 2: Show that L = {a**^{n}|n is a prime number} is not Context-free.

^{n}|n is a prime number} is not Context-free.

Suppose that L be a context-free language and M be its corresponding Finite Automata with m number of states. Let w be a string which belongs to context-free language L with |w|=n where n is a prime number such that n ≥ m. Hence, w can be decomposed as w=uvwxy such that |vwx| ≤ n and |vx| > 0.

Note: Since w ∈ L with |w|=n and w= uvwxy therefore we can get |uvwxy| = n ( prime number). Means we may say that if length of a^{n} is prime no then it is a regular language^{ }

Now Since w = uvwxy ∈ L, then for L to be context-free uv^{i}wx^{i}y should also belong to L for every value of i. Then we can say that the length of uv^{i}wx^{i}y = |uv^{i}wx^{i}y| should be a prime number.

|uv^{i}wx^{i}y| = |uvwxy| + (i-1)*|vx|

let |vx| = k where k > 0 and i = n+1 then

|uv^{i}wx^{i}y| = |uv^{i}wx^{i}y| + (i-1)*|vx| = n + ( n+1-1)*k = n+n*k = n*(1+k) => which is composite for i=p+1. Hence, uv^{i}wx^{i}y not belongs to L for all values of i. Therefore, L is not a context-free language.

TextBook of Theory of Computation: