Doubt in derivation of a proof in Information Theory
Problem
In my class we were trying to derive Shannon's source coding theorem, starting with a weaker, equivalent form.
The question is: consider a biased coin with probability of heads $p \geq \frac{1}{2}$. We want to transmit the outcomes of $n$ tosses using a code of minimum expected length.
To prove this, we show that for any $\delta > 0$ and large enough $n$ there exists a code whose average length is less than $(1+\delta)nH(p)$, and that the average length of any code is bounded below by $(1-\delta)nH(p)$.
We tried to prove it by first getting the inequalities
$$ \frac{2^{nH(p)}}{n+1}\leq{n \choose np} \leq 2^{nH(p)}$$
Both these inequalities are easy to derive. The problem is the lower bound. The plan is as follows: in an optimal code, an event of lower probability is assigned a codeword at least as long as that of an event of higher probability. We then bound the codeword length of one of the higher-probability events and conclude that, since every lower-probability event gets a codeword at least as long, the expected length is more than $(1-\delta)nH(p)$.
Hence, since $ \frac{2^{nH(p)}}{n+1}\leq{n \choose np} $, at least one of the events containing exactly $np$ heads must have codeword length greater than $\log_{2}\frac{2^{nH(p)}}{n+1}$, that is, $nH(p) - \log_{2}(n+1)$. To complete this part of the proof, however, I need the total probability of the events with at most $np$ heads to be bounded below by a constant. Is it true for a binomial distribution that $\Pr[\mathrm{Bin}(n,p) \leq np] \geq c$ for some constant $c$ independent of $n$ and $p$? Given this, the expected code length is greater than $c \times (nH(p) - \log_{2}(n+1))$, which comes out to at least $(1-\delta)nH(p)$ for large $n$.
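As a numerical sanity check (not part of the proof), the sandwich bound $\frac{2^{nH(p)}}{n+1}\leq{n \choose np}\leq 2^{nH(p)}$ can be verified directly for a concrete choice of $n$ and $p$; the values and helper name below are my own:

```python
import math

def H(p):
    """Binary entropy of p, in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Pick n and p so that np is an integer and C(n, np) is well defined.
n, p = 100, 0.6
k = int(n * p)

binom = math.comb(n, k)      # exact central-ish binomial coefficient
upper = 2 ** (n * H(p))      # entropy upper bound
lower = upper / (n + 1)      # entropy lower bound

print(lower <= binom <= upper)  # True
```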
Solution
For fixed $p$, the central limit theorem shows that
$$
\begin{align*}
\lim_{n\to\infty} \Pr[\mathrm{Bin}(n,p) \leq np] &= \lim_{n\to\infty} \Pr\left[\frac{\mathrm{Bin}(n,p) - np}{\sqrt{np(1-p)}} \leq 0\right] \\ &=
\Pr[N(0,1) \leq 0] \\ &= \frac{1}{2}.
\end{align*}
$$
In this particular case you can get quantitative bounds in several ways, for example using the Berry–Esseen theorem or large deviation theorems such as the Chernoff bound.
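To illustrate the limit concretely (the parameter values below are my own choice), one can compute $\Pr[\mathrm{Bin}(n,p) \leq np]$ exactly from the binomial pmf and watch it drift toward $\frac{1}{2}$ as $n$ grows:

```python
import math

def binom_cdf_at_mean(n, p):
    """Exact P[Bin(n, p) <= np], assuming np is an integer."""
    k = int(n * p)
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k + 1))

# For fixed p, the probability approaches 1/2 as n grows.
for n in (50, 200, 1000):
    print(n, binom_cdf_at_mean(n, 0.6))
```

For integer $np$ the binomial median lies at $np$, so these probabilities stay above $\frac{1}{2}$ for every $n$, which already gives the constant $c = \frac{1}{2}$ the question asks for.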
Context
StackExchange Computer Science Q#37126, answer score: 2