
Showing that the entropy of i.i.d. random variables is the sum of entropies

Submitted by: @import:stackexchange-cs

Problem

The Shannon entropy of a random variable $Y$ (with possible outcomes $\Sigma=\{\sigma_{1},...,\sigma_{k}\}$) is given by

$H(Y)=-\sum\limits_{i=1}^{k}P(Y=\sigma_{i})\;\log(P(Y=\sigma_{i}))$.
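This definition translates directly into code. A minimal sketch (the function name and the choice of base-2 logarithm are my own; any log base works, it only changes the unit):

```python
import math

def shannon_entropy(dist):
    """H(Y) = -sum_i P(Y=sigma_i) * log(P(Y=sigma_i)).

    `dist` maps each outcome sigma_i to its probability. Zero-probability
    outcomes are skipped, following the convention 0 * log 0 = 0.
    """
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# A fair coin carries exactly 1 bit of entropy (log base 2).
print(shannon_entropy({"heads": 0.5, "tails": 0.5}))  # 1.0
```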

For a second random variable $X=X_{1}X_{2}...X_{n}$, where all $X_{i}$'s are independent and identically distributed (each $X_{i}$ is a copy of the same random variable $Y$), the following equation is known to be true:


$H(X)=n\cdot H(Y)$

I want to prove this simple equation, where the outcomes from $Y$ are interpreted as symbols from an alphabet $\Sigma$ and therefore $X$ is the random variable for strings of length $n$ (based on the distribution of $Y$).

It is easy to see that
$P(X=w)=P(X=w_{1}...w_{n})=P(X_{1}=w_{1})\;\cdot\;...\;\cdot\; P(X_{n}=w_{n})=\prod\limits_{i=1}^{n}P(Y=w_{i})$
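This factorization can be checked numerically. A small sketch, using a hypothetical two-symbol distribution of my own choosing and exact `Fraction` arithmetic to avoid floating-point noise:

```python
from fractions import Fraction
from math import prod

# Hypothetical distribution for Y over Sigma = {a, b}.
p_Y = {"a": Fraction(1, 4), "b": Fraction(3, 4)}

def p_X(w):
    # P(X = w) = P(X_1 = w_1) * ... * P(X_n = w_n),
    # since each X_i is an independent copy of Y.
    return prod(p_Y[c] for c in w)

print(p_X("aab"))  # 1/4 * 1/4 * 3/4 = 3/64
```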

... but my approach to proving $H(X)=n\cdot H(Y)$ seems to have reached a dead end

(every word $w$ has the form $w=w_{1}...w_{n}$ with $w_{i}\in\Sigma$ and let $\large |w|_{\sigma_{i}}$ be the number of occurrences of $\sigma_{i}$ in $w$):

$H(X)=-\sum\limits_{w\in\Sigma^{n}}P(X=w)\;\log(P(X=w))$

$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)\;\log\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)$

$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)\left(\sum\limits_{i=1}^{n}\log\left(P(Y=w_{i})\right)\right)$

$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{k}P(Y=\sigma_{i})^{\large |w|_{\sigma_{i}}}\right)\left(\sum\limits_{i=1}^{k}\large |w|_{\sigma_{i}}\normalsize \;\log\left(P(Y=\sigma_{i})\right)\right)$

So, I am able to change some indices from word length to the length of the alphabet, which is used in $H(Y)$. But what now? Any help?

Solution

Your attempt at the proof does not exploit any natural structure on $\Sigma^n$. While it might still be made to work, it looks difficult.

A much easier proof can be given by induction over $n$. The base case $n=1$ is trivial, since then $X=X_1=Y$ and so $H(X)=H(Y)$.
For the inductive step, write $X=X_1\cdots X_n$ and $X'=X_1\cdots X_{n-1}$, so you have

$$H(X)=-\sum_{w\in\Sigma^{n-1}}\sum_{\sigma\in \Sigma}Pr(X=w\sigma)\log Pr(X=w\sigma)$$
$$=-\sum_{w\in\Sigma^{n-1}}\sum_{\sigma\in \Sigma}Pr(X'=w)Pr(Y=\sigma)\log\bigl(Pr(X'=w)Pr(Y=\sigma)\bigr)$$
$$=-\sum_{w\in\Sigma^{n-1}}Pr(X'=w)\sum_{\sigma\in \Sigma}Pr(Y=\sigma)\bigl(\log Pr(X'=w)+\log Pr(Y=\sigma)\bigr)$$
$$=-\sum_{w\in\Sigma^{n-1}}Pr(X'=w)\left(\sum_{\sigma\in \Sigma}Pr(Y=\sigma)\log Pr(X'=w)+\sum_{\sigma\in \Sigma}Pr(Y=\sigma)\log Pr(Y=\sigma)\right)$$
$$=-\sum_{w\in\Sigma^{n-1}}Pr(X'=w)\log Pr(X'=w)\sum_{\sigma\in \Sigma}Pr(Y=\sigma)-\sum_{w\in \Sigma^{n-1}}Pr(X'=w)\sum_{\sigma\in \Sigma}Pr(Y=\sigma)\log Pr(Y=\sigma)$$
$$=H(X')+H(Y)=(n-1)H(Y)+H(Y)=nH(Y),$$

where the last line uses $\sum_{\sigma\in\Sigma}Pr(Y=\sigma)=1$ and $\sum_{w\in\Sigma^{n-1}}Pr(X'=w)=1$, and the induction hypothesis $H(X')=(n-1)H(Y)$.
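The conclusion $H(X)=n\cdot H(Y)$ can also be verified by brute force: enumerate all of $\Sigma^n$, build the product distribution, and compare entropies. A sketch, with a hypothetical three-symbol distribution of my own choosing:

```python
import itertools
import math

# Hypothetical distribution for Y over Sigma = {a, b, c}; any distribution works.
p_Y = {"a": 0.5, "b": 0.25, "c": 0.25}
n = 3

def H(dist):
    # Shannon entropy in bits, skipping zero-probability outcomes.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Build the distribution of X over Sigma^n: P(X=w) = prod_i P(Y=w_i).
p_X = {}
for w in itertools.product(p_Y, repeat=n):
    p_X["".join(w)] = math.prod(p_Y[c] for c in w)

print(math.isclose(H(p_X), n * H(p_Y)))  # True
```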

Context

StackExchange Computer Science Q#21525, answer score: 4
