patternMinor
In information theory, why is the entropy measured in units of bits?
Viewed 0 times
unitswhythetheorybitsentropymeasuredinformation
Problem
In information theory, we have the quantity "information".
Suppose we have some discrete random variable $X$, that can take values $\{{a,b,c\}}$ with corresponding probability distribution $\{{\frac{1}{2},\frac{1}{4},\frac{1}{4}\}}$. Then, information, $h$, is measured as $h(x) = -\log_{2}P(x)$.
So now we can say: $$h(a) = -\log_2 \bigg(\frac{1}{2}\bigg) = 1 \text{ bit}$$ $$h(b) = -\log_2 \bigg(\frac{1}{4}\bigg) = 2 \text{ bits}$$.
My question is, why does that computation result in a quantity measured in bits?
Explanations I've seen all say something along the lines of: "this quantity is measured in units of bits because of the log base 2" with no further explanation as to why. So why does taking the log (base 2) of a probability result in a quantity measured in units of bits?
Is it somehow related to the idea that if a system can be in $N$ states, then we can encode those $N$ states using $log_2 N$ bits? E.g. if a system can be in 4 states, we can encode those 4 states in binary with $log_2 4 = 2 \text{ bits}$: $00, 01, 10, 11$.
Suppose we have some discrete random variable $X$, that can take values $\{{a,b,c\}}$ with corresponding probability distribution $\{{\frac{1}{2},\frac{1}{4},\frac{1}{4}\}}$. Then, information, $h$, is measured as $h(x) = -\log_{2}P(x)$.
So now we can say: $$h(a) = -\log_2 \bigg(\frac{1}{2}\bigg) = 1 \text{ bit}$$ $$h(b) = -\log_2 \bigg(\frac{1}{4}\bigg) = 2 \text{ bits}$$.
My question is, why does that computation result in a quantity measured in bits?
Explanations I've seen all say something along the lines of: "this quantity is measured in units of bits because of the log base 2" with no further explanation as to why. So why does taking the log (base 2) of a probability result in a quantity measured in units of bits?
Is it somehow related to the idea that if a system can be in $N$ states, then we can encode those $N$ states using $log_2 N$ bits? E.g. if a system can be in 4 states, we can encode those 4 states in binary with $log_2 4 = 2 \text{ bits}$: $00, 01, 10, 11$.
Solution
Here is a simple intuition. Suppose we consider a random value that takes the values 0 or 1 with equal probability. Then this value can be represented in a single bit. Moreover, its Shannon entropy is $1$. Therefore, it is natural to say that its entropy is 1 bit. More generally, consider a random value that is distributed uniformly on the set of $n$-bit strings. Then this can be represented in $n$ bits; and the Shannon entropy is $n$, so it is natural to say that its entropy is $n$ bits.
Yes, it's definitely related to the fact that if the system has $N$ different possible states, its state can be represented in $\lg N$ bits. If the state is uniformly distributed over these $N$ different possibilities, a little bit of calculation will show that its Shannon entropy is $\lg N$. So, it makes a lot of sense to call its entropy $\lg N$ bits.
Yes, it's definitely related to the fact that if the system has $N$ different possible states, its state can be represented in $\lg N$ bits. If the state is uniformly distributed over these $N$ different possibilities, a little bit of calculation will show that its Shannon entropy is $\lg N$. So, it makes a lot of sense to call its entropy $\lg N$ bits.
Context
StackExchange Computer Science Q#117217, answer score: 3
Revisions (0)
No revisions yet.