HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Depth of a leaf in a prefix tree (Huffman Coding)

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
depthhuffmanleafprefixcodingtree

Problem

So, regarding Huffman Coding we know that the optimal way of "storing" it is using a full binary tree (where all nodes have either 0 or 2 children). All leaves are the symbols $i$ of the alphabet we want to encode (all $i$ have a frequency of appearance$f_i$) and if we consider the freq. $f_i$ of each leaf (freq. of that character) then the freq. of a node is the sum of the frequencies of its children, therefore the root has $f_i = 100$ or $f_i = 1$.

Visual example (taken from Introduction to Algorithms) because I'm not really good at explaining this stuff:

Now I want to prove that the depth of a leaf (a character) in this prefix tree is $O(\log_{2}{(1/f_I))}$, where $f_i$ is the freq. of the character $i$.

I've tried by induction, where the base case would be for $\log_{2}{(1/1) = 0 }$ and indeed the depth of the root is 0. But I don't know how to proceed.

I've been thinking about how in a complete binary tree the depth of a leaf is $\log_{2}{n}$ with n being the number of nodes in that level (I'm not 100% sure about this) and how de depth of a character i in the prefix tree = length of the binary code (for example depth of d is 3 and its coding is 111) but I can't see how I could use those two things (if they are even useful).

What do you think about it? I don't know how to approach it :(

Btw thanks for your time and have a nice day!

Solution

Consider a symbol $\sigma$ whose frequency is $f$. During the course of Huffman's algorithm, $\sigma$ is matched with other nodes, and the depth of $\sigma$ is the number of times this happens.

At the first time that $\sigma$ is matched, it is either the symbol with smallest frequency or with second smallest frequency. In the former case, it is merged into a new symbol whose frequency is at least $2f$. In the latter case, after $\sigma$ is merged with the symbol of smallest frequency, all symbols have frequency at least $f$, and so the next time that it is merged, the merged symbol will have frequency at least $2f$.

Summarizing, it takes at most two mergings to double the frequency of a symbol. If its original frequency is $f$, then this shows that the depth of $f$ is at most $2\lceil \log_2(1/f) \rceil = O(\log(1/f))$.

Context

StackExchange Computer Science Q#82475, answer score: 3

Revisions (0)

No revisions yet.