HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Encode-symbol for Huffman tree

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
huffmansymbolforencodetree

Problem

From the text:


Exercise 2.68. The encode procedure
takes as arguments a message and a
tree and produces the list of bits
that gives the encoded message.

(define (encode message tree)
  (if (null? message)
      '()
      (append (encode-symbol (car message) tree)
              (encode (cdr message) tree))))



Encode-symbol is a procedure, which
you must write, that returns the list
of bits that encodes a given symbol
according to a given tree. You should
design encode-symbol so that it
signals an error if the symbol is not
in the tree at all. Test your
procedure by encoding the result you
obtained in exercise 2.67 with the
sample tree and seeing whether it is
the same as the original sample
message.

The textbook provides the following definitions:

```
(define (make-leaf symbol weight)
(list 'leaf symbol weight))
(define (leaf? object)
(eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))

(define (make-code-tree left right)
(list left
right
(append (symbols left) (symbols right))
(+ (weight left) (weight right))))

(define (left-branch tree) (car tree))

(define (right-branch tree) (cadr tree))
(define (symbols tree)
(if (leaf? tree)
(list (symbol-leaf tree))
(caddr tree)))

(define (weight tree)
(if (leaf? tree)
(weight-leaf tree)
(cadddr tree)))

(define (decode bits tree)
(define (decode-1 bits current-branch)
(if (null? bits)
'()
(let ((next-branch
(choose-branch (car bits) current-branch)))
(if (leaf? next-branch)
(cons (symbol-leaf next-branch)
(decode-1 (cdr bits) tree))
(decode-1 (cdr bits) next-branch)))))
(decode-1 bits tree))
(define (choose-branch bit branch)
(cond ((= bit 0) (left-branch branch))
((= bit 1) (right-branch branch))
(else (error "bad bit -- CHOOSE-BRANCH" bit))))
(define (adjoin-set x

Solution

Wouldn't it be easier to unpack the tree into a dictionary mapping each symbol to its corresponding bit string? Then you could simply look up each symbol in the input to generate the corresponding output bits.

EDIT:
As suggested by syb0rg, here is an implementation (C#, I'm afraid -- my Lisp is far too rusty -- although it's almost pure). The part pertaining to my suggestion above lives in the HuffmanCodes function at the end.

void Main()
{
    var corpus = @"Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.";
    var hcs = HuffmanCodes(corpus);
    Console.WriteLine(hcs);
}

Dictionary Histogram(string corpus) {
    var hg = new Dictionary();
    foreach (var x in corpus) {
        int f;
        hg.TryGetValue(x, out f);
        hg[x] = f + 1;
    }
    return hg;
}

class HuffTree {
    internal char? Sym; // Non-null iff this is a leaf node with no children.
    internal int Freq;
    internal HuffTree L;
    internal HuffTree R;
}

// Oh, for a priority queue.  This is *really* inefficient!
HuffTree HuffmanTree(string corpus) {
    var hg = Histogram(corpus);
    var hts = hg.Keys.Select(x => new HuffTree { Sym = x, Freq = hg[x] }).OrderBy(t => t.Freq).ToList();
    while (2  t.Freq).ToList();
    }
    return hts.First();
}

Dictionary HuffmanCodes(string corpus) {
    var codes = new Dictionary();
    Action a = null; // Sweet recursion.
    a = (ht, code) => {
        if (ht.Sym != null) {
            codes[(char)ht.Sym] = code;
        } else {
            a(ht.L, code + "0");
            a(ht.R, code + "1");
        }
    };
    a(HuffmanTree(corpus), "");
    return codes;
}

Code Snippets

void Main()
{
    var corpus = @"Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.";
    var hcs = HuffmanCodes(corpus);
    Console.WriteLine(hcs);
}

Dictionary<char, int> Histogram(string corpus) {
    var hg = new Dictionary<char, int>();
    foreach (var x in corpus) {
        int f;
        hg.TryGetValue(x, out f);
        hg[x] = f + 1;
    }
    return hg;
}

class HuffTree {
    internal char? Sym; // Non-null iff this is a leaf node with no children.
    internal int Freq;
    internal HuffTree L;
    internal HuffTree R;
}

// Oh, for a priority queue.  This is *really* inefficient!
HuffTree HuffmanTree(string corpus) {
    var hg = Histogram(corpus);
    var hts = hg.Keys.Select(x => new HuffTree { Sym = x, Freq = hg[x] }).OrderBy(t => t.Freq).ToList();
    while (2 <= hts.Count) {
        var leasts = hts.Take(2).ToList();
        var l = leasts[0];
        var r = leasts[1];
        var newHt = new HuffTree { Freq = l.Freq + r.Freq, L = l, R = r };
        hts = hts.Skip(2).Concat(new HuffTree[] { newHt }).OrderBy(t => t.Freq).ToList();
    }
    return hts.First();
}

Dictionary<char, string> HuffmanCodes(string corpus) {
    var codes = new Dictionary<char, string>();
    Action<HuffTree, string> a = null; // Sweet recursion.
    a = (ht, code) => {
        if (ht.Sym != null) {
            codes[(char)ht.Sym] = code;
        } else {
            a(ht.L, code + "0");
            a(ht.R, code + "1");
        }
    };
    a(HuffmanTree(corpus), "");
    return codes;
}

Context

StackExchange Code Review Q#2043, answer score: 4

Revisions (0)

No revisions yet.