Huffman
Description
The Huffman algorithm produces a binary code that can be used to
store or transmit text. Unlike ASCII, Huffman code assigns a variable
number of bits to each character it encodes.
To encode a message, you must have the frequency distribution for each
character in the alphabet. Given an alphabet and such a distribution, the
Huffman code is constructed as follows:
- Form a forest of trees, each tree contains a single letter and
the weight for that letter.
- Until a single tree exists:
- Merge the two trees with the lowest frequencies
- Place the tree with the lowest frequency on the left, the other
tree on the right.
- The new root node for the tree contains the sum of the frequencies
of the two subtrees.
The code for each character is constructed by tracing a path in the tree to from
the root of the tree to that letter.
If the letter is on the right side of the tree, a 0 is added to it's bit
pattern, otherwise a 1 is added.
You are to write a program that reads in a typical message for a given
alphabet. From this message you should construct a frequency distribution
and a Huffman code for that alphabet.
Input Specification
The file HUFF.IN consists of a series of printable ascii characters. Other than
end of file marker, there will be no non-printable characters.
Output Specification
Print a table (in ASCII order) of all of the characters in the alphabet, there
frequencies, and your constructed Huffman code. The character should be
centered, and the count and code left justified.
Sample Input
abcdabdababaaaa
Sample Output
ASCII Count Huffman Code
a 8 1
b 4 01
c 1 000
d 2 001
This problem was from the 1999 PACISE contest at Millersville University.
(Slightly modified problem statement)