
The Huffman Algorithm

We use the Huffman algorithm to encode a long message as a long bit string by
- assigning a bit string code to each symbol of the alphabet, and
- concatenating the individual codes of the symbols making up the message.

Example: The alphabet consists of the four symbols A, B, C, D.

Symbol  Code
A       010
B       100
C       000
D       111

The message ABACD would be encoded as 010100010000111 (15 bits). Such an encoding is inefficient: every symbol costs three bits, however often it occurs.
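
The two steps above (look up each symbol's code, then concatenate) can be sketched in a few lines of Python. The names `FIXED_CODE` and `encode` are illustrative and do not come from the slides.

```python
# Fixed-length code table from the example (3 bits per symbol).
FIXED_CODE = {"A": "010", "B": "100", "C": "000", "D": "111"}

def encode(message, code):
    """Concatenate the code word of each symbol in the message."""
    return "".join(code[symbol] for symbol in message)

encoded = encode("ABACD", FIXED_CODE)
print(encoded, len(encoded))  # 010100010000111 15
```

With a fixed-length code, the encoded length is always the number of symbols times the code length, here 5 x 3 = 15 bits.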

The Huffman Algorithm

If we examine any message, we will see that some letters appear more frequently than others. If the frequently occurring letters are assigned shorter bit strings, the length of the encoded message can be substantially reduced.

Symbol  Code
A       0
B       110
C       10
D       111

The message ABACD would be encoded as 0110010111 (10 bits, compared with 5 x 3 = 15 bits when every symbol has a three-bit code). Such an encoding is more efficient.
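
A variable-length code is only useful if the bit string can be decoded unambiguously. The slides do not state it, but the table above has the property that no code word is a prefix of another, so a decoder can read bits until they match a code word. The sketch below assumes that property; `VAR_CODE`, `encode`, and `decode` are illustrative names.

```python
# Variable-length code table from the example.
VAR_CODE = {"A": "0", "B": "110", "C": "10", "D": "111"}

def encode(message, code):
    """Concatenate the code word of each symbol in the message."""
    return "".join(code[symbol] for symbol in message)

def decode(bits, code):
    """Decode by accumulating bits until they match a code word.

    Assumes no code word is a prefix of another (a prefix code).
    """
    inverse = {word: symbol for symbol, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return "".join(symbols)

bits = encode("ABACD", VAR_CODE)
print(bits, len(bits))         # 0110010111 10
print(decode(bits, VAR_CODE))  # ABACD
```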

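Huffman Tree

The message is ABACCDA, so the symbol frequencies are A: 3, B: 1, C: 2, D: 1.

Choose the two symbols with the smallest frequency (B and D). Combine these two symbols into the single symbol BD of frequency 2.

The next two symbols with the smallest frequency are C and BD. Combine them into the single symbol CBD of frequency 4.

Finally, combine A and CBD into the root ACBD of frequency 7.

0 is assigned to each left branch and 1 to each right branch.

          ACBD, 7
         0/     \1
       A, 3    CBD, 4
              0/     \1
            C, 2    BD, 2
                   0/    \1
                 B, 1   D, 1

Reading the branch labels from the root down to each leaf gives the codes:

Symbol  Code
A       0
B       110
C       10
D       111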
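
The tree construction above can be sketched with a priority queue. This is a minimal illustration, not the only way to implement it: the function name `huffman_codes` and the tie-breaking rule (earlier-created nodes win when frequencies are equal) are assumptions, chosen here so that the result matches the tree on the slide. Other tie-breaking rules give different but equally short codes.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(message):
    """Build a code table from the symbol frequencies in `message`."""
    freq = Counter(message)
    if not freq:
        return {}
    tie = count()  # unique tie-breaker, so the heap never compares nodes
    # Heap entries are (frequency, tie-breaker, node); a leaf is a symbol string.
    heap = [(f, next(tie), sym) for sym, f in sorted(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Take the two nodes with the smallest frequency and combine them.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):       # internal node: (left, right)
            walk(node[0], prefix + "0")   # 0 for the left branch
            walk(node[1], prefix + "1")   # 1 for the right branch
        else:
            codes[node] = prefix or "0"   # lone-symbol alphabet gets "0"

    walk(root, "")
    return codes

codes = huffman_codes("ABACCDA")
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

For ABACCDA this yields A 0, B 110, C 10, D 111, so the message takes 3 x 1 + 1 x 3 + 2 x 2 + 1 x 3 = 13 bits.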

Huffman Algorithm

• Generally, codes are not constructed on the basis of the frequency of characters within a single message alone.
• Codes are constructed on the basis of their frequency within a whole set of messages.
• The same code set is then used for each message.
• For example, if messages consist of English words, the known relative frequency of occurrence of the letters of the alphabet in the English language might be used.
• The relative frequency of the letters in any single message is not necessarily the same as their frequency in the whole set.
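
The difference between per-message and pooled frequencies can be illustrated as follows. The sample `messages` list is hypothetical and far smaller than any realistic corpus; `relative_frequencies` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical sample messages; a real code set would be built from a large corpus.
messages = ["ABACCDA", "DDCA", "BBBA"]

def relative_frequencies(text):
    """Return each symbol's share of the total symbol count in `text`."""
    counts = Counter(text)
    total = sum(counts.values())
    return {symbol: n / total for symbol, n in sorted(counts.items())}

pooled = relative_frequencies("".join(messages))  # basis for the shared code set
single = relative_frequencies(messages[1])        # one message on its own

print(pooled)
print(single)
```

In the pooled counts A is the most frequent symbol (5 of 15), whereas in the single message DDCA the most frequent symbol is D and B does not occur at all. A code set built from the pooled frequencies is therefore a compromise: reusable for every message, but not necessarily the shortest possible for any one of them.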