Huffman Coding
- Slides: 32
Huffman Coding Slide 1
Huffman Coding
[Figure: Original file → Huffman coding → Compressed file]
• A technique to compress data effectively
  – Usually between 20% and 90% compression
• Lossless compression
  – No information is lost
  – When you decompress, you get the original file back
Slide 2
Huffman Coding: Applications
[Figure: Original file → Huffman coding → Compressed file]
• Saving space
  – Store compressed files instead of original files
• Transmitting files or data
  – Send compressed data to save transmission time and power
• Encryption and decryption
  – The compressed file cannot be read without the coding table (the “key”); this is only weak obfuscation, not secure encryption
Slide 3
Main Idea: Frequency-Based Encoding
• Assume only 6 characters appear in this file: E, A, C, T, K, N
• The frequencies are:
    Character   Frequency
    E           10,000
    A            4,000
    C              300
    T              200
    K              100
    N              100
• Option 1 (No compression)
  – Each character = 1 byte (8 bits)
  – Total file size = 14,700 * 8 = 117,600 bits
• Option 2 (Fixed-size compression)
  – We have 6 characters, so we need 3 bits to encode them
  – Total file size = 14,700 * 3 = 44,100 bits
    Character   Fixed encoding
    E           000
    A           001
    C           010
    T           100
    K           110
    N           111
Slide 4
Main Idea: Frequency-Based Encoding (Cont’d)
• Assume only 6 characters appear in this file: E, A, C, T, K, N
• The frequencies are:
    Character   Frequency
    E           10,000
    A            4,000
    C              300
    T              200
    K              100
    N              100
• Option 3 (Huffman compression)
  – Variable-length compression
  – Assign shorter codes to more frequent characters and longer codes to less frequent characters
  – Total file size: (10,000 x 1) + (4,000 x 2) + (300 x 3) + (200 x 4) + (100 x 5) + (100 x 5) = 20,700 bits
    Character   Huffman encoding
    E           0
    A           10
    C           110
    T           1110
    K           11110
    N           11111
Slide 5
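The three totals (Options 1 to 3) can be re-derived with a few lines of Python. This is only a sketch for checking the arithmetic: the frequencies and code tables are copied from the slides, while the names `freq`, `fixed`, `huffman`, and `total_bits` are illustrative and not part of the original deck.

```python
# Frequencies and code tables copied from the slides.
freq = {"E": 10_000, "A": 4_000, "C": 300, "T": 200, "K": 100, "N": 100}
fixed = {"E": "000", "A": "001", "C": "010", "T": "100", "K": "110", "N": "111"}
huffman = {"E": "0", "A": "10", "C": "110", "T": "1110", "K": "11110", "N": "11111"}


def total_bits(freq, code):
    """Sum of (occurrences x code length) over all characters."""
    return sum(count * len(code[ch]) for ch, count in freq.items())


uncompressed_bits = sum(freq.values()) * 8   # Option 1: 8 bits per character
fixed_bits = total_bits(freq, fixed)         # Option 2: 3 bits per character
huffman_bits = total_bits(freq, huffman)     # Option 3: variable length

print(uncompressed_bits, fixed_bits, huffman_bits)  # 117600 44100 20700
```

Note that K and N each contribute 100 x 5 bits, which is why the Huffman total comes to 20,700.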
Huffman Coding
• A variable-length coding for characters
  – More frequent characters get shorter codes
  – Less frequent characters get longer codes
• It is not like ASCII coding, where all characters have the same code length (8 bits)
• Two main questions
  – How to assign codes (encoding process)?
  – How to decode, i.e., generate the original file from the compressed file (decoding process)?
Slide 6
Decoding for fixed-length codes is much easier
    Character   Fixed-length encoding
    E           000
    A           001
    C           010
    T           100
    K           110
    N           111
    Encoded:          010001100110111000
    Divide into 3’s:  010 001 100 110 111 000
    Decode:           C   A   T   K   N   E
Slide 7
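A minimal sketch of fixed-length decoding, assuming the 3-bit table from the slide. The function name `decode_fixed` is hypothetical, and the bit string used here is simply the table's encoding of C A T K N E.

```python
# Fixed-length table from the slide, inverted so we can look up bit groups.
fixed = {"E": "000", "A": "001", "C": "010", "T": "100", "K": "110", "N": "111"}
decode_table = {bits: ch for ch, bits in fixed.items()}


def decode_fixed(bits, width=3):
    """Cut the bit string into groups of `width` bits and look each one up."""
    chunks = [bits[i:i + width] for i in range(0, len(bits), width)]
    return "".join(decode_table[chunk] for chunk in chunks)


print(decode_fixed("010001100110111000"))  # CATKNE
```

Because every code has the same width, the decoder never has to guess where one character ends and the next begins.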
Decoding for variable-length codes is not that easy…
    Character   Variable-length encoding
    E           0
    A           00
    C           001
    …           …
    Encoded: 000001
    What does it mean? EEEC, EAC, or AEC?
• Huffman encoding is guaranteed to avoid this ambiguity: there is always a single decoding
Slide 8
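The ambiguity can be made concrete by enumerating every way to split a bit string into codewords. The sketch below is a brute-force illustration, not part of Huffman's algorithm; `all_decodings` is a hypothetical helper, and the prefix-free table is the Huffman table from the frequency example earlier in the deck.

```python
def all_decodings(bits, code):
    """Return every way to split `bits` into codewords of `code`."""
    if not bits:
        return [""]
    results = []
    for ch, word in code.items():
        if bits.startswith(word):
            rest = all_decodings(bits[len(word):], code)
            results += [ch + tail for tail in rest]
    return results


# The ambiguous table from this slide: "0" is a prefix of "00" and "001".
ambiguous = {"E": "0", "A": "00", "C": "001"}
print(sorted(all_decodings("000001", ambiguous)))  # ['AEC', 'EAC', 'EEEC']

# The Huffman table from the earlier example: no code is a prefix of another.
huffman = {"E": "0", "A": "10", "C": "110", "T": "1110", "K": "11110", "N": "11111"}
print(all_decodings("010110", huffman))  # ['EAC']
```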
Huffman Algorithm
• Step 1: Get Frequencies
  – Scan the file to be compressed and count the occurrences of each character
  – Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
  – Build a Huffman-code tree (binary tree)
  – Traverse the tree to assign codes
• Step 3: Encode (Compress)
  – Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
  – The Huffman tree is the key to decompressing the file
Slide 9
Step 1: Get Frequencies
Input file: Eerie eyes seen near lake.
Slide 10
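Counting the frequencies for this input can be sketched with the standard library's `collections.Counter`. The variable names are illustrative; note that uppercase "E" and lowercase "e" are different characters, and that the spaces and the final period are counted too.

```python
from collections import Counter

text = "Eerie eyes seen near lake."
freq = Counter(text)  # maps each character to its number of occurrences

# Print characters from least to most frequent (ties ordered by character).
for ch, n in sorted(freq.items(), key=lambda item: (item[1], item[0])):
    print(repr(ch), n)

print(len(freq), "distinct characters,", sum(freq.values()), "characters in total")
```

For this sentence the count gives 12 distinct characters and 26 characters in total, with "e" the most frequent (8 occurrences).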
Step 2: Build Huffman Tree & Assign Codes
• It is a binary tree in which each character is a leaf node
  – Initially each node is a separate root
• At each step
  – Select the two roots with the smallest frequencies and connect them to a new parent (break ties arbitrarily) [the greedy choice]
  – The parent gets the sum of the frequencies of its two child nodes
• Repeat until only one root remains
Slide 11
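The greedy loop above can be sketched with a priority queue from Python's `heapq` module. This is one possible implementation under stated assumptions, not the deck's own code: nodes are plain tuples, `build_tree` is a hypothetical name, and ties are broken by insertion order, which is just one of the arbitrary choices the slide allows.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(freq):
    """Return the root of a Huffman tree.

    A leaf is (weight, char); an internal node is (weight, left, right).
    """
    tie = count()  # unique tie-breaker so the heap never compares nodes
    heap = [(w, next(tie), (w, ch)) for ch, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Greedy choice: take the two roots with the smallest weights...
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # ...and connect them to a new parent holding the sum.
        heapq.heappush(heap, (w1 + w2, next(tie), (w1 + w2, a, b)))
    return heap[0][2]


root = build_tree(Counter("Eerie eyes seen near lake."))
print(root[0])  # weight at the root: 26
```

With n distinct characters the loop performs n - 1 merges, each costing O(log n) heap work.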
Example
Each character starts as a leaf node with its frequency
Slide 12
Find the two smallest frequencies… Replace them with their parent (using a priority queue)
Slide 13
Repeat: find the two smallest frequencies and replace them with their parent
Slides 14-23
Now we have a single root… This is the Huffman tree
Slide 24
Let’s Analyze the Huffman Tree
• All characters are at the leaf nodes
• The number at the root = number of characters in the file
• High-frequency characters (e.g., “e”) are near the root
• Low-frequency characters are far from the root
Slide 25
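These properties can be checked on the example sentence. The sketch below rebuilds a tree with a heap-based builder (an assumed implementation with tuple nodes; `build_tree` and `depths` are hypothetical names) and then measures how far each character sits from the root. The exact depths of the rarer characters depend on how ties are broken, so only the tie-independent facts are printed.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(freq):
    """Leaf = (weight, char); internal node = (weight, left, right)."""
    tie = count()  # unique tie-breaker so heap entries never compare nodes
    heap = [(w, next(tie), (w, ch)) for ch, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (w1 + w2, a, b)))
    return heap[0][2]


def depths(node, d=0):
    """Map each character to its distance from the root."""
    if len(node) == 2:  # leaf: (weight, char)
        return {node[1]: d}
    _, left, right = node
    return {**depths(left, d + 1), **depths(right, d + 1)}


text = "Eerie eyes seen near lake."
root = build_tree(Counter(text))
depth = depths(root)

print(root[0] == len(text))                 # root weight = file length
print(depth["e"] == min(depth.values()))    # most frequent char is nearest
print(depth["e"] < max(depth.values()))     # rarer chars sit deeper
```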
Let’s Assign Codes
• Traverse the tree
  – Label every left edge 0
  – Label every right edge 1
• The code for each character is its root-to-leaf label sequence
Slide 26
Let’s Assign Codes (Cont’d)
[Figure: the Huffman tree with each left edge labeled 0 and each right edge labeled 1]
Slide 27
Let’s Assign Codes (Cont’d)
[Figure: the resulting prefix-free coding table]
Slide 28
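The traversal can be sketched as a short recursive function. To keep the example small and checkable, it uses a hand-built tree for the six-character frequency example from the start of the deck (an assumption: a leaf is a one-character string and an internal node is a `(left, right)` pair). The name `assign_codes` is illustrative.

```python
# Hand-built tree for the E/A/C/T/K/N example: E hangs off the root's left
# edge, A one level lower, and so on; K and N share the deepest parent.
tree = ("E", ("A", ("C", ("T", ("K", "N")))))


def assign_codes(node, prefix=""):
    """Label left edges 0 and right edges 1; a leaf's code is its path."""
    if isinstance(node, str):            # leaf: record the path so far
        return {node: prefix}
    left, right = node
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


codes = assign_codes(tree)
for ch, code in codes.items():
    print(ch, code)
```

This reproduces the Huffman table shown for that example (E 0, A 10, C 110, T 1110, K 11110, N 11111). Because characters only appear at leaves, no code can be a prefix of another.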
Huffman Algorithm
• Step 1: Get Frequencies
  – Scan the file to be compressed and count the occurrences of each character
  – Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
  – Build a Huffman-code tree (binary tree)
  – Traverse the tree to assign codes
• Step 3: Encode (Compress)
  – Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
  – The Huffman tree is the key to decompressing the file
Slide 29
Step 3: Encode (Compress) the File
• Input file: Eerie eyes seen near lake.
• Input file + coding table → generate the encoded file: 0000 10 1100 000110 ….
• Notice that no code is a prefix of any other code
  – This ensures the decoding is unique (unlike the example on Slide 8)
Slide 30
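A compact end-to-end sketch of the encoding step for this input follows. It is an assumed implementation, not the deck's: `huffman_codes` builds the code table directly while merging (each heap entry carries the partial codes of its subtree), and `encode` is a plain table lookup. The individual codes it produces may differ from the coding table on the slide, because ties can be broken differently, but the total encoded length is the same for every valid Huffman tree.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a code table by repeatedly merging the two lightest subtrees."""
    # Heap entries: (weight, tie_breaker, {char: code_so_far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Going down the left edge prepends 0, the right edge prepends 1.
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Replace each character by its code."""
    return "".join(codes[ch] for ch in text)


text = "Eerie eyes seen near lake."
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits instead of", 8 * len(text))  # 84 bits instead of 208
```

The sketch assumes at least two distinct characters; a file with a single distinct character would need a special case, since its only code would be empty.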
Step 4: Decode (Decompress)
• Must have the encoded file + the coding tree
• Scan the encoded file
  – For each 0, move left in the tree
  – For each 1, move right
  – On reaching a leaf node, emit that character and go back to the root
• Encoded file (0000 10 1100 000110 ….) + coding tree → generate the original file (Eerie …)
Slide 31
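The tree walk can be sketched as follows. As an assumption for illustration, the tree is the hand-built one for the six-character E/A/C/T/K/N example (a leaf is a one-character string, an internal node is a `(left, right)` pair), and `decode` is a hypothetical name.

```python
# Tree matching the codes E 0, A 10, C 110, T 1110, K 11110, N 11111.
tree = ("E", ("A", ("C", ("T", ("K", "N")))))


def decode(bits, root):
    """Walk the tree bit by bit: 0 = go left, 1 = go right."""
    out = []
    node = root
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):   # reached a leaf
            out.append(node)        # emit its character
            node = root             # and go back to the root
    return "".join(out)


print(decode("110101110", tree))  # CAT
```

No separators are needed between codes: reaching a leaf is what marks the end of each character.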
Huffman Algorithm
• Step 1: Get Frequencies
  – Scan the file to be compressed and count the occurrences of each character
  – Sort the characters based on their frequency
• Step 2: Build Tree & Assign Codes
  – Build a Huffman-code tree (binary tree)
  – Traverse the tree to assign codes
• Step 3: Encode (Compress)
  – Scan the file again and replace each character by its code
• Step 4: Decode (Decompress)
  – The Huffman tree is the key to decompressing the file
Slide 32