LempelZiv Compression Techniques Classification of Lossless Compression techniques
- Slides: 27
Lempel-Ziv Compression Techniques • Classification of Lossless Compression techniques • Introduction to Lempel-Ziv Encoding: LZ 77 & LZ 78 • LZ 78 – Encoding Algorithm – Decoding Algorithm • LZW – Encoding Algorithm – Decoding Algorithm 1
Classification of Lossless Compression Techniques Recall what we studied before: • Lossless Compression techniques are classified into static, adaptive (or dynamic), and hybrid. • Static coding requires two passes: one pass to compute probabilities (or frequencies) and determine the mapping, and a second pass to encode. • Examples of Static techniques: Static Huffman Coding • All of the adaptive methods are one-pass methods; only one scan of the message is required. • Examples of adaptive techniques: LZ 77, LZ 78, LZW, and Adaptive Huffman Coding 2
Introduction to Lempel-Ziv Encoding • Data compression up until the late 1970's mainly directed towards creating better methodologies for Huffman coding. • An innovative, radically different method was introduced in 1977 by Abraham Lempel and Jacob Ziv. • This technique (called Lempel-Ziv) actually consists of two considerably different algorithms, LZ 77 and LZ 78. • Due to patents, LZ 77 and LZ 78 led to many variants: LZ 77 Variants LZR LZSS LZB LZH LZ 78 Variants LZW LZC LZT LZMW LZJ LZFG • The zip and unzip use the LZH technique while UNIX's compress methods belong to the LZW and LZC classes. 3
LZ 78 Encoding Algorithm LZ 78 inserts one- or multi-character, non-overlapping, distinct patterns of the message to be encoded in a Dictionary. The multi-character patterns are of the form: C 0 C 1. . . Cn-1 Cn. The prefix of a pattern consists of all the pattern characters except the last: C 0 C 1. . . Cn-1 LZ 78 Output: Note: The dictionary is usually implemented as a hash table. 4
LZ 78 Encoding Algorithm (cont’d) Dictionary empty ; Prefix empty ; Dictionary. Index 1; while(character. Stream is not empty) { Char next character in character. Stream; if(Prefix + Char exists in the Dictionary) Prefix + Char ; else { if(Prefix is empty) Code. Word. For. Prefix 0 ; else Code. Word. For. Prefix Dictionary. Index for Prefix ; Output: (Code. Word. For. Prefix, Char) ; insert. In. Dictionary( ( Dictionary. Index , Prefix + Char) ); Dictionary. Index++ ; Prefix empty ; } } if(Prefix is not empty) { Code. Word. For. Prefix Dictionary. Index for Prefix; Output: (Code. Word. For. Prefix , ) ; } 5
Example 1: LZ 78 Encoding Encode (i. e. , compress) the string ABBCBCABABCAAB using the LZ 78 algorithm. The compressed message is: (0, A)(0, B)(2, C)(3, A)(2, A)(4, A)(6, B) Note: The above is just a representation, the commas and parentheses are not transmitted; we will discuss the actual form of the compressed message later on in slide 12. 6
Example 1: LZ 78 Encoding (cont’d) 1. A is not in the Dictionary; insert it 2. B is not in the Dictionary; insert it 3. B is in the Dictionary. BC is not in the Dictionary; insert it. 4. B is in the Dictionary. BCA is not in the Dictionary; insert it. 5. B is in the Dictionary. BA is not in the Dictionary; insert it. 6. B is in the Dictionary. BCAA is not in the Dictionary; insert it. 7. B is in the Dictionary. BCAAB is not in the Dictionary; insert it. 7
Example 2: LZ 78 Encoding Encode (i. e. , compress) the string BABAABRRRA using the LZ 78 algorithm. The compressed message is: (0, B)(0, A)(1, A)(2, B)(0, R)(5, R)(2, ) 8
Example 2: LZ 78 Encoding (cont’d) 1. B is not in the Dictionary; insert it 2. A is not in the Dictionary; insert it 3. B is in the Dictionary. BA is not in the Dictionary; insert it. 4. A is in the Dictionary. AB is not in the Dictionary; insert it. 5. R is not in the Dictionary; insert it. 6. R is in the Dictionary. RR is not in the Dictionary; insert it. 7. A is in the Dictionary and it is the last input character; output a pair containing its index: (2, ) 9
Example 3: LZ 78 Encoding Encode (i. e. , compress) the string AAAAA using the LZ 78 algorithm. 1. A is not in the Dictionary; insert it 2. A is in the Dictionary AA is not in the Dictionary; insert it 3. A is in the Dictionary. AAA is not in the Dictionary; insert it. 4. A is in the Dictionary. AAA is in the Dictionary and it is the last pattern; output a pair containing its index: (3, ) 10
LZ 78 Encoding: Number of bits transmitted • Example: Uncompressed String: ABBCBCABABCAAB Number of bits = Total number of characters * 8 = 18 * 8 = 144 bits • Suppose the codewords are indexed starting from 1: Compressed string( codewords): (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B) Codeword index 1 2 3 4 5 6 7 • Each code word consists of an integer and a character: • The character is represented by 8 bits. • The number of bits n required to represent the integer part of the codeword with index i is given by: • Alternatively number of bits required to represent the integer part of the codeword with index i is the number of significant bits required to represent the integer i – 1 11
LZ 78 Encoding: Number of bits transmitted (cont’d) Codeword index Bits: (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B) 1 2 3 4 5 6 7 (1 + 8) + (2 + 8) + (3 + 8) = 71 bits The actual compressed message is: 0 A 0 B 10 C 11 A 010 A 100 A 110 B where each character is replaced by its binary 8 -bit ASCII code. 12
LZ 78 Decoding Algorithm Dictionary empty ; Dictionary. Index 1 ; while(there are more (Code. Word, Char) pairs in codestream){ Code. Word next Code. Word in codestream ; Char character corresponding to Code. Word ; if(Code. Word = = 0) String empty ; else String string at index Code. Word in Dictionary ; Output: String + Char ; insert. In. Dictionary( (Dictionary. Index , String + Char) ) ; Dictionary. Index++; } Summary: Ø Ø input: (CW, character) pairs output: if(CW == 0) output: current. Character else output: string. At. Index CW + current. Character Ø Insert: current output in dictionary 13
Example 1: LZ 78 Decoding Decode (i. e. , decompress) the sequence (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B) The decompressed message is: ABBCBCABABCAAB 14
Example 2: LZ 78 Decoding Decode (i. e. , decompress) the sequence (0, B) (0, A) (1, A) (2, B) (0, R) (5, R) (2, ) The decompressed message is: BABAABRRRA 15
Example 3: LZ 78 Decoding Decode (i. e. , decompress) the sequence (0, A) (1, A) (2, A) (3, ) The decompressed message is: AAAAA 16
LZW Encoding Algorithm • If the message to be encoded consists of only one character, LZW outputs the code for this character; otherwise it inserts two- or multi-character, overlapping*, distinct patterns of the message to be encoded in a Dictionary. *The last character of a pattern is the first character of the next pattern. • The patterns are of the form: C 0 C 1. . . Cn-1 Cn. The prefix of a pattern consists of all the pattern characters except the last: C 0 C 1. . . Cn-1 LZW output if the message consists of more than one character: Ø If the pattern is not the last one; output: The code for its prefix. Ø If the pattern is the last one: • if the last pattern exists in the Dictionary; output: The code for the pattern. • If the last pattern does not exist in the Dictionary; output: code(last. Prefix) then output: code(last. Character) Note: LZW outputs codewords that are 12 -bits each. Since there are 212 = 4096 codeword possibilities, the minimum size of the Dictionary is 4096; however since the Dictionary is usually implemented as a hash table its size is larger than 4096. 17
LZW Encoding Algorithm (cont’d) Initialize Dictionary with 256 single character strings and their corresponding ASCII codes; Prefix first input character; Code. Word 256; while(not end of character stream){ Char next input character; if(Prefix + Char exists in the Dictionary) Prefix + Char; else{ Output: the code for Prefix; insert. In. Dictionary( (Code. Word , Prefix + Char) ) ; Code. Word++; Prefix Char; } } Output: the code for Prefix; 18
Example 1: Compression using LZW Encode the string BABAABAAA by the LZW encoding algorithm. 1. BA is not in the Dictionary; insert BA, output the code for its prefix: code(B) 2. AB is not in the Dictionary; insert AB, output the code for its prefix: code(A) 3. BA is in the Dictionary. BAA is not in Dictionary; insert BAA, output the code for its prefix: code(BA) 4. AB is in the Dictionary. ABA is not in the Dictionary; insert ABA, output the code for its prefix: code(AB) 5. AA is not in the Dictionary; insert AA, output the code for its prefix: code(A) 6. AA is in the Dictionary and it is the last pattern; output its code: code(AA) • The compressed message is: <66><65><256><257><65><260> • Note: Each of the characters < and > is not sent; each code word is sent using 12 bits 19
Example 2: Compression using LZW Encode the string BABAABRRRA by the LZW encoding algorithm. 1. BA is not in the Dictionary; insert BA, output the code for its prefix: code(B) 2. AB is not in the Dictionary; insert AB, output the code for its prefix: code(A) 3. BA is in the Dictionary. BAA is not in Dictionary; insert BAA, output the code for its prefix: code(BA) 4. AB is in the Dictionary. ABR is not in the Dictionary; insert ABR, output the code for its prefix: code(AB) 5. RR is not in the Dictionary; insert RR, output the code for its prefix: code(R) 6. RR is in the Dictionary. RRA is not in the Dictionary and it is the last pattern; insert RRA, output code for its prefix: code(RR), then output code for last character: code(A) The compressed message is: <66><65><256><257><82><260> <65> 20
LZW: Number of bits transmitted Example: Uncompressed String: aaabbbbbbaabaaba Number of bits = Total number of characters * 8 = 16 * 8 = 128 bits Compressed string (codewords): <97><256><98><259><257><261> Number of bits = Total Number of codewords * 12 = 7 * 12 = 84 bits Note: Each codeword is 12 bits because the minimum Dictionary size is taken as 4096, and 212 = 4096 21
LZW Decoding Algorithm The LZW decompressor creates the same string table during decompression. Initialize Dictionary with 256 ASCII codes and corresponding single character strings as their translations; Previous. Code. Word first input code; Output: string(Previous. Code. Word) ; Char character(first input code); Code. Word 256; while(not end of code stream){ Current. Code. Word next input code ; if(Current. Code. Word exists in the Dictionary) String string(Current. Code. Word) ; else String string(Previous. Code. Word) + Char ; Output: String; Char first character of String ; insert. In. Dictionary( (Code. Word , string(Previous. Code. Word) + Char ) ); Previous. Code. Word Current. Code. Word ; Code. Word++ ; } 22
LZW Decoding Algorithm (cont’d) Summary of LZW decoding algorithm: output: string(first Code. Word); code. Word = 256; while(there are more Code. Words){ if(Current. Code. Word is in the Dictionary) output: string(Current. Code. Word); else output: Previous. Output + Previous. Output first character; insert at Dictionary[code. Word++]: Previous. Output + Current. Output first character; } 23
Example 1: LZW Decompression Use LZW to decompress the output sequence <66> <65> <256> <257> <65> <260> 1. 2. 3. 4. 5. 6. 66 is in Dictionary; output string(66) i. e. B 65 is in Dictionary; output string(65) i. e. A, insert BA 256 is in Dictionary; output string(256) i. e. BA, insert AB 257 is in Dictionary; output string(257) i. e. AB, insert BAA 65 is in Dictionary; output string(65) i. e. A, insert ABA 260 is not in Dictionary; output previous output + previous output first character: AA, insert AA 24
Example 2: LZW Decompression Decode the sequence <67> <70> <256> <258> <259> <257> by LZW decode algorithm. 1. 2. 3. 4. 5. 6. 67 is in Dictionary; output string(67) i. e. C 70 is in Dictionary; output string(70) i. e. F, insert CF 256 is in Dictionary; output string(256) i. e. CF, insert FC 258 is not in Dictionary; output previous output + C i. e. CFC, insert CFC 259 is not in Dictionary; output previous output + C i. e. CFCC, insert CFCC 257 is in Dictionary; output string(257) i. e. FC, insert CFCCF 25
LZW: Limitations • What happens when the dictionary gets too large? • One approach is to clear entries 256 -4095 and start building the dictionary again. • The same approach must also be used by the decoder. 26
Exercises 1. Use LZ 78 to trace encoding the string SATATASACITASA. 2. Write a Java program that encodes a given string using LZ 78. 3. Write a Java program that decodes a given set of encoded codewords using LZ 78. 4. Use LZW to trace encoding the string ABRACADABRA. 5. Write a Java program that encodes a given string using LZW. 6. Write a Java program that decodes a given set of encoded codewords using LZW. 27
- Examples of lossy and lossless compression
- Lossless compression algorithms in multimedia
- 472
- Image compression model in digital image processing
- Lossy and lossless compression
- Lossless compression
- Lossless compression in multimedia
- Lossless transmission line problems
- Sparameters
- Matched line transmission
- Lossy medium example
- Lossless join decomposition adalah
- Lossless join
- S-parameters basics
- Lossless
- Lossless decomposition in dbms ppt
- Winters shunt
- 5 schedule compression techniques
- Audio compression techniques
- Video compression techniques
- Les fonctions techniques et les solutions techniques
- Classification of obturation techniques
- Classification alternative techniques in data mining
- Classification of material characterization techniques
- Ratio scale questions examples
- Eager learning algorithm example
- Manifold classification example
- Traditional classification vs modern classification