Data Compression Lecture 3 Coding Recap Coding rate
- Slides: 19
Data Compression Lecture 3
Coding Recap • Coding rate is the average number of bits used to represent a symbol from a source. • For a given probability model, the entropy is the lowest rate at which the source can be coded. • It has been shown that Huffman coding will generate whose rate is within p_max + 0. 086 • Therefore, in Huffman coding, when the alphabet size is large, the amount of deviation from the entropy is quite small, and vice versa
Coding Recap • One solution for this problem is blocking in Huffman coding. In which, it is more efficient to generate codewords for groups or sequences of symbols rather than to generate a separate codeword for each symbol in a sequence. • In order to find the Huffman coding for a sequence of length m, we need codewords for all possible sequences of length m. • This causes an exponential growth in the size of the code book.
Arithmetic coding • We need a way of assigning codewords to particular sequences with out having to generate a codes for all sequences of that length. • Rather than separating the input into component symbols and replacing each with a code, arithmetic encodes the entire message with a number (tag). • Firstly, a unique identifier or tag is generated for a sequence. Secondly, this tag is then given a unique binary code.
Arithmetic coding • Entropy encoding • Lossless data compression • Variable length coding
Arithmetic coding is based on the concept of interval subdividing. – In arithmetic coding a source ensemble is represented by an interval between 0 and 1 on the real number line. – Each symbol of the ensemble narrows this interval. – As the interval becomes smaller, the number of bits needed to specify it grows. – Arithmetic coding assumes an explicit probabilistic model of the source. – It uses the probabilities of the source messages to successively narrow the interval used to represent the ensemble. • A high probability message narrows the interval less than a low probability message, so that high probability messages contribute fewer bits to the coded ensemble
Arithmetic coding • Assume we know the probabilities of each symbol of the data source, – we can allocate to each symbol an interval with width proportional to its probability, and each of the intervals does not overlap with others. – This can be done if we use the cumulative probabilities as the two ends of each interval. Therefore, the two ends of each symbol x amount to Q[x-1] and Q[x]. – Symbol x is said to own the range [Q[x-1], Q[x]).
Arithmetic Encoding We begin with the interval [0, 1) and subdivide the interval iteratively. • For each symbol entered, the current interval is divided according to the probabilities of the alphabet. • The interval corresponding to the symbol is picked as the interval to be further proceeded with. • The procedure continues until all symbols in the message have been processed. • Since each symbol's interval does not overlap with others, for each possible message there is a unique interval assigned. • We can represent the message with the interval's two ends [L, H). In fact, taking any single value in the interval as the encoded code is enough, and usually the left end L is selected.
Encoding Example Symbol, x Probability, N[x] [Q[x-1], Q[x]) A 0. 4 0. 0, 0. 4 B 0. 3 0. 4, 0. 7 C 0. 2 0. 7, 0. 9 D 0. 1 0. 9, 1. 0
Encoding Algorithm L = 0. 0; H = 1. 0; While ( (x = getc(input)) != EOF ) { R = (H-L); H = L + R * Q[x]; L = L + R * Q[x-1]; } Output(L); R is the interval range, and H and L are two ends of the current code interval. x is the new symbol to be encoded. H and L are initialized to 0 and 1 respectively
Utilized equations
To encode the sequence 1 3 2 1
Arithmetic Decoding • Lets decode the tag value 0. 772352 • Given
Arithmetic Decoding • The possible value for is 2
Arithmetic Decoding
Decoding Algorithm q When decoding the code v is placed on the current code interval to find the symbol x so that Q[x-1] <= code < Q[x]. The procedure iterates until all symbols are decoded. v = input_code(); for (; ; ) { x = find_symbol_straddling_this_range(v); putc(x); R = Q[x] – Q[x-1]; v = (v – Q[x-1])/R; }
Generating a binary code
Binary code example
Binary code • Is uniquely decodable as it satisfies the prefix property
- Contoh open coding
- Contoh open coding, axial coding selective coding
- Adaptive huffman code
- Coding redundancy
- Data compression in data mining
- Cpr compression rate
- 01:640:244 lecture notes - lecture 15: plat, idah, farad
- History of qualitative research
- Coding dna and non coding dna
- What is lossy data compression
- Image compression
- Which are uniquely decodable codes
- Principles of data compression
- Compression test is carried on which material
- Introduction to data compression
- Lossless compression in multimedia
- Jpeg still image data compression standard
- Jpeg still image data compression standard
- Summary of shawshank redemption
- Great gatsby chapters 8-9 summary