Motion Compensated Prediction and the Role of the DCT in Video Coding
Michael Horowitz
Applied Video Compression
michael@avcompression.com
(c) 2008 Michael Horowitz

Outline
• Overview of block-based hybrid motion compensated predictive video coding
  – ITU-T standards: H.261, H.263, H.264
  – ISO/IEC standards: MPEG-1, MPEG-2 & MPEG-4
• Survey of motion estimation & compensation
• Discrete cosine transform (DCT)
  – Coding efficiency
  – Computational complexity
  – Perceptual implications

Block-Based Hybrid Motion Compensated Predictive Coding
• Video picture partitioned into macroblocks
• Macroblock (MB) has three components
  – One luma
    • “Y”, represents “lightness”
    • 16 x 16 luma samples
  – Two chroma
    • “Cb” & “Cr”, represent color
    • 16 x 16, 8 x 16, or 8 x 8 chroma samples

Block-Based Hybrid Motion Compensated Predictive Coding (continued)
  – Human visual system is more sensitive to luma than to chroma
    • Chroma frequently sub-sampled
    • Sub-sampling examples: 4:4:4, 4:2:2, 4:2:0
[Figure: Y, Cb, and Cr sample grids for 4:4:4, 4:2:2, and 4:2:0]
• Two coding modes for macroblocks

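The three chroma block sizes listed for a macroblock follow from the sub-sampling format. A minimal Python sketch, assuming the usual reading of 4:4:4, 4:2:2, and 4:2:0 (no sub-sampling, horizontal by 2, and horizontal and vertical by 2); the function name `chroma_block_size` is hypothetical:

```python
def chroma_block_size(fmt, luma_w=16, luma_h=16):
    """Return (width, height) of each chroma block (Cb or Cr) in a macroblock."""
    # (horizontal, vertical) sub-sampling factors; assumed standard meanings.
    factors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    sx, sy = factors[fmt]
    return luma_w // sx, luma_h // sy


for fmt in ("4:4:4", "4:2:2", "4:2:0"):
    w, h = chroma_block_size(fmt)
    total = 16 * 16 + 2 * w * h  # one luma block plus the Cb and Cr blocks
    print(f"{fmt}: chroma {w} x {h}, {total} samples per macroblock")
```

With 4:2:0, the two chroma blocks together add only half as many samples as the luma block, which is why sub-sampling is attractive when the eye is less sensitive to chroma.
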
Inter-Picture Macroblock Coding
  – Estimate motion of blocks from picture to picture
  – Search previously coded (reference) pictures
[Figure: reference picture showing the search region, the location of the input MB, and the motion estimate]
  – Encode
    • Location of motion estimate (motion vector)
    • Difference between input MB and motion estimate

Intra-Picture Macroblock Coding
• Input MB coded using intra-picture prediction
  – Prediction derived from spatially adjacent MBs
  – Earlier algorithms offer no intra-picture prediction
• Significantly lower coding efficiency than inter-coded MBs at low data rates
• Useful when motion estimate is poor
• Can be used to stop error propagation

Block-Based Hybrid Motion Compensated Predictive Coding

Survey of Motion Estimation and Motion Compensation
• Motion models
  – Translational (focus of talk)
    • Location of kth motion compensated block: $(X_k + MV_{x,k},\ Y_k + MV_{y,k})$
      – $(X_k, Y_k)$ is location of kth input block
      – $(MV_{x,k}, MV_{y,k})$ is motion vector (MV) for kth block
  – Affine motion models
    • Rotation
    • Scaling
    • Video standards do not use affine models

Motion Estimation
• Estimate inter-picture block translation
• Luma samples (and sometimes chroma)
• Example
  – Distortion: Sum of Absolute Differences (SAD)
    • Low complexity
    • Commonly used in real-time production encoders
  – Find $(MV_{x,k}, MV_{y,k})$ that minimizes SAD between
    • Input block $s_k(i, j)$
    • Motion compensated prediction in reference picture $r(i, j)$
    • Subject to search range

Motion Estimation (continued)
[Figure: reference picture $r(i, j)$ with sample locations, block location $(X_k, Y_k)$, and the search range]
• Fast motion estimation algorithms

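The SAD-based search described on the Motion Estimation slides can be sketched as an exhaustive (full) search over integer displacements. This is an illustrative sketch, not one of the fast algorithms the slide refers to; the function names, the toy picture, and the choice to skip candidates that cross the picture boundary are assumptions made here.

```python
def sad(block, ref, x, y):
    """Sum of absolute differences between block and the region of ref at (x, y)."""
    return sum(
        abs(block[j][i] - ref[y + j][x + i])
        for j in range(len(block))
        for i in range(len(block[0]))
    )


def full_search(block, ref, xk, yk, search_range):
    """Return (mvx, mvy, sad) minimizing SAD within +/- search_range samples."""
    bh, bw = len(block), len(block[0])
    rh, rw = len(ref), len(ref[0])
    best = None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            x, y = xk + mvx, yk + mvy
            # Skip candidates that fall outside the reference picture.
            if x < 0 or y < 0 or x + bw > rw or y + bh > rh:
                continue
            cost = sad(block, ref, x, y)
            if best is None or cost < best[2]:
                best = (mvx, mvy, cost)
    return best


# Toy data: a 12 x 12 reference picture and a 4 x 4 input block copied
# from position (x=5, y=3) of that picture.
ref = [[(3 * i + 7 * j) % 31 for i in range(12)] for j in range(12)]
block = [row[5:9] for row in ref[3:7]]

# The input block is located at (X_k, Y_k) = (4, 4) in the current picture.
print(full_search(block, ref, xk=4, yk=4, search_range=2))
```

The cost of this search grows with the square of the search range, which is the motivation for fast motion estimation algorithms.
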
Fractional Sample Motion Estimation
• Estimate content between samples
• Example: bilinear interpolation
  – For $x_1 \le x^* < x_2$ and $y_1 \le y^* < y_2$, let $f_x = (x^* - x_1)/(x_2 - x_1)$ and $f_y = (y^* - y_1)/(y_2 - y_1)$
  – $z(x^*, y^*) = (1 - f_x)(1 - f_y)\,z(x_1, y_1) + f_x(1 - f_y)\,z(x_2, y_1) + f_x f_y\,z(x_2, y_2) + (1 - f_x) f_y\,z(x_1, y_2)$
[Figure: $z(x^*, y^*)$ inside the square with corners $z(x_1, y_1)$, $z(x_2, y_1)$, $z(x_1, y_2)$, $z(x_2, y_2)$]

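The bilinear interpolation formula translates directly into code. A small sketch, in which the function name and argument order are choices made for this example:

```python
def bilinear(z11, z21, z12, z22, fx, fy):
    """Interpolate z(x*, y*) from the four surrounding samples.

    z11 = z(x1, y1), z21 = z(x2, y1), z12 = z(x1, y2), z22 = z(x2, y2);
    fx and fy are the fractional offsets in [0, 1).
    """
    return ((1 - fx) * (1 - fy) * z11 + fx * (1 - fy) * z21
            + fx * fy * z22 + (1 - fx) * fy * z12)


print(bilinear(10, 20, 30, 40, 0.5, 0.5))  # half-sample offset in both directions
print(bilinear(10, 20, 30, 40, 0.5, 0.0))  # horizontal half-sample offset only
```

At a half-sample offset in both directions the result is the plain average of the four neighbors; with $f_y = 0$ only the two samples on the upper row contribute.
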
Fractional Sample Motion Estimation (continued)
  – H.261
    • No fractional sample motion estimation
  – MPEG-1, MPEG-2 and H.263
    • 1/2-sample, bilinear interpolation
  – H.264 | MPEG-4 AVC & SVC
    • Luma
      – 1/2-sample, 6-tap interpolation
      – 1/4-sample, simple average
    • Chroma (1/8-sample, bilinear)

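The H.264 luma interpolation listed above can be illustrated in one dimension. This is a sketch rather than the normative process: the taps (1, -5, 20, 20, -5, 1) with division by 32 are the H.264 half-sample luma filter, while the edge clamping, the 8-bit sample range, and the function names are assumptions made for this example.

```python
def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1] using the 6-tap filter."""
    taps = (1, -5, 20, 20, -5, 1)
    last = len(row) - 1
    # Clamp indices at the row ends (an illustrative choice for this sketch).
    acc = sum(t * row[min(max(i - 2 + k, 0), last)] for k, t in enumerate(taps))
    # Round, divide by 32, and clip to the assumed 8-bit range.
    return min(max((acc + 16) >> 5, 0), 255)


def quarter_sample(a, b):
    """Quarter-sample value: rounded average of two neighboring values."""
    return (a + b + 1) >> 1


row = [0, 10, 20, 30, 40, 50]
h = half_sample(row, 2)          # halfway between row[2] and row[3]
q = quarter_sample(row[2], h)    # a quarter of the way from row[2] to row[3]
print(h, q)
```

On a linear ramp the 6-tap filter reproduces the midpoint, and the quarter-sample value is simply the rounded mean of the integer and half-sample neighbors.
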
Fractional Sample Motion Estimation (continued)
• Coding efficiency gain for H.263 [from Wang 2002]

Multiple Motion Vectors per MB
• One motion vector for each sub-block
• H.264 results [Bjontegaard 2001]

Multiple Reference Pictures [Wiegand, Zhang, & Girod 1997]
• Coding gains
  – Uncovered areas
  – More integer motion vector estimates
[Figure: integer sample locations in pictures t-3, t-2, t-1, and t0 along the direction of motion]

Multiple Reference Pictures (continued)
• H.263 Annex U [Horowitz 2000]

Multi-Hypothesis Motion Compensated Prediction [Flierl, Wiegand & Girod 1998]
• Linear combination of multiple predictions
  – One motion vector for each prediction
  – Bi-predicted pictures are special case (2 MVs)
  – Predictions may be forward & backward in time

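A linear combination of predictions is straightforward to sketch. The example below assumes equal weights of 1/2 for the bi-predicted case; the function name and the sample values are hypothetical.

```python
def combine_hypotheses(predictions, weights):
    """Weighted sum of equally sized prediction blocks (lists of rows)."""
    rows, cols = len(predictions[0]), len(predictions[0][0])
    return [
        [sum(w * p[j][i] for p, w in zip(predictions, weights)) for i in range(cols)]
        for j in range(rows)
    ]


forward = [[100, 104], [108, 112]]    # prediction fetched with the first MV
backward = [[104, 100], [112, 116]]   # prediction fetched with the second MV
print(combine_hypotheses([forward, backward], [0.5, 0.5]))
```

Each hypothesis is fetched with its own motion vector, so the side information grows with the number of hypotheses while the averaged prediction tends to be less noisy.
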
Multi-Hypothesis for H.263
• Sequences: Mobile & Calendar and Foreman
• Results [Flierl 1998]

Overlapped Block Motion Compensation [Orchard & Sullivan 1994]
• Special case of multi-hypothesis coding
• H.263 advanced prediction mode (Annex F)
  – Overlapped block motion compensation
    • 1 coded + 2 “derived” motion vectors
    • Non-uniform spatial weighting of samples
  – 4 motion vectors per macroblock

Rate-Distortion Optimization
• MV resulting in lowest distortion often not optimal
• Goal: Find best tradeoff between distortion and rate
• Strategy [Everett III 1963], [Shoham & Gersho 1988]
  – Minimize $J = D + \lambda R$, where $D$ is the total distortion and $R$ is the total bit-rate
  – $J_k = D_k + \lambda R_k$, where $D_k$ and $R_k$ are the distortion and rate for block k
  – Minimize $J_k$ for each block k separately, using common $\lambda$

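The effect of the Lagrangian cost on motion vector selection can be shown with a few candidates. This sketch assumes the cost $J = D + \lambda R$ with SAD as the distortion and the motion vector bit count as the rate; the candidate values are made up for illustration.

```python
def choose_mv(candidates, lam):
    """Pick the candidate minimizing J = D + lam * R.

    Each candidate is (motion_vector, distortion, rate_in_bits).
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical candidates: (MV, SAD, bits needed to code the MV).
candidates = [((0, 0), 520, 2), ((3, -1), 480, 10), ((7, 5), 470, 18)]

print(choose_mv(candidates, lam=0.0)[0])   # distortion only
print(choose_mv(candidates, lam=4.0)[0])   # moderate weight on rate
print(choose_mv(candidates, lam=10.0)[0])  # heavy weight on rate
```

As $\lambda$ grows, the choice moves from the lowest-distortion vector toward vectors that are cheaper to code, which is the tradeoff the slide describes.
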
Perceptual Tuning
• Prevent transparent foreground macroblocks
• Blurring of fast moving objects
• Deblocking filter
• Artifacts in the motion wake

Coding Summary
• Macroblock-based coding
• Two basic macroblock coding modes
  – Inter-coded MB: motion compensated prediction
  – Intra-coded MB

1-D Discrete Cosine Transform
• Type II forward DCT [Ahmed et al. 1974]
• Type II inverse DCT

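For reference, one common way to write the Type II DCT pair is the orthonormal form below. The normalization used on the original slide is not recoverable from the text, so the scale factors here are an assumption; [Ahmed et al. 1974] and other sources place the constants differently.

```latex
\begin{align*}
X_m &= c_m \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} x_k
       \cos\!\left(\frac{(2k+1)\,m\,\pi}{2N}\right),
       \qquad m = 0, \ldots, N-1 \\
x_k &= \sqrt{\frac{2}{N}} \sum_{m=0}^{N-1} c_m X_m
       \cos\!\left(\frac{(2k+1)\,m\,\pi}{2N}\right),
       \qquad k = 0, \ldots, N-1 \\
c_m &= \begin{cases} 1/\sqrt{2}, & m = 0 \\ 1, & m > 0 \end{cases}
\end{align*}
```

With this scaling the transform matrix is orthogonal, so the inverse is simply its transpose.
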
2-Dimensional DCT
• Forward
• Inverse

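A 2-D DCT can be computed separably by applying a 1-D DCT to every row and then to every column. The sketch below assumes the orthonormal Type II scaling, which may differ from the normalization on the original slide; it is a direct O(N^2)-per-vector evaluation intended for clarity, not speed, and all function names are chosen for this example.

```python
import math


def dct_1d(x):
    """Orthonormal Type II DCT of a sequence."""
    n = len(x)
    out = []
    for m in range(n):
        scale = math.sqrt((1.0 if m == 0 else 2.0) / n)
        out.append(scale * sum(x[k] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                               for k in range(n)))
    return out


def idct_1d(c):
    """Inverse of dct_1d."""
    n = len(c)
    return [sum(math.sqrt((1.0 if m == 0 else 2.0) / n) * c[m]
                * math.cos(math.pi * (2 * k + 1) * m / (2 * n)) for m in range(n))
            for k in range(n)]


def transpose(b):
    return [list(r) for r in zip(*b)]


def dct_2d(block):
    """Separable 2-D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])


def idct_2d(coeffs):
    """Separable 2-D inverse DCT: invert columns, then rows."""
    cols = transpose([idct_1d(c) for c in transpose(coeffs)])
    return [idct_1d(r) for r in cols]


block = [[(i + 2 * j) % 7 for i in range(8)] for j in range(8)]
restored = idct_2d(dct_2d(block))
max_err = max(abs(restored[j][i] - block[j][i]) for j in range(8) for i in range(8))
print(max_err < 1e-9)
```

For a constant 8 x 8 block, all the energy lands in the single DC coefficient, which is the simplest instance of energy compaction.
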
Basis Functions for 8 x 8 DCT

Why Choose the DCT?
• Coding efficiency
• Computational complexity
• Perceptual implications

Coding Efficiency
[Figure: source X is split into $X_1$ and $X_2$; quantizers $Q_1$ and $Q_2$ produce $\hat{X}_1$ and $\hat{X}_2$, which are combined into $\hat{X}$]
• Source $X = [X_1, X_2]$
  – $X_i$ is a Gaussian random variable
  – Mean = 0, Variance = $\sigma_i^2$
• Rate of quantizer $Q_i$ is $R_i$ (bits / index)
  – Total rate $R = R_1 + R_2$

Coding Efficiency (continued)
• Distortion
  – Square error
  – High-rate assumption
    • High-rate implies R ≥ 3 bits / sample
    • Often works well for lower rates
    • Asymptotic Quantization Theory [Gray & Neuhoff 1998]
  – Total distortion

Rate Allocation Problem
• What is smallest D = D* subject to $R_1 + R_2 = R$?
• Find optimal value for $R_1$
• Minimizing D with respect to $R_1$ yields

Rate Allocation Problem (continued)
It follows that and which implies

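The two-quantizer allocation can be worked through explicitly. The following is a sketch under the standard high-rate model $D_i = c\,\sigma_i^2\,2^{-2R_i}$; the constant $c$, which depends on the quantizer type, is notation assumed here rather than taken from the slides.

```latex
\begin{align*}
D(R_1) &= c\,\sigma_1^2\,2^{-2R_1} + c\,\sigma_2^2\,2^{-2(R - R_1)} \\
\frac{\partial D}{\partial R_1} = 0
  &\;\Rightarrow\; \sigma_1^2\,2^{-2R_1} = \sigma_2^2\,2^{-2R_2}
  \quad (\text{i.e. } D_1 = D_2) \\
R_1 &= \frac{R}{2} + \frac{1}{4}\log_2\frac{\sigma_1^2}{\sigma_2^2}, \qquad
R_2 = \frac{R}{2} - \frac{1}{4}\log_2\frac{\sigma_1^2}{\sigma_2^2} \\
D^* &= 2c\,\sigma_1\sigma_2\,2^{-R}
\end{align*}
```

The optimum equalizes the two distortions, and the resulting $D^*$ depends on the variances only through their geometric mean $\sigma_1\sigma_2$.
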
Generalize for k Quantizers
[Figure: source X is split into $X_{12} = (X_1, X_2)$ and $X_3$; quantizers $Q_1$, $Q_2$, $Q_3$ produce $\hat{X}_{12}$ and $\hat{X}_3$, which are combined into $\hat{X}$]
• Rate
• Distortion
• Recall

Generalization (continued)
• 2 quantizers with
• Minimize subject to with respect to $R_3$

Generalization (continued)
• It follows that
• Generalize to k quantizers by induction

Optimal Rate and Distortion [Huang & Schultheiss 1963]
• Rate
• Distortion

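The k-quantizer result can be checked numerically. This sketch assumes the usual high-rate form of the allocation, $R_i = R/k + \tfrac{1}{2}\log_2(\sigma_i^2 / \mathrm{GM})$ with GM the geometric mean of the variances, and the distortion model $D_i = c\,\sigma_i^2\,2^{-2R_i}$; the constant `c` and the function names are assumptions for this example.

```python
import math


def optimal_rates(variances, total_rate):
    """High-rate optimal rates: R_i = R/k + 0.5 * log2(var_i / geometric mean)."""
    k = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / k  # log2 of the geometric mean
    return [total_rate / k + 0.5 * (math.log2(v) - log_gm) for v in variances]


def distortion(variances, rates, c=1.0):
    """Total distortion under the model D_i = c * var_i * 2^(-2 R_i)."""
    return sum(c * v * 2.0 ** (-2.0 * r) for v, r in zip(variances, rates))


variances = [16.0, 4.0, 1.0]
total_rate = 9.0

rates = optimal_rates(variances, total_rate)
equal = [total_rate / len(variances)] * len(variances)

print(rates)
print(distortion(variances, rates), distortion(variances, equal))
```

The allocation gives more bits to the higher-variance components and makes every quantizer contribute the same distortion. For small total rates the formula can return negative or fractional rates, which is why practical systems restrict $R_i$ to positive integers.
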
Observations and Comments
• #1 Optimal rate for $Q_i$ proportional to
• #2 Optimal distortion
• #3 In practice, systems use positive [Segall 1976], integer [Farber & Zeger 2005] $R_i$

Question
• Given Gaussian source X & fixed encoder structure (i.e., k scalar quantizers), how can we minimize D subject to $\sum_{i=1}^{k} R_i = R$?

Transform Coding [Kramer & Mathews 1956]
[Figure: input $X = (X_1, X_2, \ldots, X_k)$ passes through transform $T$ to give $Y_1, \ldots, Y_k$; quantizers $Q_1, \ldots, Q_k$ produce $\hat{Y}_1, \ldots, \hat{Y}_k$; inverse transform $T^{-1}$ yields $\hat{X}$]
• For orthogonal T

Fact 1
• Karhunen-Loeve Transform (KLT) produces smallest distortion [Huang et al. 1963], assuming
  – a) Gaussian input random variables
  – b) High-rate quantizers
  – c) Rate of each quantizer is arbitrary real value
  – d) Square error distortion measure

Fact 2
• The autocorrelation matrix of the KLT transform vector is diagonal.
  – KLT coefficients are uncorrelated
  – There is no general theorem stating uncorrelated quantities can be more efficiently quantized than correlated ones

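A two-component example makes the decorrelation and the resulting gain concrete. The sketch assumes unit-variance components with correlation ρ = 0.95, for which the KLT is a 45-degree rotation, and it measures the gain as the ratio of geometric means of the variances (the quantity that governs distortion under the high-rate model). The names and the value of ρ are choices made for this example.

```python
import math


def transform_cov(cov, t):
    """Return T * cov * T^T for 2 x 2 matrices given as nested lists."""
    tc = [[sum(t[i][m] * cov[m][j] for m in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(tc[i][m] * t[j][m] for m in range(2)) for j in range(2)]
            for i in range(2)]


rho = 0.95
cov_x = [[1.0, rho], [rho, 1.0]]

s = 1.0 / math.sqrt(2.0)
klt = [[s, s], [s, -s]]          # rows are the eigenvectors of cov_x

cov_y = transform_cov(cov_x, klt)

gm_x = math.sqrt(cov_x[0][0] * cov_x[1][1])   # geometric mean before the transform
gm_y = math.sqrt(cov_y[0][0] * cov_y[1][1])   # geometric mean after the transform
coding_gain = gm_x / gm_y

print(cov_y)
print(coding_gain)
```

The transformed covariance is diagonal with variances 1 + ρ and 1 - ρ: the total variance is unchanged, but it is concentrated in one coefficient, which lowers the geometric mean and hence the high-rate distortion.
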
Fact 3
• If KLT produces ≥ for , orthogonal produces then &
• Energy compaction

Practical Considerations
• KLT impractical for many systems
  – Computational complexity
    • Transform is signal dependent
    • Compute and apply transform for each input
• Consider Fourier based transforms
  – Fast algorithms exist
  – Examine loss of coding efficiency resulting from loss of energy compaction

Energy Compaction of Some Discrete Transforms
• 1 x 32 block in natural images [Lohscheller]

2-D Energy Compaction [from Hedberg & Nilsson 2004]
• KLT
• DCT
• DFT

Computational Complexity
• Recall DCT may be derived from DFT
  – First N coefficients of 2N-point DFT
  – Requires appropriate input sequence symmetry
  – Requires scaling [Tseng & Miller 1978], where $f_m$ is the mth DFT coefficient
• Leverage FFT to compute DCT

Computational Complexity (continued)
• 1-D 8-point DCT from 16-point DFT
  – 13 mults, 29 adds [Arai et al. 1988]
  – 8 final scaling multiplies rolled into quantization
    • Net 5 mults, 29 adds best known
• Fast 2-D DCT (8 x 8)
  – Separable [from Pennebaker & Mitchell 1992]
    • 80 mults, 464 adds best known (16 one-dimensional DCTs at 5 mults and 29 adds each)
  – Non-separable [Feig 1992]
    • 54 mults, 416 adds, 6 shifts

Perceptual Implications
• Contrast sensitivity of HVS
  – See last page of handout [Barlow & Mollon 1982]
• Perceptually tuned quantization tables [Watson]
• Filter coefficients prior to quantization
  – Shape frequency content of source
  – Exploit HVS contrast sensitivity

Concluding Summary
• Motion estimation & compensation
  – Translation-based motion models
  – Fractional sample motion estimation
  – Multiple motion vectors per macroblock
  – Multiple reference pictures
  – Multi-hypothesis motion compensated prediction
  – Overlapped block motion compensation

Concluding Summary (continued)
• DCT
  – Near optimal R-D performance for wide range of sources (Gaussian, high-rate assumptions)
  – Simple relationship to DFT yields fast algorithms
  – Perceptual relevance

References
• N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp. 90–93, Jan. 1974.
• Y. Arai, T. Agui, M. Nakajima, “A Fast DCT-SQ Scheme for Images,” Trans. of the IEICE, E71(11): 1095, Nov. 1988.
• E. Feig, S. T. Winograd, “Fast Algorithms for Discrete Cosine Transform,” IEEE Trans. Signal Proc., 40, 2174–2193 (1992).
• H. B. Barlow and J. D. Mollon, The Senses. Cambridge: Cambridge University Press, 1982.
• G. Bjontegaard, “Objective simulation results,” Document VCEG-M34, Video Coding Experts Group (VCEG), Thirteenth Meeting: Austin, Texas, USA, 2–4 April 2001.
• H. Everett III, “Generalized Lagrangian Multiplier Method for Solving Problems of Optimum Allocation of Resources,” Operations Research, vol. 11, pp. 399–417, 1963.
• B. Farber and K. Zeger, “Quantization of Multiple Sources Using Integer Bit Allocation,” Data Compression Conference (DCC), Salt Lake City, Utah, March 2005 (to appear).

References (continued)
• M. Flierl, T. Wiegand, B. Girod, “Locally Optimal Design Algorithm for Block-Based Multi-Hypothesis Motion-Compensated Prediction,” Proc. of the IEEE Data Compression Conference (DCC'98), pp. 239–248, Snowbird, USA, Apr. 1998.
• A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, 1992.
• B. Girod, Lecture for EE368b, Video and Image Compression, Stanford University.
• R. M. Gray and D. L. Neuhoff, “Quantization,” IEEE Transactions on Information Theory, vol. 44, pp. 2325–2384, Oct. 1998.
• R. M. Haralick, “A Storage Efficient Way to Implement the Discrete Cosine Transform,” IEEE Transactions on Computers, 25 (6) (1976) 764–765.
• H. Hedberg and P. Nilsson, “A Survey of Various Discrete Transforms used in Digital Image Compression Algorithms,” Proceedings of the Swedish System-On-Chip Conference 2004, Bastad, Sweden, April 13–14, 2004.

References (continued)
• M. J. Horowitz, “Demonstration of H.263++ Annex U Performance,” Document Q15-J11, Tenth Meeting (Meeting J) of the ITU-T Q.15/16, Advanced Video Coding Experts Group, Osaka, Japan, 16–18 May 2000.
• J.-Y. Huang and P. M. Schultheiss, “Block quantization of correlated Gaussian random variables,” IEEE Trans. Comm., vol. 11, pp. 289–296, September 1963.
• F. Kossentini, Y. Lee, M. Smith and R. Ward, “Predictive RD Optimized Motion Estimation for Very Low Bit-Rate Video Coding,” Special Issue of the IEEE Journal on Selected Areas in Communications, 15(9), pp. 1752–1763, December 1997.
• H. P. Kramer and M. V. Mathews, “A linear coding for transmitting a set of correlated signals,” IRE Trans. Inform. Theory, vol. 23, no. 3, pp. 41–46, Sept. 1956.
• M. T. Orchard and G. J. Sullivan, “Overlapped block motion compensation: An estimation-theoretic approach,” IEEE Trans. Image Processing, vol. 3, no. 9, pp. 693–699, Sept. 1994.
• W. B. Pennebaker, J. L. Mitchell, JPEG, p. 53, Kluwer Academic Publishers, Norwell, MA, USA, 1992.

References (continued)
• A. Segall, “Bit allocation and encoding for vector sources,” IEEE Trans. Inform. Theory, IT-22 (March 1976) 162–169.
• Y. Shoham and A. Gersho, “Efficient Bit Allocation for an Arbitrary Set of Quantizers,” IEEE Trans. on Acoust., Speech, Signal Processing, vol. 36, no. 9, pp. 1445–1453, September 1988.
• B. D. Tseng and W. C. Miller, “On Computing the Discrete Cosine Transform,” IEEE Transactions on Computers, 27 (10), (1978) 966–968.
• Y. Wang, “Video Coding Standards,” lecture slides based on text Video Processing and Communications, Prentice Hall, 2002.
• A. B. Watson, “DCT quantization matrices visually optimized for individual images,” Proc. SPIE, 1913: 202–16, 1993.
• T. Wiegand, X. Zhang, and B. Girod, “Block-Based Hybrid Video Coding Using Motion-Compensated Long-Term Memory Prediction,” in Proc. of the Picture Coding Symposium, Berlin, Germany, pp. 153–158, Sept. 1997.

Backup slides
• Little Things Big Difference
• Motion search over picture boundary
  – Reference Gisle’s Austin Contribution

DCT from the DFT [Haralick 1976]
• N-point DCT
• Extend N-point sequence $x_k$ by reflection

Extend N-point Sequence $x_k$ by Reflection
• Example
[Figure: N-point sequence and its 2N-point extension by reflection]

Compute 2N-point DFT
• Second sum equals (by symmetry of $x_k$)

Compute 2N-point DFT (continued)
• It follows that
• Multiply by & employ Euler’s formula

Compute 2N-point DFT (continued)
• Recognizing the DCT
• Note is even and real, so $f_m = \mathrm{Re}\{f_m\}$ (i.e. $\mathrm{Im}\{f_m\} = 0$)
• It follows that [Tseng & Miller]
• First N coeffs of 2N-point DFT give the N-point DCT
  – with appropriate scaling and $x_k$ symmetry

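The relationship between the DCT and the 2N-point DFT can be verified numerically. The sketch below makes specific assumptions that may differ from the slides' notation: the sequence is extended as $[x_0, \ldots, x_{N-1}, x_{N-1}, \ldots, x_0]$, the DCT is the unnormalized Type II sum $C_m = \sum_k x_k \cos(\pi m (2k+1) / 2N)$, and under those choices $C_m = \tfrac{1}{2}\,\mathrm{Re}\{e^{-j\pi m / 2N} f_m\}$. A direct DFT is used instead of an FFT to keep the example dependency-free.

```python
import cmath
import math


def dct_direct(x):
    """Unnormalized Type II DCT: C[m] = sum_k x[k] * cos(pi * m * (2k + 1) / (2N))."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * m * (2 * k + 1) / (2 * n)) for k in range(n))
            for m in range(n)]


def dft(y):
    """Direct DFT: f[m] = sum_k y[k] * exp(-j * 2 * pi * m * k / len(y))."""
    n = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def dct_via_dft(x):
    """DCT from the first N coefficients of the 2N-point DFT of the reflected input."""
    n = len(x)
    y = list(x) + list(reversed(x))   # extend by reflection to 2N points
    f = dft(y)
    # Remove the half-sample phase shift and the factor of 2 from the reflection.
    return [0.5 * (cmath.exp(-1j * cmath.pi * m / (2 * n)) * f[m]).real
            for m in range(n)]


x = [1.0, 3.0, -2.0, 5.0, 0.5, 4.0, 2.0, -1.0]
a = dct_direct(x)
b = dct_via_dft(x)
print(max(abs(p - q) for p, q in zip(a, b)) < 1e-9)
```

In a real implementation the `dft` call would be replaced by an FFT, which is where the computational savings come from.
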
Energy Compaction of Some Discrete Transforms
• Transform coefficient variances for N = 16, ρ = 0.95 [Ahmed 1974]

KLT Computational Complexity
• Transform is signal dependent
• Construct transform
  – Compute correlation matrix for input vector
  – Find eigenvectors of correlation matrix
• Apply transform

Multiple Motion Vectors per MB
• One motion vector for each sub-block
• H.264 results [Bjontegaard]

Practical Matters
• 16-bit math for 4 x 4 in H.264: complexity reduction on certain platforms
• 4 x 4 and 8 x 8 transforms in H.264
  – Exact inverses
• Non-exact specification for inverse DCT
  – How is it done?
  – Implications

Overlapped Block Motion Compensation in H.263
• Coding efficiency (PSNR in dB)
  – Baseline vs advanced prediction mode [from Girod]

1-D DFT Energy Compaction Analysis
• Fourier transform of ramp (continuous in both domains)
• Sample Fourier domain → Fourier series → repetition in time domain
[Figure: ramp repeated periodically, amplitude vs. time]

Ramp: First 5 Fourier Terms [ptolemy.eecs.berkeley.edu/eecs20/week8/examples.html]
• Fourier term decay rate: $1/n$

Better Energy Compaction
• DFT energy compaction not very good
• Better energy compacting Fourier based transforms exist
• Consider DFT of extended sequence
  – Extend input to force even symmetry
  – Leads to DCT

Extended Ramp (Triangle)
• 2N-point extended ramp
[Figure: triangle wave formed by the extended ramp, amplitude vs. time]
• Sample Fourier domain → Fourier series
  – No discontinuities at boundary (symmetrical extension)
  – Expect better energy compaction

Triangle: First 5 Fourier Terms [ptolemy.eecs.berkeley.edu/eecs20/week8/examples.html]
• Fourier term decay rate: $1/n^2$

Compaction Comparison Summary
• DFT coefficient amplitude decay
  – Ramp: $1/n$
  – Extended ramp: $1/n^2$
• Suggests DCT will compact well
• Fourier series → DFT
  – Sampling in time → repetition in frequency
  – “Series-based” observations valid for DFT

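The compaction comparison can be checked on a sampled ramp. This sketch assumes a zero-mean 16-point ramp, a unitary DFT, and an orthonormal Type II DCT, and it compares the fraction of signal energy captured by the two largest coefficients of each transform; the names and the choice of two coefficients are illustrative.

```python
import cmath
import math


def dft_energies(x):
    """Squared magnitudes of the unitary DFT coefficients."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n)
                    for k in range(n))) ** 2 / n
            for m in range(n)]


def dct_energies(x):
    """Squared orthonormal Type II DCT coefficients."""
    n = len(x)
    out = []
    for m in range(n):
        scale = math.sqrt((1.0 if m == 0 else 2.0) / n)
        c = scale * sum(x[k] * math.cos(math.pi * (2 * k + 1) * m / (2 * n))
                        for k in range(n))
        out.append(c * c)
    return out


def top_fraction(energies, count):
    """Fraction of total energy held by the `count` largest coefficients."""
    return sum(sorted(energies, reverse=True)[:count]) / sum(energies)


n = 16
ramp = [k - (n - 1) / 2 for k in range(n)]   # zero-mean ramp

dft_frac = top_fraction(dft_energies(ramp), 2)
dct_frac = top_fraction(dct_energies(ramp), 2)
print(dft_frac, dct_frac)
```

Both transforms preserve the total energy, but the DCT concentrates far more of it in its two largest coefficients, in line with the faster decay expected for the symmetrically extended ramp.
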
Contrast Sensitivity
• Allen B. Poirson & Brian A. Wandell, “Pattern-color separable pathways predict sensitivity to simple colored patterns,” Vision Research, 1995

Uniform Scalar Quantization
• Distortion of ith cell
[Figure: quantizer cells, with the 0th cell centered at 0]
  – Assume high rate
