ITERATED SHRINKAGE ALGORITHM FOR BASIS PURSUIT MINIMIZATION *
Michael Elad
The Computer Science Department
The Technion – Israel Institute of Technology
Haifa 32000, Israel
SIAM Conference on Imaging Science, May 15-17, 2005, Minneapolis, Minnesota
Sparse Representations – Theory and Applications in Image Processing
* Joint work with Michael Zibulevsky and Boaz Matalon

Noise Removal
Our story begins with signal/image denoising …
[Diagram: Remove Additive Noise → ?]
• 100 years of activity, numerous algorithms.
• Directions considered include: PDE, statistical estimators, adaptive filters, inverse problems & regularization, example-based restoration, sparse representations, …

Shrinkage For Denoising
• Shrinkage is a simple yet effective sparsity-based denoising algorithm [Donoho & Johnstone, 1993].
[Diagram: Apply Wavelet Transform → LUT → Apply Inv. Wavelet Transform]
• Justification 1: minimax near-optimal over the Besov (smoothness) signal space (complicated!!!!).
• Justification 2: Bayesian (MAP) optimal [Simoncelli & Adelson 1996, Moulin & Liu 1999].
• In both justifications, additive white Gaussian noise and a unitary transform are crucial assumptions for the optimality claims.

Redundant Transforms?
[Diagram: Apply Redundant Transform → LUT → Apply its (pseudo) Inverse Transform. The number of coefficients is (much!) greater than the number of input samples (pixels).]
• This scheme is still applicable, and it works fine (tested with curvelet, contourlet, undecimated wavelet, and more).
• However, it is no longer the optimal solution for the MAP criterion.
TODAY'S FOCUS: IS SHRINKAGE STILL RELEVANT WHEN HANDLING REDUNDANT (OR NON-UNITARY) TRANSFORMS? HOW? WHY?
WE SHOW THAT THE ABOVE SHRINKAGE METHOD IS THE FIRST ITERATION IN A VERY EFFECTIVE AND SIMPLE ALGORITHM THAT MINIMIZES THE BASIS PURSUIT, AND AS SUCH, IT IS A NEW PURSUIT TECHNIQUE.

Agenda
1. Bayesian Point of View – a Unitary Transform: Optimality of shrinkage
2. What About Redundant Representation? Is shrinkage relevant? Why? How?
3. Conclusions
[Portrait: Thomas Bayes, 1702-1761]

The MAP Approach
Minimize the following function with respect to x:
[Equation, with annotated parts: log-likelihood term; prior or regularization; given measurements; unknown to be recovered]
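The slide's formula is not reproduced in this transcript. A form consistent with its annotations, assuming additive white Gaussian noise, measurements y, and a weight λ on the prior term (λ and the exact scaling are assumptions, not taken from the slide), would be:

```latex
\hat{x} \;=\; \arg\min_{x}\;
\underbrace{\tfrac{1}{2}\,\lVert x - y \rVert_2^2}_{\text{log-likelihood term}}
\;+\;
\lambda \cdot \underbrace{\Pr(x)}_{\text{prior or regularization}}
```

Here Pr(x) stands for the penalty induced by the prior (a negative log-prior up to constants), y is the given measurement vector, and x is the unknown to be recovered.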

Image Prior?
During the past several decades we have made all sorts of guesses about the prior Pr(x):
Energy; Smoothness; Adapt + Smooth; Robust Statistics; Total-Variation; Wavelet Sparsity; Sparse & Redundant (Today's Focus)
• Mumford & Shah formulation,
• Compression algorithms as priors,
• …

(Unitary) Wavelet Sparsity
[Equations, with annotations:]
• Define [equation]
• L2 is unitarily invariant
• We get a separable set of 1-D optimization problems
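The three annotations on this slide can be traced through a short derivation. The sketch below assumes a unitary wavelet matrix W, an L1 prior on the wavelet coefficients with weight λ, and the shorthand z = Wx and a = Wy; these symbols are assumptions chosen for illustration and may differ from those on the original slide.

```latex
\begin{aligned}
f(x) &= \tfrac12\,\lVert x - y\rVert_2^2 + \lambda\,\lVert Wx\rVert_1 \\
     &= \tfrac12\,\lVert W^{H}z - y\rVert_2^2 + \lambda\,\lVert z\rVert_1
        && \text{(define } z = Wx,\ x = W^{H}z\text{)} \\
     &= \tfrac12\,\lVert z - Wy\rVert_2^2 + \lambda\,\lVert z\rVert_1
        && \text{(}L_2\text{ is unitarily invariant)} \\
     &= \sum_{j}\Bigl[\tfrac12\,(z_j - a_j)^2 + \lambda\,\lvert z_j\rvert\Bigr]
        && \text{(separable, with } a = Wy\text{)}
\end{aligned}
```

Each term in the final sum depends on a single coefficient z_j, so the minimization splits into independent scalar problems.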

Why Shrinkage?
Want to minimize this 1-D function with respect to z: [equation]
[Plot: the LUT curve, z_opt as a function of a]
A LUT can be built for any other robust function (replacing the |z|), including non-convex ones (e.g., the L0 norm)!!
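A small sketch of the LUT idea, assuming the scalar objective has the form 0.5*(z - a)^2 + lam*penalty(z) (the scaling and the name `lam` are assumptions). It compares the closed-form soft threshold for |z| with a brute-force table that works for any penalty; the helper names are hypothetical.

```python
import math

def soft_threshold(a, lam):
    """Closed-form minimizer of 0.5*(z - a)**2 + lam*abs(z)."""
    return math.copysign(max(abs(a) - lam, 0.0), a)

def hard_threshold(a, lam):
    """Minimizer of 0.5*(z - a)**2 + lam*(z != 0), an L0-type penalty."""
    # z = a costs lam, z = 0 costs 0.5*a**2; keep a only if that is cheaper.
    return a if 0.5 * a * a > lam else 0.0

def build_lut(penalty, lam, a_values, z_grid):
    """Tabulate z_opt(a) by brute-force search over z_grid, for any penalty."""
    lut = []
    for a in a_values:
        best = min(z_grid, key=lambda z: 0.5 * (z - a) ** 2 + lam * penalty(z))
        lut.append(best)
    return lut

if __name__ == "__main__":
    lam = 1.0
    a_values = [-3.0, -0.5, 0.0, 0.5, 3.0]
    z_grid = [i / 1000.0 for i in range(-5000, 5001)]
    print("numeric LUT :", build_lut(abs, lam, a_values, z_grid))
    print("closed form :", [soft_threshold(a, lam) for a in a_values])
```

The brute-force table is slow but makes no assumption about convexity, which is the point of the slide's remark about non-convex penalties.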

Agenda
1. Bayesian Point of View – a Unitary Transform: Optimality of shrinkage
2. What About Redundant Representation? Is shrinkage relevant? Why? How?
3. Conclusions

An Overcomplete Transform
[Diagram: the transform T applied to the signal x]
Redundant transforms are important because they can:
(i) lead to a shift-invariance property,
(ii) represent images better (because of orientation/scale analysis),
(iii) enable deeper sparsity (when used in conjunction with the BP).

Analysis versus Synthesis
Analysis Prior: [equation]
Define [equation]
Synthesis Prior (Basis Pursuit): [equation]
However [equation]
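The equations on this slide are not reproduced in the transcript. As a hedged reconstruction, the two formulations are commonly written as follows, assuming an L1 penalty with weight λ, an analysis operator T, and a synthesis dictionary D defined as the pseudo-inverse of T (these choices are assumptions and may not match the slide exactly):

```latex
\begin{aligned}
\text{Analysis prior:}\quad
  & \hat{x}_{A} = \arg\min_{x}\ \tfrac12\,\lVert x - y\rVert_2^2 + \lambda\,\lVert Tx\rVert_1 \\
\text{Define:}\quad
  & D = T^{+} \\
\text{Synthesis prior (Basis Pursuit):}\quad
  & \hat{\alpha} = \arg\min_{\alpha}\ \tfrac12\,\lVert D\alpha - y\rVert_2^2 + \lambda\,\lVert \alpha\rVert_1,
    \qquad \hat{x}_{S} = D\hat{\alpha}
\end{aligned}
```

When T is square and invertible the two problems coincide. For a redundant T the synthesis problem searches over all coefficient vectors α, not only those of the form Tx, so the two estimates generally differ; this is presumably what the slide's "However" refers to.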

Basis Pursuit As Objective
Our objective: [equation, with a diagram of the residual Dα − y]
Getting a sparse solution implies that y is composed of few atoms from D.

Sequential Coordinate Descent
Our objective: [equation]
• The unknown, α, has k entries.
• How about optimizing w.r.t. each of them sequentially?
• The objective per each becomes: [equation]
[Flowchart: Set j=1 → Fix all entries of α apart from the j-th one → Optimize with respect to α_j → j=j+1 mod k → repeat]

We Get Sequential Shrinkage
BEFORE: We had this 1-D function to minimize: [equation], and the solution was: [equation]
NOW: Our 1-D objective is: [equation], and the solution now is: [equation]
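The sequential scheme can be sketched in a few lines. The code below assumes the objective 0.5*||Dα − y||² + lam*||α||₁ (the scaling and the name `lam` are assumptions), stores D as a list of nonzero columns, and uses hypothetical helper names. With all other entries fixed, the 1-D problem in α_j is a scaled copy of the scalar shrinkage problem, so each update is one soft threshold.

```python
import math

def soft_threshold(a, t):
    """Minimizer of 0.5*(z - a)**2 + t*abs(z)."""
    return math.copysign(max(abs(a) - t, 0.0), a)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def objective(cols, y, alpha, lam):
    """0.5*||D alpha - y||^2 + lam*||alpha||_1, with D given by its columns."""
    r = [yi - sum(c[i] * a for c, a in zip(cols, alpha)) for i, yi in enumerate(y)]
    return 0.5 * dot(r, r) + lam * sum(abs(a) for a in alpha)

def sequential_shrinkage(cols, y, lam, sweeps=50):
    """Cyclic coordinate descent: one shrinkage per atom, one atom at a time."""
    k = len(cols)
    alpha = [0.0] * k
    r = list(y)  # residual y - D alpha; equals y because alpha starts at zero
    for _ in range(sweeps):
        for j in range(k):
            d = cols[j]              # this step needs the j-th column explicitly
            nrm2 = dot(d, d)
            # In alpha_j alone: 0.5*nrm2*(z - a)**2 + lam*|z| + const
            a = alpha[j] + dot(d, r) / nrm2
            new = soft_threshold(a, lam / nrm2)
            step = new - alpha[j]
            if step != 0.0:
                r = [ri - step * di for ri, di in zip(r, d)]
                alpha[j] = new
    return alpha

if __name__ == "__main__":
    s = 2 ** -0.5
    cols = [[1.0, 0.0], [0.0, 1.0], [s, s]]   # a tiny redundant dictionary
    y = [1.0, 1.0]
    alpha = sequential_shrinkage(cols, y, lam=0.1, sweeps=200)
    print(alpha, objective(cols, y, alpha, 0.1))
```

Note that every inner step touches a single column of D, which is exactly the access pattern that fast implicit transforms do not provide.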

Sequential? Not Good!!
[Flowchart: Set j=1 → Fix all entries of α apart from the j-th one → Optimize with respect to α_j → j=j+1 mod k → repeat]
• This method requires drawing one column at a time from D.
• In most transforms this is not convenient at all!!!
• This also means that MP and its variants are inadequate.

How About Parallel Shrinkage?
Our objective: [equation]
• Assume a current solution α_n.
• Using the previous method, we have k descent directions obtained by a simple shrinkage.
• How about taking all of them at once, with a proper relaxation?
• A little bit of math leads to …
[Flowchart: For j=1:k, compute the descent direction per j: v_j. Update the solution by: [equation]]

Parallel Coordinate Descent (PCD)
[Update equation, with annotated parts:]
• The synthesis error
• Back-projection to the signal domain
• Normalize by a diagonal matrix
• Shrinkage operation
• Update by exact line-search
At all stages, the dictionary is applied as a whole, either directly, or via its adjoint.
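The annotated stages can be put together in a minimal sketch. It assumes the objective 0.5*||Dα − y||² + lam*||α||₁, takes the diagonal normalization as Q = diag(DᵀD), and replaces the exact line search with a numerical ternary search over a bounded step range (valid because the objective is convex along the search direction). The function names, `lam`, and `mu_max` are hypothetical, and this is an illustration, not the authors' implementation.

```python
import math

def soft_threshold(a, t):
    return math.copysign(max(abs(a) - t, 0.0), a)

def apply_D(cols, alpha):
    """D @ alpha, with D stored as a list of columns."""
    n = len(cols[0])
    return [sum(c[i] * a for c, a in zip(cols, alpha)) for i in range(n)]

def apply_Dt(cols, v):
    """D.T @ v (the adjoint)."""
    return [sum(ci * vi for ci, vi in zip(c, v)) for c in cols]

def objective(cols, y, alpha, lam):
    r = [p - q for p, q in zip(apply_D(cols, alpha), y)]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(a) for a in alpha)

def pcd(cols, y, lam, iters=100, mu_max=4.0):
    k = len(cols)
    q = [sum(ci * ci for ci in c) for c in cols]   # diagonal of D^T D (nonzero columns assumed)
    alpha = [0.0] * k                               # zero initialization
    for _ in range(iters):
        err = [yi - pi for yi, pi in zip(y, apply_D(cols, alpha))]  # synthesis error
        back = apply_Dt(cols, err)                                   # back-projection
        shrunk = [soft_threshold(alpha[j] + back[j] / q[j], lam / q[j])
                  for j in range(k)]                                 # normalize, then shrink
        d = [s - a for s, a in zip(shrunk, alpha)]                   # all k directions at once

        def f(mu):
            return objective(cols, y, [a + mu * dj for a, dj in zip(alpha, d)], lam)

        lo, hi = 0.0, mu_max
        for _ in range(60):          # ternary search on a convex 1-D function
            m1 = lo + (hi - lo) / 3.0
            m2 = hi - (hi - lo) / 3.0
            if f(m1) <= f(m2):
                hi = m2
            else:
                lo = m1
        mu = 0.5 * (lo + hi)
        if f(mu) < f(0.0):           # accept only steps that decrease the objective
            alpha = [a + mu * dj for a, dj in zip(alpha, d)]
    return alpha

if __name__ == "__main__":
    s = 2 ** -0.5
    cols = [[1.0, 0.0], [0.0, 1.0], [s, s]]
    y = [1.0, 1.0]
    alpha = pcd(cols, y, lam=0.1)
    print(alpha, objective(cols, y, alpha, 0.1))
```

Inside the loop the dictionary is only used through `apply_D` and `apply_Dt`, so either could be swapped for a fast transform and its adjoint without extracting individual atoms.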

PCD – The First Iteration
Assume:
• Zero initialization
• D is a tight frame with normalized columns (Q=I)
• Line search is replaced with [equation]

Relation to Simple Shrinkage?
[Diagram: Apply Redundant Transform → LUT → Apply its (pseudo) Inverse Transform]
The first iteration in our algorithm = the intuitive shrinkage!!!

PCD – Convergence Analysis
• We have proven convergence to the global minimizer of the BPDN objective function (with smoothing): [equation]
• Approximate asymptotic convergence rate analysis yields: [equation], where M and m are the largest and smallest eigenvalues of [matrix], respectively (H is the Hessian).
• This rate equals that of the Steepest-Descent algorithm, preconditioned by the Hessian's diagonal.
• Substantial further speed-up can be obtained using the subspace optimization algorithm (SESOP) [Zibulevsky and Narkis 2004].
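The rate formula itself is not reproduced in this transcript. For orientation only, the textbook bound for steepest descent with exact line search on a quadratic, applied here to a diagonally preconditioned Hessian, reads as follows. Taking Q = diag(H) and reading M and m as the extreme eigenvalues of the preconditioned matrix are assumptions; the slide's own expression may differ.

```latex
f(\alpha_{n+1}) - f(\alpha^{\ast})
\;\le\;
\left(\frac{M - m}{M + m}\right)^{2}
\bigl[\, f(\alpha_{n}) - f(\alpha^{\ast}) \,\bigr],
\qquad
M,\ m = \lambda_{\max},\ \lambda_{\min}\ \text{of}\ Q^{-1/2} H\, Q^{-1/2}
```

As a worked instance, a condition number M/m = 10 gives a per-iteration factor of (9/11)² ≈ 0.67, so the bound tightens quickly as the diagonal scaling brings M/m closer to 1.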

Image Denoising
[Plot: objective function value (0 to 9000) versus iterations (0 to 18) for Iterative Shrinkage, Steepest Descent, Conjugate Gradient, and Truncated Newton]
• The matrix M gives a variance per each coefficient, learned from the corrupted image.
• D is the contourlet transform (recent version).
• The length of α: ~1e+6.
• The Seq. Shrinkage algorithm cannot be simulated for this dimension.

Image Denoising
[Plot: PSNR (22 to 32) versus iterations (0 to 18) for Iterative Shrinkage, Steepest Descent, Conjugate Gradient, and Truncated Newton]
Even though one iteration of our algorithm is equivalent in complexity to that of the SD, the performance is much better.

Image Denoising
[Figure panels:]
• Original Image
• Noisy Image with σ=20
• Iterated Shrinkage, first iteration: PSNR=28.30 dB
• Iterated Shrinkage, second iteration: PSNR=31.05 dB

Closely Related Work
• Several recent works have devised iterative shrinkage algorithms, each with a different motivation:
   • E-M algorithm for image deblurring [Figueiredo & Nowak 2003].
   • Surrogate functionals for deblurring as above [Daubechies, Defrise, & De-Mol, 2004] and [Figueiredo & Nowak 2005].
   • PCD minimization for denoising (as shown above) [Elad, 2005].
• While these algorithms are similar, they are in fact different. Our recent work has shown that:
   • PCD gives faster convergence, compared to the surrogate algorithms.
   • All the above methods can be further improved by SESOP, leading to …

Agenda
1. Bayesian Point of View – a Unitary Transform: Optimality of shrinkage
2. What About Redundant Representation? Is shrinkage relevant? Why? How?
3. Conclusions

Conclusion
• Shrinkage is an appealing signal denoising technique.
• When optimal? For additive Gaussian noise and unitary transforms.
• What if the transform is redundant? Option 1: apply sequential coordinate descent, which leads to a sequential shrinkage algorithm.
• How to avoid the need to extract atoms? Go parallel.
• How? Compute all the CD directions, and use the average.
• Getting what? We obtain an easy-to-implement iterated shrinkage algorithm (PCD). This algorithm has been thoroughly studied (convergence, rate, comparisons).

THANK YOU!!
These slides and accompanying papers can be found at http://www.cs.technion.ac.il/~elad