CS 5321 Numerical Optimization 10: Least Squares (9/15/2020)


Least-squares problems
• Linear least-squares problems
  • QR method
• Nonlinear least-squares problems
  • Gradient and Hessian of nonlinear LSP
  • Gauss-Newton method
  • Levenberg-Marquardt method
  • Methods for large residual problems

Example of linear least squares
• Model: $y = \beta_1 + \beta_2 x$ (from Wikipedia)
• Fitted coefficients: $\beta_1 = 3.5$, $\beta_2 = 1.4$. The fitted line is $y = 3.5 + 1.4x$.
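As a minimal sketch of how such a fit is computed with NumPy, assuming the four data points (1, 6), (2, 5), (3, 7), (4, 10) from the Wikipedia example (they reproduce the coefficients above):

```python
import numpy as np

# Data points assumed from the Wikipedia linear least squares example;
# they reproduce beta_1 = 3.5, beta_2 = 1.4.
t = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Design matrix: a column of ones (for beta_1) and a column of t (for beta_2).
A = np.column_stack([np.ones_like(t), t])

# Solve min ||A*beta - y||^2 with NumPy's least-squares routine.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta)  # [3.5 1.4]
```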

Linear least-squares problems
• A linear least-squares problem is $f(x) = \frac{1}{2}\|Ax - y\|^2$.
• Its gradient is $\nabla f(x) = A^T(Ax - y)$.
• The optimal solution is at $\nabla f(x) = 0$, i.e. $A^TAx = A^Ty$.
  • $A^TAx = A^Ty$ is called the normal equation.
• QR method: perform a QR decomposition of the matrix, $A = QR$. Then
  $$A^TAx = R^TQ^TQRx = R^TQ^Ty,$$
  and since $Q^TQ = I$ this reduces to $R^TRx = R^TQ^Ty$.
  • $R^T$ is invertible, so $Rx = Q^Ty$ and the solution is $x = R^{-1}Q^Ty$.
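A minimal NumPy sketch of the QR method above, assuming $A$ has full column rank so that $R$ is invertible (the function name `lls_qr` is illustrative):

```python
import numpy as np
from scipy.linalg import solve_triangular

def lls_qr(A, y):
    """Solve min_x 0.5*||Ax - y||^2 via QR, as on this slide."""
    Q, R = np.linalg.qr(A)               # thin QR: A = Q R, R upper triangular
    # R^T is invertible for full-column-rank A, so R x = Q^T y.
    return solve_triangular(R, Q.T @ y)
```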

Example of nonlinear LS
• Model: $\phi(x; t) = x_1 + tx_2 + t^2x_3 + x_4e^{-x_5t}$
• Find $(x_1, x_2, x_3, x_4, x_5)$ to minimize
  $$\frac{1}{2}\sum_{j=1}^m \left[\phi(x; t_j) - y_j\right]^2$$
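As a sketch, a model of this form can be fit with SciPy's general nonlinear least-squares driver; the data arrays and starting point below are hypothetical placeholders:

```python
import numpy as np
from scipy.optimize import least_squares

def phi(x, t):
    # Model from the slide: x1 + t*x2 + t^2*x3 + x4*exp(-x5*t)
    return x[0] + t * x[1] + t**2 * x[2] + x[3] * np.exp(-x[4] * t)

def residuals(x, t, y):
    # r_j(x) = phi(x; t_j) - y_j
    return phi(x, t) - y

# Hypothetical data: samples of the model plus a little noise.
rng = np.random.default_rng(0)
t_data = np.linspace(0.0, 5.0, 50)
x_true = np.array([1.0, -0.5, 0.2, 2.0, 1.5])
y_data = phi(x_true, t_data) + 0.01 * rng.normal(size=t_data.size)

sol = least_squares(residuals, np.ones(5), args=(t_data, y_data))
print(sol.x)  # close to x_true
```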

Gradient and Hessian of LSP
• The objective function of the least squares problem is
  $$f(x) = \frac{1}{2}\sum_{j=1}^m r_j^2(x),$$
  where the $r_j$ are functions of $n$ variables.
• Define the residual vector and the Jacobian
  $$R(x) = \begin{pmatrix} r_1(x) \\ r_2(x) \\ \vdots \\ r_m(x) \end{pmatrix}, \qquad J(x) = \begin{pmatrix} \nabla r_1(x)^T \\ \nabla r_2(x)^T \\ \vdots \\ \nabla r_m(x)^T \end{pmatrix}$$
• Gradient: $\nabla f(x) = \sum_{j=1}^m r_j(x)\nabla r_j(x) = J(x)^TR(x)$
• Hessian: $\nabla^2 f(x) = J(x)^TJ(x) + \sum_{j=1}^m r_j(x)\nabla^2 r_j(x)$
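These quantities are simple to form once the Jacobian is available. Below is a sketch using an analytic Jacobian for the hypothetical model `phi` from the previous sketch (function names are illustrative):

```python
import numpy as np

def jacobian(x, t):
    # Rows are grad r_j(x)^T for the model phi on the earlier slide.
    e = np.exp(-x[4] * t)
    return np.column_stack([
        np.ones_like(t),   # d phi / d x1
        t,                 # d phi / d x2
        t**2,              # d phi / d x3
        e,                 # d phi / d x4
        -x[3] * t * e,     # d phi / d x5
    ])

def grad_f(x, t, y):
    # grad f(x) = J(x)^T R(x)
    return jacobian(x, t).T @ (phi(x, t) - y)

def gn_hessian(x, t):
    # First term of the Hessian: J(x)^T J(x) (the Gauss-Newton matrix)
    J = jacobian(x, t)
    return J.T @ J
```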

Gauss-Newton method
• Recall
  $$\nabla^2 f(x) = J(x)^TJ(x) + \sum_{j=1}^m r_j(x)\nabla^2 r_j(x)$$
• Gauss-Newton uses the Hessian approximation
  $$\nabla^2 f(x) \approx J(x)^TJ(x)$$
  • It is a good approximation if $\|R\|$ is small.
  • This is the matrix of the normal equation.
• Usually used with the line search technique.
• At each step, replace $f(x) = \frac{1}{2}\sum_{j=1}^m r_j^2(x)$ with the linear least-squares model in the step $p$:
  $$\frac{1}{2}\|Jp + R\|^2$$
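A minimal Gauss-Newton loop under these formulas, a sketch reusing the hypothetical `phi`, `jacobian`, and `lls_qr` from the earlier sketches (a unit step stands in for a proper Wolfe line search):

```python
import numpy as np

def gauss_newton(x0, t, y, n_iter=20):
    """Minimal Gauss-Newton sketch: each iteration solves the linear
    least-squares subproblem min_p 0.5*||J p + R||^2 and steps along p."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        J = jacobian(x, t)
        R = phi(x, t) - y
        p = lls_qr(J, -R)   # Gauss-Newton step
        x = x + p           # in practice, scale by alpha_k from a Wolfe line search
    return x
```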

Convergence of Gauss-Newton
• Suppose each $r_j$ is Lipschitz continuously differentiable in a neighborhood $N$ of $\{x \mid f(x) \le f(x_0)\}$, and the Jacobians satisfy $\|J(x)z\| \ge \gamma\|z\|$ for some $\gamma > 0$. Then the Gauss-Newton method, with step lengths $\alpha_k$ that satisfy the Wolfe conditions, has
  $$\lim_{k\to\infty} J_k^TR_k = 0$$

Levenberg-Marquardt method
• Gauss-Newton + trust region.
• The subproblem becomes
  $$\min_p \frac{1}{2}\|Jp + R\|^2 \quad \text{subject to } \|p\| \le \Delta_k$$
• Optimality conditions (recall Chapter 4): for some $\lambda \ge 0$,
  $$(J^TJ + \lambda I)p = -J^TR, \qquad \lambda(\Delta - \|p\|) = 0$$
• Equivalent linear least-squares problem:
  $$\min_p \frac{1}{2}\left\|\begin{pmatrix} J \\ \sqrt{\lambda}\,I \end{pmatrix}p + \begin{pmatrix} R \\ 0 \end{pmatrix}\right\|^2$$
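One way to realize this is to solve the augmented linear least-squares problem directly for a given $\lambda$; a sketch reusing the hypothetical `lls_qr` (the strategy for updating $\lambda$ or $\Delta_k$ is omitted):

```python
import numpy as np

def lm_step(J, R, lam):
    """One Levenberg-Marquardt step for a fixed lambda: solve
    min_p 0.5*|| [J; sqrt(lam) I] p + [R; 0] ||^2, whose normal
    equations are (J^T J + lam*I) p = -J^T R."""
    n = J.shape[1]
    J_aug = np.vstack([J, np.sqrt(lam) * np.eye(n)])
    R_aug = np.concatenate([R, np.zeros(n)])
    return lls_qr(J_aug, -R_aug)
```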

Convergence of Levenberg-Marquardt
• Suppose $L = \{x \mid f(x) \le f(x_0)\}$ is bounded and each $r_j$ is Lipschitz continuously differentiable in a neighborhood $N$ of $L$. Assume that for each $k$, the approximate solution $p_k$ of the Levenberg-Marquardt subproblem satisfies
  $$m_k(0) - m_k(p_k) \ge c_1\|J_k^Tr_k\|\min\left(\Delta_k, \frac{\|J_k^Tr_k\|}{\|J_k^TJ_k\|}\right)$$
  for some constant $c_1 > 0$, and that $\|p_k\| \le \gamma\Delta_k$ for some $\gamma \ge 1$. Then
  $$\lim_{k\to\infty} J_k^Tr_k = 0$$

Large residual problem
• Recall
  $$\nabla^2 f(x) = J(x)^TJ(x) + \sum_{j=1}^m r_j(x)\nabla^2 r_j(x)$$
• When the second term of the Hessian is large:
  • Use a quasi-Newton method to approximate the second term.
  • The secant equation for $\nabla^2 r_j(x)$ is
    $$(B_j)_{k+1}(x_{k+1} - x_k) = \nabla r_j(x_{k+1}) - \nabla r_j(x_k)$$
  • The secant equation of the second term and the update formula are on the next slide.

• The secant equation of the second term:
  $$S_{k+1}(x_{k+1} - x_k) = \sum_{j=1}^m r_j(x_{k+1})(B_j)_{k+1}(x_{k+1} - x_k) = \sum_{j=1}^m r_j(x_{k+1})\left[\nabla r_j(x_{k+1}) - \nabla r_j(x_k)\right] = J_{k+1}^TR_{k+1} - J_k^TR_{k+1}$$
• The Dennis-Gay-Welsch update formula:
  $$S_{k+1} = S_k + \frac{(z - S_ks)y^T + y(z - S_ks)^T}{y^Ts} - \frac{(z - S_ks)^Ts}{(y^Ts)^2}yy^T$$
  where
  $$s = x_{k+1} - x_k, \qquad y = J_{k+1}^Tr_{k+1} - J_k^Tr_k, \qquad z = J_{k+1}^Tr_{k+1} - J_k^Tr_{k+1}$$
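A direct NumPy transcription of this update, a sketch assuming $s$, $y$, $z$ as defined above (the function name `dgw_update` is illustrative); the result satisfies the secant equation $S_{k+1}s = z$ by construction:

```python
import numpy as np

def dgw_update(S, s, y, z):
    """Dennis-Gay-Welsch update of the approximation S_k to the
    second Hessian term, following the formula on this slide.

    s : x_{k+1} - x_k
    y : J_{k+1}^T r_{k+1} - J_k^T r_k
    z : J_{k+1}^T r_{k+1} - J_k^T r_{k+1}
    """
    w = z - S @ s          # residual of the secant equation for S_k
    ys = y @ s             # scalar y^T s
    return (S
            + (np.outer(w, y) + np.outer(y, w)) / ys
            - ((w @ s) / ys**2) * np.outer(y, y))
```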