Performance Surfaces

Taylor Series Expansion
$$F(x) = F(x^*) + \left.\frac{d}{dx}F(x)\right|_{x=x^*}(x - x^*) + \frac{1}{2}\left.\frac{d^2}{dx^2}F(x)\right|_{x=x^*}(x - x^*)^2 + \cdots + \frac{1}{n!}\left.\frac{d^n}{dx^n}F(x)\right|_{x=x^*}(x - x^*)^n + \cdots$$
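A minimal sketch of how truncating the expansion behaves in practice. The function F(x) = cos(x), the expansion point x* = 0, and the helper name `taylor_approx` are illustrative assumptions, not taken from the slides.

```python
import math

def taylor_approx(derivs_at_x_star, x_star, x):
    """Sum Taylor terms; derivs_at_x_star[k] is the k-th derivative of F at x_star."""
    return sum(
        d / math.factorial(k) * (x - x_star) ** k
        for k, d in enumerate(derivs_at_x_star)
    )

# Assumed example: F(x) = cos(x) about x* = 0.
# Its derivatives at 0 are 1, 0, -1, 0, 1 (orders 0 through 4).
cos_derivs = [1.0, 0.0, -1.0, 0.0, 1.0]

x = 0.5
for order in (0, 2, 4):
    approx = taylor_approx(cos_derivs[: order + 1], 0.0, x)
    # Print the order, the approximation, and its absolute error.
    print(order, approx, abs(approx - math.cos(x)))
```

Each added term shrinks the error near x*; farther from x* more terms are needed for the same accuracy.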

Example
Taylor series of F(x) about x* = 0: [equation missing from the extracted text]
Taylor series approximations: [equations missing from the extracted text]

Plot of Approximations
[Figure: plot of the Taylor series approximations.]

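Directional Derivatives
First derivative (slope) of $F(\mathbf{x})$ along the $x_i$ axis: $\partial F(\mathbf{x}) / \partial x_i$ (the $i$th element of the gradient)
Second derivative (curvature) of $F(\mathbf{x})$ along the $x_i$ axis: $\partial^2 F(\mathbf{x}) / \partial x_i^2$ (the $(i, i)$ element of the Hessian)
First derivative (slope) of $F(\mathbf{x})$ along vector $\mathbf{p}$: $$\frac{\mathbf{p}^T \nabla F(\mathbf{x})}{\|\mathbf{p}\|}$$
Second derivative (curvature) of $F(\mathbf{x})$ along vector $\mathbf{p}$: $$\frac{\mathbf{p}^T \nabla^2 F(\mathbf{x})\, \mathbf{p}}{\|\mathbf{p}\|^2}$$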
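The two directional-derivative formulas can be sketched in a few lines. The function F(x) = x1^2 + 2*x2^2, the evaluation point, the direction p, and the helper name `directional_derivatives` are assumptions chosen for illustration.

```python
import math

def directional_derivatives(grad, hess, p):
    """Return (slope, curvature) of F along p, given the gradient and Hessian at a point."""
    norm_sq = sum(pi * pi for pi in p)
    # Slope: p^T grad / ||p||
    slope = sum(pi * gi for pi, gi in zip(p, grad)) / math.sqrt(norm_sq)
    # Curvature: p^T H p / ||p||^2
    hess_p = [sum(hij * pj for hij, pj in zip(row, p)) for row in hess]
    curvature = sum(pi * hpi for pi, hpi in zip(p, hess_p)) / norm_sq
    return slope, curvature

# Assumed example: F(x) = x1^2 + 2*x2^2 evaluated at x = (0.5, 0.5).
x = (0.5, 0.5)
grad = [2 * x[0], 4 * x[1]]        # [dF/dx1, dF/dx2]
hess = [[2.0, 0.0], [0.0, 4.0]]    # constant because F is quadratic

slope, curvature = directional_derivatives(grad, hess, p=[2.0, -1.0])
print(slope, curvature)
```

Here p is perpendicular to the gradient, so the slope along p is zero even though the curvature is not.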

Directional Derivatives: Plots
[Figure: plots over the $x_1$ and $x_2$ axes; the labels 1.4, 1.3, 1.0, 0.5, and 0.0 appear on the plot.]

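Minima
Strong Minimum: The point $\mathbf{x}^*$ is a strong minimum of $F(\mathbf{x})$ if a scalar $\delta > 0$ exists, such that $F(\mathbf{x}^*) < F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x}$ such that $\delta > \|\Delta\mathbf{x}\| > 0$.
Global Minimum: The point $\mathbf{x}^*$ is a unique global minimum of $F(\mathbf{x})$ if $F(\mathbf{x}^*) < F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x} \neq \mathbf{0}$.
Weak Minimum: The point $\mathbf{x}^*$ is a weak minimum of $F(\mathbf{x})$ if it is not a strong minimum, and a scalar $\delta > 0$ exists, such that $F(\mathbf{x}^*) \le F(\mathbf{x}^* + \Delta\mathbf{x})$ for all $\Delta\mathbf{x}$ such that $\delta > \|\Delta\mathbf{x}\| > 0$.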

Scalar Example
[Figure: plot of a scalar function with labeled points: Strong Maximum, Strong Minimum, Global Minimum.]

Vector Example

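First-Order Optimality Condition
$$F(\mathbf{x}) = F(\mathbf{x}^* + \Delta\mathbf{x}) = F(\mathbf{x}^*) + \left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} + \frac{1}{2}\Delta\mathbf{x}^T \left.\nabla^2 F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} + \cdots$$
For small $\Delta\mathbf{x}$: $$F(\mathbf{x}^* + \Delta\mathbf{x}) \cong F(\mathbf{x}^*) + \left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x}$$
If $\mathbf{x}^*$ is a minimum, this implies: $$\left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} \ge 0$$
If $\left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} > 0$, then $$F(\mathbf{x}^* - \Delta\mathbf{x}) \cong F(\mathbf{x}^*) - \left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} < F(\mathbf{x}^*)$$
But this would imply that $\mathbf{x}^*$ is not a minimum. Therefore $$\left.\nabla F(\mathbf{x})^T\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} = 0$$
Since this must be true for every $\Delta\mathbf{x}$, $$\left.\nabla F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*} = \mathbf{0}$$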

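Second-Order Condition
If the first-order condition is satisfied (zero gradient), then $$F(\mathbf{x}^* + \Delta\mathbf{x}) = F(\mathbf{x}^*) + \frac{1}{2}\Delta\mathbf{x}^T \left.\nabla^2 F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} + \cdots$$
A strong minimum will exist at $\mathbf{x}^*$ if $$\Delta\mathbf{x}^T \left.\nabla^2 F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*} \Delta\mathbf{x} > 0$$ for any $\Delta\mathbf{x} \neq \mathbf{0}$. Therefore the Hessian matrix must be positive definite.
A matrix $\mathbf{A}$ is positive definite if $\mathbf{z}^T \mathbf{A} \mathbf{z} > 0$ for any $\mathbf{z} \neq \mathbf{0}$. This is a sufficient condition for optimality.
A necessary condition is that the Hessian matrix be positive semidefinite. A matrix $\mathbf{A}$ is positive semidefinite if $\mathbf{z}^T \mathbf{A} \mathbf{z} \ge 0$ for any $\mathbf{z}$.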

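Example
$$F(\mathbf{x}) = x_1^2 + 2 x_1 x_2 + 2 x_2^2 + x_1$$
Setting the gradient to zero gives the stationary point: $$\nabla F(\mathbf{x}) = \begin{bmatrix} 2x_1 + 2x_2 + 1 \\ 2x_1 + 4x_2 \end{bmatrix} = \mathbf{0} \quad\Rightarrow\quad \mathbf{x}^* = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}$$
$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}$$ (Not a function of $\mathbf{x}$ in this case.)
To test the definiteness, check the eigenvalues of the Hessian. If the eigenvalues are all greater than zero, the Hessian is positive definite. $$\lambda^2 - 6\lambda + 4 = 0 \quad\Rightarrow\quad \lambda = 3 \pm \sqrt{5} \approx 0.76,\ 5.24$$
Both eigenvalues are positive, therefore $\mathbf{x}^*$ is a strong minimum.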
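The example can be checked numerically. The sketch below hand-derives the gradient and Hessian of F(x) = x1^2 + 2*x1*x2 + 2*x2^2 + x1, solves for the stationary point, and tests the Hessian eigenvalues; the helper name `sym2_eigenvalues` is an assumption, and the closed-form eigenvalue formula applies only to symmetric 2x2 matrices.

```python
import math

# Gradient of F: [2*x1 + 2*x2 + 1, 2*x1 + 4*x2]; Hessian (constant): [[2, 2], [2, 4]].
H = [[2.0, 2.0], [2.0, 4.0]]
b = [-1.0, 0.0]  # gradient = H x - b, so the stationary point solves H x = b

# Solve the 2x2 linear system with Cramer's rule.
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
x_star = [
    (b[0] * H[1][1] - H[0][1] * b[1]) / det,
    (H[0][0] * b[1] - b[0] * H[1][0]) / det,
]

def sym2_eigenvalues(m):
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, off, c = m[0][0], m[0][1], m[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, off)
    return mean - radius, mean + radius

lam_min, lam_max = sym2_eigenvalues(H)
is_strong_minimum = lam_min > 0  # positive definite Hessian at a stationary point
print(x_star, lam_min, lam_max, is_strong_minimum)
```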

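Quadratic Functions
$$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T \mathbf{A} \mathbf{x} + \mathbf{d}^T \mathbf{x} + c \qquad \text{(symmetric } \mathbf{A}\text{)}$$
Useful properties of gradients: $$\nabla(\mathbf{h}^T \mathbf{x}) = \nabla(\mathbf{x}^T \mathbf{h}) = \mathbf{h}$$ $$\nabla(\mathbf{x}^T \mathbf{Q} \mathbf{x}) = \mathbf{Q}\mathbf{x} + \mathbf{Q}^T \mathbf{x} = 2\mathbf{Q}\mathbf{x} \quad \text{(for symmetric } \mathbf{Q}\text{)}$$
Gradient of Quadratic Function: $$\nabla F(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{d}$$
Hessian of Quadratic Function: $$\nabla^2 F(\mathbf{x}) = \mathbf{A}$$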
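For a quadratic F(x) = 1/2 x^T A x + d^T x + c with symmetric A, the gradient is A x + d. One way to sanity-check this is to compare it against a central-difference estimate. The sketch below does so; the helper names and the test point are assumptions, and the values of A and d are chosen to match the worked example F(x) = x1^2 + 2*x1*x2 + 2*x2^2 + x1.

```python
def quad(A, d, c, x):
    """Evaluate F(x) = 1/2 x^T A x + d^T x + c."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return (0.5 * sum(xi * axi for xi, axi in zip(x, Ax))
            + sum(di * xi for di, xi in zip(d, x)) + c)

def quad_grad(A, d, x):
    """Analytic gradient A x + d (valid when A is symmetric)."""
    return [sum(a * xj for a, xj in zip(row, x)) + di for row, di in zip(A, d)]

def numeric_grad(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

A = [[2.0, 2.0], [2.0, 4.0]]
d = [1.0, 0.0]
c = 0.0
x = [0.3, -0.7]  # arbitrary test point

analytic = quad_grad(A, d, x)
numeric = numeric_grad(lambda v: quad(A, d, c, v), x)
print(analytic, numeric)
```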

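Eigensystem of the Hessian
Consider a quadratic function which has a stationary point at the origin, and whose value there is zero: $$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T \mathbf{A} \mathbf{x}$$
Perform a similarity transform on the Hessian matrix, using the eigenvectors as the new basis vectors. Since the Hessian matrix is symmetric, its eigenvectors can be chosen to be mutually orthogonal (and normalized to unit length): $$\mathbf{B} = \begin{bmatrix} \mathbf{z}_1 & \mathbf{z}_2 & \cdots & \mathbf{z}_n \end{bmatrix}, \qquad \mathbf{B}^{-1} = \mathbf{B}^T$$
$$\mathbf{A}' = \mathbf{B}^T \mathbf{A} \mathbf{B} = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = \boldsymbol{\Lambda}, \qquad \mathbf{A} = \mathbf{B} \boldsymbol{\Lambda} \mathbf{B}^T$$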
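A small sketch of the change of basis: for a symmetric matrix, stacking unit eigenvectors as the columns of B makes B^T A B diagonal. The matrix A = [[2, 1], [1, 2]] and its hand-computed eigenvectors are assumptions chosen for illustration.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[2.0, 1.0], [1.0, 2.0]]  # assumed symmetric Hessian

# Columns of B are unit eigenvectors:
# z1 = (1, -1)/sqrt(2) with eigenvalue 1, z2 = (1, 1)/sqrt(2) with eigenvalue 3.
s = 1 / math.sqrt(2)
B = [[s, s],
     [-s, s]]

A_prime = matmul(matmul(transpose(B), A), B)
print(A_prime)  # approximately diagonal, with the eigenvalues on the diagonal
```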

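Second Directional Derivative
$$\frac{\mathbf{p}^T \nabla^2 F(\mathbf{x})\, \mathbf{p}}{\|\mathbf{p}\|^2} = \frac{\mathbf{p}^T \mathbf{A} \mathbf{p}}{\|\mathbf{p}\|^2}$$
Represent $\mathbf{p}$ with respect to the eigenvectors (new basis): $\mathbf{p} = \mathbf{B}\mathbf{c}$
$$\frac{\mathbf{p}^T \mathbf{A} \mathbf{p}}{\mathbf{p}^T \mathbf{p}} = \frac{\mathbf{c}^T \mathbf{B}^T (\mathbf{B} \boldsymbol{\Lambda} \mathbf{B}^T) \mathbf{B} \mathbf{c}}{\mathbf{c}^T \mathbf{B}^T \mathbf{B} \mathbf{c}} = \frac{\mathbf{c}^T \boldsymbol{\Lambda} \mathbf{c}}{\mathbf{c}^T \mathbf{c}} = \frac{\sum_{i=1}^{n} \lambda_i c_i^2}{\sum_{i=1}^{n} c_i^2}$$
$$\lambda_{min} \le \frac{\mathbf{p}^T \mathbf{A} \mathbf{p}}{\|\mathbf{p}\|^2} \le \lambda_{max}$$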

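Eigenvector (Largest Eigenvalue)
Let $\mathbf{p} = \mathbf{z}_{max}$, the eigenvector associated with $\lambda_{max}$:
$$\mathbf{c} = \mathbf{B}^T \mathbf{p} = \mathbf{B}^T \mathbf{z}_{max} = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^T$$
$$\frac{\mathbf{z}_{max}^T \mathbf{A}\, \mathbf{z}_{max}}{\|\mathbf{z}_{max}\|^2} = \frac{\sum_{i=1}^{n} \lambda_i c_i^2}{\sum_{i=1}^{n} c_i^2} = \lambda_{max}$$
The eigenvalues represent curvature (second derivatives) along the eigenvectors (the principal axes).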
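The bound on the second directional derivative can be illustrated by sweeping directions around the unit circle. The matrix A = [[2, 1], [1, 2]] (eigenvalues 1 and 3, eigenvectors along (1, -1) and (1, 1)) and the helper name `curvature_along` are assumptions for this sketch.

```python
import math

def curvature_along(A, p):
    """Second directional derivative p^T A p / ||p||^2 of a quadratic with Hessian A."""
    Ap = [sum(a * pj for a, pj in zip(row, p)) for row in A]
    return sum(pi * api for pi, api in zip(p, Ap)) / sum(pi * pi for pi in p)

A = [[2.0, 1.0], [1.0, 2.0]]   # assumed Hessian
lam_min, lam_max = 1.0, 3.0    # its eigenvalues, computed by hand

# Curvature in 360 evenly spaced directions.
values = [
    curvature_along(A, [math.cos(t), math.sin(t)])
    for t in (2 * math.pi * k / 360 for k in range(360))
]
print(min(values), max(values))

# Along the eigenvectors the curvature equals the corresponding eigenvalue.
print(curvature_along(A, [1.0, 1.0]), curvature_along(A, [1.0, -1.0]))
```

The vector p does not need unit length, because the division by ||p||^2 normalizes the result.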

Circular Hollow
(Any two independent vectors in the plane would work as eigenvectors.)

Elliptical Hollow

Elongated Saddle
$$F(\mathbf{x}) = -\frac{1}{4}x_1^2 - \frac{3}{2}x_1 x_2 - \frac{1}{4}x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} -0.5 & -1.5 \\ -1.5 & -0.5 \end{bmatrix} \mathbf{x}$$

Stationary Valley
$$F(\mathbf{x}) = \frac{1}{2}x_1^2 - x_1 x_2 + \frac{1}{2}x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \mathbf{x}$$

Quadratic Function Summary
• If the eigenvalues of the Hessian matrix are all positive, the function will have a single strong minimum.
• If the eigenvalues are all negative, the function will have a single strong maximum.
• If some eigenvalues are positive and other eigenvalues are negative, the function will have a single saddle point.
• If the eigenvalues are all nonnegative, but some eigenvalues are zero, then the function will either have a weak minimum or will have no stationary point.
• If the eigenvalues are all nonpositive, but some eigenvalues are zero, then the function will either have a weak maximum or will have no stationary point.
Stationary point of $F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T \mathbf{A} \mathbf{x} + \mathbf{d}^T \mathbf{x} + c$ (when $\mathbf{A}$ is invertible): $$\mathbf{x}^* = -\mathbf{A}^{-1}\mathbf{d}$$
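The summary rules can be sketched as a small classifier for symmetric 2x2 Hessians. The function names, the tolerance `tol`, and the returned label strings are assumptions; the test matrices are the Hessians of the elongated saddle and stationary valley examples, plus the Hessian [[2, 2], [2, 4]] from the earlier worked example.

```python
import math

def sym2_eigenvalues(m):
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, off, c = m[0][0], m[0][1], m[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, off)
    return mean - radius, mean + radius

def classify_quadratic(hessian, tol=1e-12):
    """Label a quadratic by its Hessian eigenvalues, following the summary rules."""
    lo, hi = sym2_eigenvalues(hessian)
    if lo > tol:
        return "strong minimum"
    if hi < -tol:
        return "strong maximum"
    if lo < -tol and hi > tol:
        return "saddle point"
    if hi > tol:   # remaining case with lo approximately zero
        return "weak minimum or no stationary point"
    if lo < -tol:  # remaining case with hi approximately zero
        return "weak maximum or no stationary point"
    return "zero Hessian (function is not quadratic in any direction)"

saddle = [[-0.5, -1.5], [-1.5, -0.5]]   # elongated saddle
valley = [[1.0, -1.0], [-1.0, 1.0]]     # stationary valley
bowl = [[2.0, 2.0], [2.0, 4.0]]         # earlier worked example

for name, h in (("saddle", saddle), ("valley", valley), ("bowl", bowl)):
    print(name, sym2_eigenvalues(h), classify_quadratic(h))
```

Whether the zero-eigenvalue cases actually have a stationary point depends on the linear term d, which this sketch does not inspect.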