Optimal Routing
Data Networks, second edition. Dimitri Bertsekas, Robert Gallager
Presenter: Y. P. Tsai
2020/12/7 OPLAB IM NTU

Agenda
• Introduction
• Performance Models
• Problem Formulation
• Characterization of Optimal Routing
• Feasible Direction Methods for Optimal Routing
• The Frank-Wolfe (Flow Deviation) Method
• Gradient Projection Methods for Optimal Routing

Introduction
• Routing is complex for a number of reasons:
  – Routing requires coordination among all the nodes of the subnet rather than just a pair of modules.
  – The routing system must cope with link and node failures.
  – The routing algorithm may need to modify its routes when some areas within the network become congested.
• A routing algorithm performs two main functions:
  – Selection of routes for the various origin-destination (OD) pairs.
  – Delivery of messages to their correct destination once the routes are selected. This is conceptually straightforward using a variety of protocols and data structures (known as routing tables).
• The focus will be on the first function.

Introduction
• Two main performance measures:
  – Throughput (quantity of service)
  – Average packet delay (quality of service)
• Throughput = offered load - rejected load
[Figure: interaction of flow control and routing. The offered load enters flow control, which rejects part of it; the accepted throughput enters routing, and the resulting delay is fed back to flow control.]

Introduction
• The traffic accepted into the network experiences an average delay per packet that depends on the routes chosen by the routing algorithm.
• However, throughput is also greatly affected by routing, because flow control typically balances throughput against delay.
[Figure: M/M/1 queue example.]

Introduction
• As the routing algorithm is more successful in keeping delay low, the flow control algorithm allows more traffic into the network.
[Figure: delay as a function of throughput.]

Introduction
• Shortest path routing has two drawbacks:
  – It uses only one path per OD pair, thereby potentially limiting the throughput of the network.
  – Its capability to adapt to changing traffic conditions is limited by its susceptibility to oscillations.
• Optimal routing:
  – Is based on the optimization of an average delay-like measure of performance.
  – Eliminates both of these disadvantages by splitting any OD pair traffic at strategic points and by shifting traffic gradually between alternative paths.
  – Rests on the mathematical theory of optimal multicommodity flows.

Performance Models
• To evaluate the performance of a routing algorithm, we need to quantify the notion of traffic congestion.
• In this section we formulate performance models based on the traffic arrival rates at the network links.
  – These models are called flow models.
• Traffic congestion in a data network can be quantified in terms of the statistics of the arrival processes of the network queues.
  – These statistics determine the distributions of queue length and packet waiting time at each link.
  – Desirable routing is associated with a small mean, and possibly a small variance, of the queue lengths.
• Unfortunately, there is usually no accurate analytical expression for the means or variances of the queue lengths.

Performance Models
• A convenient but somewhat imperfect alternative is to measure congestion at a link in terms of the average traffic carried by the link.
• More precisely, we assume that the statistics of the arrival process at each link $(i, j)$ change only due to routing updates, and that we measure congestion on $(i, j)$ via the traffic arrival rate $F_{ij}$. We call $F_{ij}$ (data units/sec) the flow of link $(i, j)$.
• These assumptions are reasonable when:
  – there is a large number of users for each OD pair
  – the traffic rate of each of these users is small relative to the total rate.

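Performance Models
• A natural cost function is the sum of per-link costs:
    $\sum_{(i,j)} D_{ij}(F_{ij})$    (5.29)
• Each function $D_{ij}$ is monotonically increasing.
• A frequently used formula is:
    $D_{ij}(F_{ij}) = \frac{F_{ij}}{C_{ij} - F_{ij}} + d_{ij} F_{ij}$    (5.30)
  where $C_{ij}$ is the transmission capacity of link $(i, j)$ and $d_{ij}$ is its processing and propagation delay.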

Performance Models
• With the formula (5.30), the cost function (5.29) becomes the average number of packets in the system, based on the hypothesis that each queue behaves as an M/M/1 queue of packets.
• Equations (5.29) and (5.30) express qualitatively that congestion sets in when a flow approaches the corresponding link capacity.
• Another cost function with similar qualitative properties is the maximum link utilization:
    $\max_{(i,j)} \left\{ \frac{F_{ij}}{C_{ij}} \right\}$
• In practice the choice between such cost functions typically makes little difference to the resulting routing, which indicates that one should employ the cost function that is easiest to optimize.
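The way this kind of cost grows as a flow approaches capacity can be illustrated with a short Python sketch. It assumes the link term of (5.30) has the form F/(C - F) + d·F; the function and argument names are invented for this illustration and do not come from the slides.

```python
def link_cost(flow, capacity, delay=0.0):
    """Assumed M/M/1-style link cost: F / (C - F) + d * F."""
    if not 0 <= flow < capacity:
        raise ValueError("flow must satisfy 0 <= F < C")
    return flow / (capacity - flow) + delay * flow


def total_cost(flows, capacities, delays):
    """Sum of the per-link costs, in the spirit of Eq. (5.29)."""
    return sum(link_cost(f, c, d) for f, c, d in zip(flows, capacities, delays))


if __name__ == "__main__":
    # The cost rises sharply as the flow gets close to the capacity of 10.
    for flow in (5.0, 9.0, 9.9):
        print(flow, link_cost(flow, 10.0))
```

The steep growth near capacity is what makes an optimizer spread traffic away from heavily loaded links.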

Problem Formulation
• For each pair $w = (i, j)$ of distinct nodes $i$ and $j$, the input traffic arrival process is assumed stationary with rate $r_w$.
• The routing objective is to divide each $r_w$ among the many paths from origin to destination in a way that the resulting total link flow pattern minimizes the cost function (5.29).

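Problem Formulation
• Let $W$ be the set of all OD pairs, $P_w$ the set of paths connecting the origin and destination of OD pair $w$, and $x_p$ the flow on path $p$. The collection of all path flows must satisfy the constraints
    $\sum_{p \in P_w} x_p = r_w$ for all $w \in W$
    $x_p \ge 0$ for all $p \in P_w$, $w \in W$
• The total flow $F_{ij}$ of link $(i, j)$ is the sum of all path flows traversing the link:
    $F_{ij} = \sum_{p \ni (i,j)} x_p$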
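The step from path flows to total link flows can be sketched in Python as follows. The data layout (each path given as a list of nodes under a hypothetical path id) is an assumption made only for this illustration.

```python
from collections import defaultdict


def link_flows(paths, path_flows):
    """Add up, for every directed link, the flows of all paths traversing it.

    paths:      dict mapping a path id to its list of nodes
    path_flows: dict mapping the same path ids to a nonnegative flow
    """
    flows = defaultdict(float)
    for path_id, nodes in paths.items():
        # Consecutive node pairs are the links of the path.
        for link in zip(nodes, nodes[1:]):
            flows[link] += path_flows[path_id]
    return dict(flows)


if __name__ == "__main__":
    paths = {"p1": ["a", "b", "c"], "p2": ["a", "c"], "p3": ["d", "b", "c"]}
    path_flows = {"p1": 2.0, "p2": 1.0, "p3": 0.5}
    print(link_flows(paths, path_flows))
```

Link (b, c) is shared by two paths in this usage example, so its total flow is the sum of both path flows.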

[Figure: a network with the origins and destinations of two OD pairs, $w_1$ and $w_2$, marked.]

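Problem Formulation
• By expressing the total flows in terms of the path flows in the cost function (5.29), the problem can be written as
    minimize $\sum_{(i,j)} D_{ij}\left[ \sum_{p \ni (i,j)} x_p \right]$
    subject to $\sum_{p \in P_w} x_p = r_w$ for all $w \in W$, and $x_p \ge 0$ for all $p \in P_w$, $w \in W$    (5.33)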

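Characterization of Optimal Routing
• Denote by $D(x)$ the cost function of the problem of Eq. (5.33).
• The partial derivative of $D$ with respect to the path flow $x_p$ is
    $\frac{\partial D(x)}{\partial x_p} = \sum_{(i,j) \in p} D'_{ij}$
  where the first derivatives $D'_{ij}$ are evaluated at the total link flows corresponding to $x$.
• Consequently, $\partial D(x) / \partial x_p$ is the length of path $p$ when the length of each link $(i, j)$ is taken to be $D'_{ij}$. In what follows, it is called the first derivative length of path $p$.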

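Characterization of Optimal Routing
• Let $x^* = \{x_p^*\}$ be an optimal path flow vector. Then if $x_p^* > 0$ for some path $p$ of an OD pair $w$, we must be able to shift a small amount $\delta > 0$ from path $p$ to any other path $p'$ of the same OD pair without improving the cost.
• To first order, the change in cost from this shift is
    $\delta \frac{\partial D(x^*)}{\partial x_{p'}} - \delta \frac{\partial D(x^*)}{\partial x_p}$
  and since this change must be nonnegative, we obtain
    $x_p^* > 0 \;\Rightarrow\; \frac{\partial D(x^*)}{\partial x_{p'}} \ge \frac{\partial D(x^*)}{\partial x_p}$ for all $p' \in P_w$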

Characterization of Optimal Routing
• In words, optimal path flow is positive only on paths with minimum first derivative length (MFDL).
• Furthermore, at an optimum, the paths along which the input flow of OD pair $w$ is split must have equal first derivative length.
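For a single OD pair, the MFDL condition can be checked numerically with a few lines of Python. This is only a sketch: the function name and the tolerance are assumptions, and the first derivative lengths are taken as given inputs.

```python
def satisfies_mfdl_condition(path_flows, path_lengths, tol=1e-9):
    """Return True if every path carrying flow has minimum first derivative length.

    path_flows:   flows x_p on the paths of one OD pair
    path_lengths: first derivative lengths of the same paths, in the same order
    """
    shortest = min(path_lengths)
    return all(
        flow <= tol or length <= shortest + tol
        for flow, length in zip(path_flows, path_lengths)
    )


if __name__ == "__main__":
    # Flow only on the two equally short paths: the condition holds.
    print(satisfies_mfdl_condition([0.5, 0.5, 0.0], [0.5, 0.5, 0.55]))
    # Positive flow on a strictly longer path: the condition fails.
    print(satisfies_mfdl_condition([0.4, 0.4, 0.2], [0.4, 0.4, 0.552]))
```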

Feasible Direction Methods for Optimal Routing
• A set of path flows is strictly suboptimal only if there is a positive amount of flow that travels on a non-MFDL path.
• This suggests that suboptimal routing can be improved by shifting flow to an MFDL path.
• The adaptive shortest path method does that in a sense, but it shifts all flow of each OD pair to the shortest path, which results in oscillatory behavior.
• It is more appropriate to shift only part of the flow of other paths to the shortest path.

Feasible Direction Methods for Optimal Routing
• Given a feasible path flow vector $x = \{x_p\}$, consider changing $x$ along a direction $\Delta x = \{\Delta x_p\}$.
• There are two requirements imposed on the direction $\Delta x$:
  1. $\Delta x$ should be a feasible direction, in the sense that small changes along $\Delta x$ maintain the feasibility of the path flow vector $x$.
[Figure: feasible directions $\Delta x$ at $x$ within the constraint set $X$.]

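Feasible Direction Methods for Optimal Routing
• Mathematically, it is required that for some $\bar{\alpha} > 0$ and all $\alpha \in [0, \bar{\alpha}]$, the vector $x + \alpha \Delta x$ is feasible, or equivalently,
    $\sum_{p \in P_w} \Delta x_p = 0$ for all $w \in W$    (5.57)
    $\Delta x_p \ge 0$ for all $p \in P_w$, $w \in W$ for which $x_p = 0$    (5.58)
• Equation (5.57) follows from the feasibility requirement $\sum_{p \in P_w} (x_p + \alpha \Delta x_p) = r_w$ and the fact that $x$ is feasible, which implies that $\sum_{p \in P_w} x_p = r_w$.
• One way to obtain feasible directions is to select another feasible vector $\bar{x}$ and take $\Delta x = \bar{x} - x$.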

Feasible Direction Methods for Optimal Routing
  2. $\Delta x$ should be a descent direction, in the sense that the cost function can be decreased by making small movements along the direction $\Delta x$ starting from $x$.
[Figure: feasible descent directions at $x$, the constraint set $X$, and surfaces of equal cost $D(x)$.]

Feasible Direction Methods for Optimal Routing
• Since the gradient vector $\nabla D(x)$ is normal to the equal cost surfaces of the cost function $D$, the descent condition translates to the condition that the inner product of $\nabla D(x)$ and $\Delta x$ is negative, that is,
    $\sum_{w \in W} \sum_{p \in P_w} \frac{\partial D(x)}{\partial x_p} \Delta x_p < 0$    (5.59)

Feasible Direction Methods for Optimal Routing
• One way to satisfy the descent condition of Eq. (5.59), which is in fact commonly used in algorithms, is to require that $\Delta x$ satisfies the conservation of flow condition of Eq. (5.57), and that
    $\Delta x_p \le 0$ for all paths $p$ that are nonshortest (non-MFDL)
    $\Delta x_p < 0$ for at least one nonshortest path $p$    (5.60)

Feasible Direction Methods for Optimal Routing
• We thus obtain a broad class of iterative algorithms for solving the optimal routing problem. The basic iteration is given by:
    $x := x + \alpha \Delta x$
• Here $\Delta x$ is a feasible descent direction and $\alpha$ is a positive stepsize chosen so that the cost function is decreased.
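The two requirements on a direction can be tested mechanically for a single OD pair, as in the Python sketch below. It checks conservation of flow (5.57), nonnegativity on paths with zero flow (5.58), and a negative inner product with the gradient (5.59). The function name, the tolerance, and the numbers in the usage example are assumptions chosen for illustration.

```python
def is_feasible_descent(x, dx, gradient, tol=1e-9):
    """Check a candidate direction dx at path flows x for one OD pair."""
    # (5.57): the changes must sum to zero so the input rate is preserved.
    conserves_flow = abs(sum(dx)) <= tol
    # (5.58): a path that carries no flow cannot lose flow.
    keeps_nonnegative = all(d >= 0 for xi, d in zip(x, dx) if xi <= tol)
    # (5.59): the inner product of the gradient and dx must be negative.
    descends = sum(g * d for g, d in zip(gradient, dx)) < 0
    return conserves_flow and keeps_nonnegative and descends


if __name__ == "__main__":
    x = [0.2, 0.3, 0.5]
    gradient = [0.2, 0.3, 0.6]  # first derivative lengths of the three paths
    # Shift all flow toward path 1, the path with the smallest length.
    print(is_feasible_descent(x, [0.8, -0.3, -0.5], gradient))
```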

The Frank-Wolfe (Flow Deviation) Method
• Given a feasible path flow vector $x = \{x_p\}$, find a minimum first derivative length (MFDL) path for each OD pair.
• Let $\bar{x} = \{\bar{x}_p\}$ be the vector of path flows that would result if all input $r_w$ for each OD pair $w$ is routed along the corresponding MFDL path.

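The Frank-Wolfe (Flow Deviation) Method
• Let $\alpha^*$ be the stepsize that minimizes $D[x + \alpha(\bar{x} - x)]$ over all $\alpha \in [0, 1]$, that is,
    $D[x + \alpha^*(\bar{x} - x)] = \min_{\alpha \in [0,1]} D[x + \alpha(\bar{x} - x)]$
• The new set of path flows is obtained by
    $x_p := x_p + \alpha^*(\bar{x}_p - x_p)$ for all $p \in P_w$, $w \in W$
  and the process is repeated.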

The Frank-Wolfe (Flow Deviation) Method
• The characteristic property here is that flow is shifted from the nonshortest paths in equal proportions.
• This distinguishes the Frank-Wolfe method from the gradient projection methods discussed in the next section.

An example
• Consider the three-link network with one origin and one destination.
[Figure: origin and destination joined by three links, with input rate r = 1.]

An example
• There are three paths with flows $x_1$, $x_2$, $x_3$, which must satisfy the constraints
    $x_1 + x_2 + x_3 = 1$, $x_1, x_2, x_3 \ge 0$
[Figure: the constraint set, a triangle with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1).]

An example
• The cost function is
    $D(x) = \frac{1}{2}\left[ x_1^2 + x_2^2 + (0.1 x_3)^2 \right] + 0.55 x_3$

An example
• This is an easy problem that can be solved analytically. It can be argued that at an optimal solution $x^* = (x_1^*, x_2^*, x_3^*)$, we must have $x_1^* = x_2^*$ by symmetry.

An example
• So there are two possibilities:
  a) $x_3^* = 0$ and $x_1^* = x_2^* = 1/2$
  b) $x_3^* = \xi > 0$ and $x_1^* = x_2^* = (1 - \xi)/2 < 1/2$
• Case (b) is not possible because, according to the optimality condition discussed before, if $x_3^* > 0$ the first derivative length of path 3 must be less than or equal to the lengths of paths 1 and 2, which is clearly false.
• Therefore the optimal solution is $x^* = (1/2, 1/2, 0)$.

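An example
• It is seen that the shortest path is either 1 or 2, depending on whether $x_1 \le x_2$ or $x_2 < x_1$,
• and the corresponding shortest-path flows are $\bar{x} = (1, 0, 0)$ and $\bar{x} = (0, 1, 0)$.
• Therefore, the Frank-Wolfe iteration takes the form
    $x := x + \alpha^*[(1, 0, 0) - x]$ if $x_1 \le x_2$
    $x := x + \alpha^*[(0, 1, 0) - x]$ if $x_2 < x_1$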

An example
• Slow convergence is due to the fact that, as the optimal solution is approached, the directions of search tend to become orthogonal to the direction leading to the optimum.
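The behavior of the Frank-Wolfe iteration on this three-path example can be explored with the Python sketch below. It assumes the cost D(x) = ½[x1² + x2² + (0.1·x3)²] + 0.55·x3, which is the form this example takes in Bertsekas and Gallager as far as can be recalled; treat that cost, the starting point, and all function names as assumptions. Because the assumed cost is quadratic, the line minimization over [0, 1] is done in closed form.

```python
def cost(x):
    """Assumed example cost: 1/2 (x1^2 + x2^2 + (0.1 x3)^2) + 0.55 x3."""
    return 0.5 * (x[0] ** 2 + x[1] ** 2 + (0.1 * x[2]) ** 2) + 0.55 * x[2]


def first_derivative_lengths(x):
    """Partial derivatives of the assumed cost with respect to each path flow."""
    return (x[0], x[1], 0.01 * x[2] + 0.55)


def frank_wolfe_step(x):
    lengths = first_derivative_lengths(x)
    # MFDL path; min() returns the first minimum, so ties favor path 1.
    shortest = min(range(3), key=lambda p: lengths[p])
    x_bar = tuple(1.0 if p == shortest else 0.0 for p in range(3))
    direction = tuple(b - xi for b, xi in zip(x_bar, x))
    # Along x + alpha * direction the quadratic cost has derivative
    # slope + alpha * curvature, so its minimizer is -slope / curvature.
    slope = sum(g * d for g, d in zip(lengths, direction))
    curvature = direction[0] ** 2 + direction[1] ** 2 + 0.01 * direction[2] ** 2
    if curvature == 0.0:
        return x
    alpha = min(1.0, max(0.0, -slope / curvature))
    return tuple(xi + alpha * d for xi, d in zip(x, direction))


def run_frank_wolfe(x, iterations):
    history = [x]
    for _ in range(iterations):
        x = frank_wolfe_step(x)
        history.append(x)
    return history


if __name__ == "__main__":
    for k, x in enumerate(run_frank_wolfe((0.2, 0.3, 0.5), 10)):
        print(k, x, cost(x))
```

In this sketch every search direction points at (1, 0, 0) or (0, 1, 0), so flow on path 3 is only scaled down by the factor (1 - α) at each step rather than removed outright.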

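An example
• The stepsize $\alpha^*$ is obtained by line minimization over [0, 1].
• Because $D(x)$ is quadratic, this minimization can be done analytically.
• Differentiating $D[x + \alpha(\bar{x} - x)]$ with respect to $\alpha$ and setting the derivative to zero gives the unconstrained minimum over $\alpha$.
• A simple method is to choose the stepsize by means of
    $\alpha^* = \min\left[ 1, \; -\frac{\sum_{(i,j)} (\bar{F}_{ij} - F_{ij}) D'_{ij}}{\sum_{(i,j)} (\bar{F}_{ij} - F_{ij})^2 D''_{ij}} \right]$
  obtained by making a second-order Taylor approximation of $D[x + \alpha(\bar{x} - x)]$, where $F_{ij}$ and $\bar{F}_{ij}$ are the total link flows corresponding to $x$ and $\bar{x}$.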
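A stepsize rule based on a second-order Taylor approximation can be sketched in Python as follows. The sketch assumes the rule α* = min[1, -Σ(F̄ - F)·D' / Σ(F̄ - F)²·D''], with sums over links; the function name and the guard for a non-descent direction are additions made for this illustration.

```python
def taylor_stepsize(flows, target_flows, first_derivs, second_derivs):
    """Stepsize from a second-order Taylor approximation along (F_bar - F).

    flows:         current total link flows F_ij
    target_flows:  link flows F_bar_ij when all input uses MFDL paths
    first_derivs:  D'_ij evaluated at the current flows
    second_derivs: D''_ij evaluated at the current flows
    """
    numerator = -sum(
        (fb - f) * d1 for f, fb, d1 in zip(flows, target_flows, first_derivs)
    )
    denominator = sum(
        (fb - f) ** 2 * d2 for f, fb, d2 in zip(flows, target_flows, second_derivs)
    )
    # Guard added for this sketch: no move if the direction is not a descent one.
    if numerator <= 0 or denominator <= 0:
        return 0.0
    return min(1.0, numerator / denominator)


if __name__ == "__main__":
    # All flow sits on a link of length 3 while a link of length 1 is idle.
    print(taylor_stepsize([1.0, 0.0], [0.0, 1.0], [3.0, 1.0], [2.0, 2.0]))
```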

The Frank-Wolfe (Flow Deviation) Method
• An important situation where the Frank-Wolfe method has an advantage:
  – Suppose that one is not interested in obtaining optimal path flows, but only in the optimal total link flows or just the value of the optimal cost.
• The amount of storage required in this implementation is relatively small, thereby allowing the solution of very large network problems. Only the following need to be stored:
  – current total link flows
  – current shortest paths for all OD pairs

Gradient Projection Methods for Optimal Routing
• A class of feasible direction algorithms that are faster than the Frank-Wolfe method and lend themselves more readily to distributed implementation.
• These methods are also based on shortest paths and determine an MFDL path for every OD pair at each iteration.
• An increment of flow change is calculated for each path on the basis of the relative magnitudes of the path lengths and, sometimes, the second derivatives of the cost function.

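Gradient Projection Methods for Optimal Routing
• Consider the optimal routing problem
    minimize $D(x) = \sum_{(i,j)} D_{ij}\left[ \sum_{p \ni (i,j)} x_p \right]$
    subject to $\sum_{p \in P_w} x_p = r_w$ and $x_p \ge 0$ for all $p \in P_w$, $w \in W$    (5.77)
• Assume that the second derivatives of $D_{ij}$, denoted by $D''_{ij}(F_{ij})$, are positive for all $F_{ij}$.
• Let $x^k = \{x_p^k\}$ be the path flow vector obtained after $k$ iterations and let $\{F_{ij}^k\}$ be the corresponding set of total link flows.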

Gradient Projection Methods for Optimal Routing
• For each OD pair $w$, let $\bar{p}_w$ be an MFDL path.
• The optimal routing problem (5.77) can be converted to a problem involving only positivity constraints, eliminating the equality constraints $\sum_{p \in P_w} x_p = r_w$.
• For each $w$, the flow on the MFDL path is expressed as
    $x_{\bar{p}_w} = r_w - \sum_{p \in P_w, \, p \ne \bar{p}_w} x_p$    (5.78)
  and this expression is substituted in the cost function $D(x)$,

Gradient Projection Methods for Optimal Routing
• thereby obtaining a problem of the form
    minimize $\tilde{D}(\tilde{x})$
    subject to $x_p \ge 0$ for all $w \in W$, $p \in P_w$, $p \ne \bar{p}_w$    (5.79)
  where $\tilde{x}$ is the vector of the flows on all paths that are not MFDL paths.

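Gradient Projection Methods for Optimal Routing
• We now calculate the derivatives that will be needed to apply the scaled projection iteration to the problem of Eq. (5.79).
• Using (5.78) and the definition of $\tilde{D}(\tilde{x})$, we obtain, for all $p \in P_w$ with $p \ne \bar{p}_w$ [Eqs. (5.80) to (5.82)],
    $\frac{\partial \tilde{D}(\tilde{x}^k)}{\partial x_p} = \frac{\partial D(x^k)}{\partial x_p} - \frac{\partial D(x^k)}{\partial x_{\bar{p}_w}}$
    $\frac{\partial^2 \tilde{D}(\tilde{x}^k)}{(\partial x_p)^2} = \frac{\partial^2 D(x^k)}{(\partial x_p)^2} + \frac{\partial^2 D(x^k)}{(\partial x_{\bar{p}_w})^2} - 2 \frac{\partial^2 D(x^k)}{\partial x_p \, \partial x_{\bar{p}_w}}$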

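Gradient Projection Methods for Optimal Routing
• The iteration takes the form
    $x_p^{k+1} = \max\left\{ 0, \; x_p^k - \alpha^k H_p^{-1} (d_p - d_{\bar{p}_w}) \right\}$ for all $w \in W$, $p \in P_w$, $p \ne \bar{p}_w$    (5.83)
  where $d_p$ and $d_{\bar{p}_w}$ are the first derivative lengths of the paths $p$ and $\bar{p}_w$, given by
    $d_p = \sum_{(i,j) \in p} D'_{ij}(F_{ij}^k)$    (5.84)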

Gradient Projection Methods for Optimal Routing
• and $H_p$ is the "second derivative length"
    $H_p = \sum_{(i,j) \in L_p} D''_{ij}(F_{ij}^k)$    (5.85)
  where $L_p$ is the set of links belonging to either $p$ or $\bar{p}_w$, but not both.
• The stepsize $\alpha^k$ is some positive scalar which may be chosen by a variety of methods. For example, $\alpha^k$ can be chosen by some form of line minimization.
• We refer to Eqs. (5.83) to (5.85) as the projection algorithm.
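A projection iteration of this kind can be tried on the three-path example with the Python sketch below. It assumes the update x_p := max{0, x_p - α·(d_p - d_p̄)/H_p} on every non-MFDL path, with the MFDL path receiving the remaining input, and it assumes the example cost D(x) = ½[x1² + x2² + (0.1·x3)²] + 0.55·x3. Since each path here is a single link, H_p reduces to the sum of the second derivatives of path p and of the MFDL path. The constant stepsize α = 1 and all names are assumptions.

```python
# Second derivatives of the assumed cost with respect to each link flow.
SECOND_DERIVATIVES = (1.0, 1.0, 0.01)


def first_derivative_lengths(x):
    """First derivative lengths of the three single-link paths."""
    return (x[0], x[1], 0.01 * x[2] + 0.55)


def projection_step(x, alpha=1.0, input_rate=1.0):
    d = first_derivative_lengths(x)
    # MFDL path for the single OD pair; ties go to the lowest index.
    best = min(range(3), key=lambda p: d[p])
    new_x = list(x)
    for p in range(3):
        if p == best:
            continue
        # Links in p or in the MFDL path but not both: here, both links.
        h_p = SECOND_DERIVATIVES[p] + SECOND_DERIVATIVES[best]
        new_x[p] = max(0.0, x[p] - alpha * (d[p] - d[best]) / h_p)
    # The MFDL path carries whatever input the other paths do not.
    new_x[best] = input_rate - sum(new_x[p] for p in range(3) if p != best)
    return tuple(new_x)


if __name__ == "__main__":
    x = (0.2, 0.3, 0.5)
    for k in range(6):
        print(k, x)
        x = projection_step(x)
```

Unlike the Frank-Wolfe step, each nonshortest path is reduced by its own amount, so in this run the flow on path 3 is driven to zero by the max{0, ·} projection after a few iterations.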

[Figure: the constraint set, a triangle with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1).]

Gradient Projection Methods for Optimal Routing
• The projection algorithm typically yields rapid convergence to a neighborhood of an optimal solution.
• Once it comes near a solution, it tends to slow down.
• Even so, its progress near a solution is often satisfactory and usually far better than that of the Frank-Wolfe method.
• To obtain faster convergence near an optimal solution, it is necessary to modify the projection algorithm so that the off-diagonal terms of the Hessian matrix are taken into account.

Thanks for listening