  • Slides: 52
Online Convex Optimization Using Predictions
Niangjun Chen
Joint work with Anish Agarwal, Lachlan Andrew, Siddharth Barman, and Adam Wierman

[Equation not recovered; annotated: online, convex, switching cost]
Goal: Algorithms to minimize cost
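The objective that these labels annotate did not survive extraction. One common way to write an online convex optimization problem with switching costs is sketched below. The symbols $h$ (convex per-step cost), $y_t$ (parameter revealed online), $F$ (convex action set), $\beta$ (switching-cost weight), and the choice of norm are assumptions made for illustration, not notation confirmed by the slides.

```latex
\min_{x_1,\dots,x_T \in F} \;
\sum_{t=1}^{T} \Big(
  \underbrace{h(x_t, y_t)}_{\text{convex; } y_t \text{ revealed online}}
  \; + \;
  \underbrace{\beta \,\lVert x_t - x_{t-1} \rVert}_{\text{switching cost}}
\Big)
```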

Lots of applications …
• Dynamic capacity management in data centers [Tu et al. 2013]
• Power system generation/load scheduling [Lu et al. 2013]
• Portfolio management [Cover 1991] [Boyd et al. 2012]
• Video streaming [Sen et al. 2000] [Liu et al. 2008]
• Network routing [Bansal et al. 2003] [Kodialam et al. 2003]
• Geographical load balancing [Hindman et al. 2011] [Lin et al. 2012]
• …

In most applications, predictions are crucial.
But we do not have a good understanding of how (imperfect) predictions impact online algorithm design.

This talk: Online Convex Optimization Using Predictions

Online convex optimization using predictions
[Equation not recovered; annotated: online, convex, switching cost]
Table columns: Time, Information Available, Decision (row contents not recovered)
1 …
2 …
3 …
4 …

How do algorithms model prediction noise?
• Worst case analysis
• Average case analysis

Our contribution: a general and tractable model for prediction
Key message: prediction allows
1. Overcoming “impossibility” results for OCO with minimal structural assumptions
2. Mixture of average case and worst case analysis

Outline
1. Background: regret and competitive ratio
   • OCO without prediction
   • OCO with worst case prediction
2. Our prediction noise model
3. Algorithm design
4. OCO with stochastic prediction noise

Two communities, two metrics
• Real applications want both
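The two metrics named in the outline, regret and competitive ratio, are commonly defined as follows. This is a sketch of the standard forms, assuming $\mathrm{cost}(\cdot)$ denotes total cost over a horizon of $T$ steps, $STA$ is the best fixed decision in hindsight, and $OPT$ is the best dynamic offline solution. The slide's own notation was not recovered, so these symbols are illustrative.

```latex
\begin{align*}
\mathrm{Regret}(ALG) &= \sup_{\text{instances}} \big( \mathrm{cost}(ALG) - \mathrm{cost}(STA) \big),
  & \text{sublinear regret: } \mathrm{Regret}(ALG) = o(T) \\
\mathrm{CR}(ALG) &= \sup_{\text{instances}} \frac{\mathrm{cost}(ALG)}{\mathrm{cost}(OPT)},
  & \text{constant competitive ratio: } \mathrm{CR}(ALG) = O(1)
\end{align*}
```

Regret compares against a static benchmark, while the competitive ratio compares against a dynamic one, which is why an algorithm can do well on one metric and poorly on the other.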

Guarantees without prediction
➢ Sublinear regret? Yes, [Kivinen & Vempala 2002] [Bansal et al. 2003] [Zinkevich 2003] [Hazan et al. 2007] [Lin et al. 2012] …
➢ Constant CR? Yes, but only for scalar case [Blum et al. 1992] [Borodin et al. 1992] [Blum & Burch 2000] [Lin et al. 2011] [Lin et al. 2012] …
➢ Sublinear regret and constant CR? Not even in scalar case! [Andrew et al. 2013]

Guarantees with prediction
➢ Sublinear regret? Yes, [Kivinen & Vempala 2002] [Bansal et al. 2003] [Zinkevich 2003] [Hazan et al. 2007] [Lin et al. 2012] …
➢ Constant CR? Yes in general [Lin et al. 2013]
➢ Sublinear regret and constant CR? Not without a lot of prediction [Chen et al. 2015]

We may be using the wrong prediction model

Outline
1. Background: regret and competitive ratio
   • OCO without prediction
   • OCO with worst case prediction
2. Our prediction noise model
3. Algorithm design
4. OCO with stochastic prediction noise

What do we want in a prediction noise model?
➢ Predictions are “refined” as time goes forward
➢ Predictions are more noisy as you look further ahead
➢ Prediction errors can be correlated
➢ Should be general enough to incorporate detailed models

A more realistic prediction noise model
[Equation not recovered; annotated: realization that algorithm is trying to track; prediction; error]

A more realistic prediction noise model
Per-step noise: How much uncertainty is there one step ahead?

A more realistic prediction noise model
Weighting factor

A more realistic prediction noise model
Prediction error

A more realistic prediction noise model
Prediction error
This form of prediction error matches what occurs in
• Prediction of a wide-sense stationary process using a Wiener filter
• Prediction of a linear dynamical system using a Kalman filter

A more realistic prediction noise model
Allows adversarial analysis using stochastic prediction noise
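The annotations on the slides above (per-step noise, weighting factor, prediction error) suggest a model in which the error of a prediction made at time t for a later time tau is a weighted sum of the per-step noise realized between t and tau. The following is a minimal sketch under that reading. The scalar setting, the names `prediction_errors` and `error_variance`, the i.i.d. Gaussian noise, and the geometric weighting `0.5 ** k` are all assumptions for illustration and are not taken from the slides.

```python
import numpy as np


def prediction_errors(e, f, t, horizon):
    """Return y_tau - y_{tau|t} for tau = t+1, ..., t+horizon.

    Assumed model: error(tau) = sum_{s=t+1}^{tau} f(tau - s) * e[s],
    where e[s] is the per-step noise and f is the weighting factor.
    """
    errors = []
    for tau in range(t + 1, t + horizon + 1):
        errors.append(sum(f(tau - s) * e[s] for s in range(t + 1, tau + 1)))
    return np.array(errors)


def error_variance(f, sigma, k):
    """Variance of the k-step-ahead error for i.i.d. noise with std sigma."""
    return sigma ** 2 * sum(f(j) ** 2 for j in range(k))


# Hypothetical example: geometric weighting, Gaussian per-step noise.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=50)
weights = lambda k: 0.5 ** k
print(prediction_errors(noise, weights, t=10, horizon=5))
print([error_variance(weights, 1.0, k) for k in range(1, 6)])
```

Under these assumptions the sketch reproduces the properties listed earlier: the variance is non-decreasing in the lookahead k, moving t forward drops terms from the sum so predictions are refined over time, and errors for different tau share noise terms and are therefore correlated.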

Outline
1. Background: regret and competitive ratio
   • OCO without prediction
   • OCO with worst case prediction
2. Our prediction noise model
3. Algorithm design
4. OCO with stochastic prediction noise

A natural suggestion: Model Predictive Control (MPC)

A natural suggestion: Model Predictive Control (MPC)
But MPC doesn’t work well in this setting …

A more stable alternative: Averaging Fixed Horizon Control (AFHC)
Fixed Horizon Control (FHC)

A more stable alternative: Averaging Fixed Horizon Control (AFHC)
…
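The slide details for FHC and AFHC were not recovered, so the following is only a rough sketch of the idea as the titles describe it: each FHC variant plans a block of up to w+1 decisions at once and commits to them, the variants are staggered in time, and AFHC averages their choices. The quadratic hitting cost, the absolute-value switching cost, the discretized action grid, the partial first block for variants with k > 0, and the `predict(t, tau)` callback (the prediction of y at time tau that is available at time t) are all assumptions made to keep the example small and runnable.

```python
import numpy as np


def solve_window(pred, x_prev, grid, beta):
    """Minimize sum_t (x_t - pred[t])**2 + beta * |x_t - x_{t-1}| over the
    grid by dynamic programming; returns the optimal actions for the window."""
    switch = beta * np.abs(grid[:, None] - grid[None, :])  # cost of moving i -> j
    cost = (grid - pred[0]) ** 2 + beta * np.abs(grid - x_prev)
    back = []
    for t in range(1, len(pred)):
        total = cost[:, None] + switch          # total[i, j]: arrive at j via i
        back.append(total.argmin(axis=0))       # best predecessor of each j
        cost = total.min(axis=0) + (grid - pred[t]) ** 2
    j = int(cost.argmin())
    path = [j]
    for b in reversed(back):                    # walk predecessors backwards
        j = int(b[j])
        path.append(j)
    path.reverse()
    return grid[path]


def fhc(predict, T, w, k, grid, beta, x0=0.0):
    """FHC variant k: re-plans at time 0 and at every t with t % (w+1) == k,
    then commits to the whole planned block."""
    x = np.empty(T)
    x_prev, t = x0, 0
    while t < T:
        end = t + 1
        while end % (w + 1) != k and end < T:   # next re-planning time
            end += 1
        pred = np.array([predict(t, tau) for tau in range(t, end)])
        x[t:end] = solve_window(pred, x_prev, grid, beta)
        x_prev, t = x[end - 1], end
    return x


def afhc(predict, T, w, grid, beta, x0=0.0):
    """Average the decisions of the w+1 staggered FHC variants."""
    return np.mean([fhc(predict, T, w, k, grid, beta, x0)
                    for k in range(w + 1)], axis=0)
```

Because the variants re-plan at different times, any single variant's jump at a block boundary is diluted by the others in the average. Note that the averaged action can fall between grid points, which is acceptable when the action set is convex.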

Outline
1. Background: regret and competitive ratio
   • OCO without prediction
   • OCO with worst case prediction
2. Our prediction noise model
3. Algorithm design
4. OCO with stochastic prediction noise

How tight is this condition?

How to choose w?
[Equation not recovered; annotated: loss due to switching; cumulative prediction error over w timesteps]

How likely is a large deviation from expected performance for AFHC?

Our contribution: a general and tractable model for prediction
Key message: prediction allows
1. Overcoming “impossibility” results for OCO with minimal structural assumptions
   • AFHC can achieve sublinear regret and constant CR
2. Balance between average case and worst case analysis
   • Concentration of AFHC around its mean performance

Online Convex Optimization Using Predictions
Niangjun Chen
Joint work with Anish Agarwal, Lachlan Andrew, Sid Barman, and Adam Wierman

Backup Slides

Predicting stationary process with Wiener Filter

Predicting Linear Dynamical System Using Kalman Filter

Proof Sketch
1. Within a lookahead window
By perturbation analysis using Fenchel-Rockafellar duality
Prediction error

Proof Sketch
2. Between lookahead windows
Cost between lookahead windows

Proof Sketch
…
Average choices of FHC algorithms
Switching cost
Prediction error

Proof Sketch
4. Similarly for regret

Fixed Horizon Control
Averaging Fixed Horizon Control

Online Convex Optimization Using Predictions
[Equation not recovered; annotated: online, convex, switching cost]
Goal: Algorithms to minimize cost
Table columns: Time, Information Available, Decision (row contents not recovered)
1 …
2 …
3 …

Fixed Horizon Control
Averaging Fixed Horizon Control

Online Convex Optimization Using Predictions
[Equation not recovered; annotated: online]
Table columns: Time, Information Available, Decision (row contents not recovered)
1 …
2 …
3 …
4 …
Is there an online algorithm that achieves sublinear regret and constant CR? Yes!

Averaging Fixed Horizon Control
…
Average choices of FHC algorithms