A numerical analysis approach to convex optimization Speaker

  • Slides: 128
Download presentation
A numerical analysis approach to convex optimization Speaker Rasmus Kyng ETH Zurich Joint works

A numerical analysis approach to convex optimization Speaker Rasmus Kyng ETH Zurich Joint works with D. Adil, R. Peng, S. Sachdeva, D. Wang March, 2020

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

The problem

Convex optimization, high accuracy slow running time fast running time Inf-norm regression 2 -norm

Convex optimization, high accuracy slow running time fast running time Inf-norm regression 2 -norm regression Karmarkar; Chinese mathematicians, 2 nd century Renegar; Ye; Nesterov & Nemirovski Gauss, 19 th century

Convex optimization, high accuracy slow running time fast running time Nesterov & Nemirovski ‘

Convex optimization, high accuracy slow running time fast running time Nesterov & Nemirovski ‘ 94 p-norm regression Inf-norm regression 2 -norm regression

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li ‘ 18 p-norm regression Nesterov & Nemirovski ‘ 94 p-norm regression Inf-norm regression 2 -norm regression

Our result compared to BCLL Our result

Our result compared to BCLL Our result

Our result compared to BCLL Our result

Our result compared to BCLL Our result

Convex optimization, high accuracy slow running time fast running time Adil, K. , Peng,

Convex optimization, high accuracy slow running time fast running time Adil, K. , Peng, Sachdeva ‘ 19 p-norm regression Bubeck, Cohen, Lee, Li ‘ 18 p-norm regression Inf-norm regression 2 -norm regression

Convex optimization, high accuracy slow running time fast running time Adil, K. , Peng,

Convex optimization, high accuracy slow running time fast running time Adil, K. , Peng, Sachdeva ‘ 19 p-norm regression Bubeck, Cohen, Lee, Li ‘ 18 p-norm regression Inf-norm regression 2 -norm regression “Smooth” convex problems? !

Convex optimization, high accuracy slow running time fast running time Lee & Sidford ‘

Convex optimization, high accuracy slow running time fast running time Lee & Sidford ‘ 15 Inf-norm regression, w/ inverse maintenance Inf-norm regression Inverse maintenance: Karmarkar; Nesterov & Nemirovski; Vaidya; Lee & Sidford 2 -norm regression p-norm regression w/ inverse maintenance Adil, K. , Peng, Sachdeva ‘ 19 Inf-norm regression, w/ inverse maintenance Cohen, Lee, Song ‘ 19

Convex optimization, high accuracy slow running time fast running time 2 -norm regression p-norm

Convex optimization, high accuracy slow running time fast running time 2 -norm regression p-norm regression w/ inverse maintenance Inf-norm regression, w/ inverse maintenance

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li ’ 18 p-norm flow 2 -norm regression p-norm regression w/ inverse maintenance Inf-norm regression, w/ inverse maintenance inf-norm flow aka maxflow Goldberg-Rao `98 2 -norm flow Spielman Teng ‘ 04

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li

Convex optimization, high accuracy slow running time fast running time Bubeck, Cohen, Lee, Li ’ 18 p-norm flow 2 -norm regression inf-norm flow aka maxflow 2 -norm flow Adil, K. , Peng, Sachdeva ‘ 19 p-norm regression w/ inverse maintenance Inf-norm regression, w/ inverse maintenance p-norm flow most flows? !

Convex optimization, high accuracy slow running time fast running time unweighted maxflow Madry ‘

Convex optimization, high accuracy slow running time fast running time unweighted maxflow Madry ‘ 13 2 -norm regression p-norm regression w/ inverse maintenance Inf-norm regression, w/ inverse maintenance inf-norm flow aka maxflow 2 -norm flow unweighted (large) p-norm flow K. , Peng, Sachdeva, Wang ‘ 19 most flows? !

Convex optimization, high accuracy slow running time fast running time unweighted maxflow Liu-Sidford ‘

Convex optimization, high accuracy slow running time fast running time unweighted maxflow Liu-Sidford ‘ 2020 2 -norm regression p-norm regression w/ inverse maintenance Inf-norm regression, w/ inverse maintenance inf-norm flow aka maxflow 2 -norm flow unweighted (large) p-norm flow K. , Peng, Sachdeva, Wang ‘ 19 most flows? !

Convex optimization

Convex optimization

2 nd order methods local upper bound (quadratic) local lower bound (quadratic)

2 nd order methods local upper bound (quadratic) local lower bound (quadratic)

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems,

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems, ensure starting point near next optimum

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems,

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems, ensure starting point near next optimum

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems,

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems, ensure starting point near next optimum

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems,

Homotopy: making 2 nd order methods global Solve a sequence of different optimization problems, ensure starting point near next optimum

 local upper bound (quadratic) local lower bound (quadratic)

local upper bound (quadratic) local lower bound (quadratic)

 global upper bound (? ? ) global lower bound (? ? )

global upper bound (? ? ) global lower bound (? ? )

 global upper bound (? ? ) global lower bound (? ? ) What

global upper bound (? ? ) global lower bound (? ? ) What makes our “envelopes” tractable?

 global upper bound (? ? ) global lower bound (? ? ) What

global upper bound (? ? ) global lower bound (? ? ) What makes our “envelopes” tractable?

 global upper bound (? ? ) global lower bound (? ? )

global upper bound (? ? ) global lower bound (? ? )

The ingredients of an algorithm

The ingredients of an algorithm

 The envelopes (upper and lower bounds)

The envelopes (upper and lower bounds)

Good envelopes?

Good envelopes?

Good envelopes? Step progress

Good envelopes? Step progress

Good envelopes? Step progress Mult. weight oracle

Good envelopes? Step progress Mult. weight oracle

Good envelopes!

Good envelopes!

Good envelopes! linear apx.

Good envelopes! linear apx.

Good envelopes! linear apx. local curvature

Good envelopes! linear apx. local curvature

Good envelopes! linear apx. local curvature "long-range behavior” Enough to give a global approximation!

Good envelopes! linear apx. local curvature "long-range behavior” Enough to give a global approximation!

 linear apx. "long-range behavior” local curvature

linear apx. "long-range behavior” local curvature

 lower bound upper bound Step progress Mult. weight oracle

lower bound upper bound Step progress Mult. weight oracle

Good envelopes! Long range: 4 th power

Good envelopes! Long range: 4 th power

Good envelopes! Long range: 4 th power Locally quadratic

Good envelopes! Long range: 4 th power Locally quadratic

Iterative refinement with multiplicative weight method step problem Multiplicative weight method

Iterative refinement with multiplicative weight method step problem Multiplicative weight method

 Using multiplicative weights to take a step

Using multiplicative weights to take a step

Crude solver via multiplicative weights

Crude solver via multiplicative weights

Multiplicative weights refresher

Multiplicative weights refresher

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad? 1 unit on every edge

But when is the oracle bad? 1 unit on every edge

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad?

The dirty secrets

The dirty secrets

Crude solver via multiplicative weights

Crude solver via multiplicative weights

 Why do we call it “iterative refinement”?

Why do we call it “iterative refinement”?

Iterative refinement step problem A. k. a. iterative refinement

Iterative refinement step problem A. k. a. iterative refinement

Iterative refinement for linear equations step problem A. k. a. iterative refinement

Iterative refinement for linear equations step problem A. k. a. iterative refinement

Iterative refinement For linear equations, build the “crude” solver by preconditioning For our non-linear

Iterative refinement For linear equations, build the “crude” solver by preconditioning For our non-linear equations build the “crude” solver by multiplicative weights Or we can build a solver by recursive preconditioning Must be nonlinear, adaptive?

Recap [Adil-K. -Peng-Sachdeva]

Recap [Adil-K. -Peng-Sachdeva]

 Going much further for flows

Going much further for flows

Going much further for flows

Going much further for flows

Iterative refinement step problem A. k. a. iterative refinement

Iterative refinement step problem A. k. a. iterative refinement

Iterative refinement with multiplicative weight method step problem Multiplicative weight method

Iterative refinement with multiplicative weight method step problem Multiplicative weight method

Recursive Preconditioning? step problem Recurse! Solved by recursion

Recursive Preconditioning? step problem Recurse! Solved by recursion

Recursive Preconditioning? step problem Recurse! Solved by recursion

Recursive Preconditioning? step problem Recurse! Solved by recursion

The recursion works out

The recursion works out

Smaller instances: ultrasparsification

Smaller instances: ultrasparsification

Expanders

Expanders

Expanders

Expanders

Expanders

Expanders

Expanders What are the flow maps?

Expanders What are the flow maps?

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification

Repeat Smaller instances: Ultrasparsification Preserve linear, quadratic, and pth power simultaneously? Ensure projection of

Repeat Smaller instances: Ultrasparsification Preserve linear, quadratic, and pth power simultaneously? Ensure projection of gradient to partition cycle space is uniform.

Repeat Smaller instances: Ultrasparsification Non-linear gaussian elimination? It helps that we are dealing with

Repeat Smaller instances: Ultrasparsification Non-linear gaussian elimination? It helps that we are dealing with a circulation!

Recap

Recap

 Application to maximum flow

Application to maximum flow

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad?

But when is the oracle bad? Reweigh

But when is the oracle bad? Reweigh

But when is the oracle bad?

But when is the oracle bad?

Open questions Remove unit weight restriction Generalize to most convex flow problems? Maximum flow?

Open questions Remove unit weight restriction Generalize to most convex flow problems? Maximum flow? (iterative refinement works) Preconditioning outside graph-land?

Thanks! My co-authors Deeksha Adil Richard Peng Sushant Sachdeva UToronto Ga. Tech UToronto Di

Thanks! My co-authors Deeksha Adil Richard Peng Sushant Sachdeva UToronto Ga. Tech UToronto Di Wang Ga. Tech

 Folklore (Spielman, Madry)

Folklore (Spielman, Madry)

 1 unit on every edge

1 unit on every edge

Thanks! My co-authors Deeksha Adil Richard Peng Sushant Sachdeva UToronto Ga. Tech UToronto Di

Thanks! My co-authors Deeksha Adil Richard Peng Sushant Sachdeva UToronto Ga. Tech UToronto Di Wang Ga. Tech

TODO Reconcile m and choices? Decision on bold vs non-bold vectors Introduce flow problems

TODO Reconcile m and choices? Decision on bold vs non-bold vectors Introduce flow problems somewhere Be clearer about additive vs multiplicative error! Be clear about recursion errors