Reinforcement Learning for Integer Programming: Learning to Cut

Yunhao Tang

Thank you to my collaborators! Shipra Agrawal, Yuri Faenza

Machine Learning (ML) for Combinatorial Optimization (CO)
• How can ML help solve CO problems?
• Directly predict the solutions? (pure ML-based predictions)
  • Parameterization of the solutions, architecture…
• Neural architecture to mimic algorithmic computations?
  • ML flexibility + algorithmic inductive biases…
• Combine ML with hard-coded algorithms (ML + classic algorithms)
  • Automate certain decision making of algorithms, replace heuristics
  • Branch-and-bound [1, 2, 3]
  • Primal heuristics [3]
  • Pure cutting plane methods [4]
[1] Gasse et al, 2019; [2] Zarpellon et al, 2020; [3] Nair et al, 2021; [4] Tang et al, 2020

Why cutting plane methods?
• Cutting planes are a backbone of modern commercial solvers
• Agnostic to underlying problems
• Very powerful: efficient algorithms for hard problems such as TSP
• Challenge: there is no well-established principle for cutting plane selection, so there is no good supervision
  • Branch-and-bound: full strong branching (FSB) as supervised learning oracle [1, 2, 3]
  • Primal heuristics: solutions as supervised learning oracles [3]
  • Cutting plane: ???
[1] Gasse et al, 2019; [2] Zarpellon et al, 2020; [3] Nair et al, 2021

Background: Integer Programming
• Integer Programming (IP) formulation
  • Graph problems
  • Planning problems
  • Resource allocation
• Solving general IP is challenging
  • Discrete nature of the problem: continuous relaxations can be arbitrarily bad
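As a reminder, an IP in one common standard form can be written as below. The notation (cost vector c, constraint matrix A, right-hand side b, dimension n) is assumed here for illustration and is not taken from the slide. Dropping the integrality requirement gives the LP relaxation.

```latex
\begin{aligned}
\text{(IP)}\quad & \min_{x}\; c^\top x \quad \text{s.t.}\quad Ax \le b,\;\; x \ge 0,\;\; x \in \mathbb{Z}^n \\
\text{(LP relaxation)}\quad & \min_{x}\; c^\top x \quad \text{s.t.}\quad Ax \le b,\;\; x \ge 0
\end{aligned}
```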

Cutting planes: background
• Main idea: solve the Linear Programming (LP) relaxations!

Cutting planes: what to learn
• Cutting planes are generated from the relaxations
• How to generate cuts? Gomory's cuts [1]
  • Very general purpose: can be computed from the LP tableau
  • The procedure can be shown to terminate
  • The number of iterations (can be exponential) depends on the sequence of cuts added
• Can we learn which cutting plane to add to minimize the total effort?
[1] Gomory, 1960
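To make "computed from the LP tableau" concrete, here is a minimal sketch of a Gomory fractional cut. It assumes a tableau row of the form sum_j a_j x_j = b over nonnegative integer variables with fractional b, and returns the cut sum_j frac(a_j) x_j >= frac(b). The function names and the example row are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part in [0, 1), also correct for negative numbers."""
    return x - floor(x)


def gomory_cut(row_coeffs, rhs):
    """Return (cut_coeffs, cut_rhs) for the cut sum_j frac(a_j) x_j >= frac(b).

    row_coeffs maps variable names to tableau coefficients a_j; rhs is b.
    Returns None when b is already integral, since no cut is needed.
    """
    f0 = frac(rhs)
    if f0 == 0:
        return None
    # Variables with integral coefficients (e.g. the basic variable) drop out.
    cut = {j: frac(a) for j, a in row_coeffs.items() if frac(a) != 0}
    return cut, f0


# Hypothetical tableau row: x1 + 7/4 s1 - 1/2 s2 = 15/4
row = {"x1": Fraction(1), "s1": Fraction(7, 4), "s2": Fraction(-1, 2)}
cut_coeffs, cut_rhs = gomory_cut(row, Fraction(15, 4))
print(cut_coeffs, cut_rhs)
```

Each tableau row with a fractional right-hand side yields one candidate cut, so a fractional LP solution typically offers several candidates. Choosing among them is the decision the slides propose to learn.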

Intuitions: A 2-D example

Not a one-shot decision problem

Reinforcement Learning

Cutting plane selection as an RL problem

• Vaswani et al, 2017

• A simple example MLP

Optimize the policy
• Salimans et al, 2015; Mania et al, 2018
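The cited works suggest an evolution-strategies or random-search style of policy optimization. The sketch below shows one generic update of that kind with antithetic (mirrored) perturbations. It is an illustration under stated assumptions, not the authors' implementation: `toy_reward` is a hypothetical stand-in for the return of a cutting plane episode, and the hyperparameters are arbitrary.

```python
import random


def es_step(theta, reward_fn, rng, sigma=0.1, lr=0.05, n_pairs=20):
    """One evolution-strategies update using mirrored Gaussian perturbations."""
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = [t + sigma * e for t, e in zip(theta, eps)]
        minus = [t - sigma * e for t, e in zip(theta, eps)]
        # Finite-difference estimate of the reward change along direction eps.
        diff = reward_fn(plus) - reward_fn(minus)
        for k in range(dim):
            grad[k] += diff * eps[k] / (2.0 * sigma * n_pairs)
    # Gradient ascent on the estimated reward gradient.
    return [t + lr * g for t, g in zip(theta, grad)]


def toy_reward(theta):
    # Hypothetical stand-in for an episode return; maximized at theta = (1, 1, 1).
    return -sum((t - 1.0) ** 2 for t in theta)


rng = random.Random(0)
theta = [0.0, 0.0, 0.0]
for _ in range(200):
    theta = es_step(theta, toy_reward, rng)
print(theta, toy_reward(theta))
```

A method of this kind only needs episode returns, not gradients through the LP solver, which is one plausible reason to prefer it when the reward comes from a black-box solver.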

Experiment results
• Wesselmann et al, 2012; Gomory, 1960

Q1: Can RL minimize the number of cuts?

Q2: Can RL quickly maximize the closure?
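Assuming "closure" here means the integrality gap closure (IGC) after adding T cuts, one way to write it for a minimization problem is given below. The symbols are assumptions for illustration: x*_IP is the optimal integer solution and x*_LP(t) is the LP optimum after t cuts.

```latex
g^{t} = c^\top x^{*}_{\mathrm{IP}} - c^\top x^{*}_{\mathrm{LP}}(t) \;\ge\; 0,
\qquad
\mathrm{IGC} = \frac{g^{0} - g^{T}}{g^{0}} \in [0, 1]
```

Under this definition, an IGC of 1 means the cuts closed the whole gap between the initial LP relaxation and the integer optimum, and 0 means they made no progress.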

Q2: Can RL maximize the closure?

Q3: Can RL benefit downstream applications?
• In practice, cutting planes are applied along with other algorithms, most commonly Branch-and-Bound (B&B)
• What is B&B?
  • Branch: partition the search space
  • Bound: use LP relaxations to bound values
• Adding cuts to B&B nodes
  • Tighter LP relaxations give better bounds
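The branch and bound steps can be sketched on a toy problem. The example below uses a 0/1 knapsack (a maximization problem) because its LP relaxation can be solved greedily by value-to-weight ratio, so no LP solver is needed. The instance and function names are hypothetical, and weights are assumed to be positive. No cuts are added here; the sketch only shows how an LP bound prunes nodes.

```python
def lp_bound(values, weights, capacity, fixed):
    """LP-relaxation bound for a 0/1 knapsack with some items fixed to 0 or 1.

    Returns (bound, fractional_item); fractional_item is None when the LP
    optimum is integral. Returns (None, None) if the fixings are infeasible.
    """
    cap = capacity
    total = 0.0
    for i, v in fixed.items():
        if v == 1:
            cap -= weights[i]
            total += values[i]
    if cap < 0:
        return None, None
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if cap == 0:
            break
        if weights[i] <= cap:
            cap -= weights[i]
            total += values[i]
        else:
            # Item i only fits partially: this is the fractional LP variable.
            return total + values[i] * cap / weights[i], i
    return total, None


def branch_and_bound(values, weights, capacity):
    """Depth-first B&B; returns (best objective value, number of nodes visited)."""
    best, nodes, stack = 0.0, 0, [{}]
    while stack:
        fixed = stack.pop()
        nodes += 1
        bound, frac_item = lp_bound(values, weights, capacity, fixed)
        if bound is None or bound <= best:
            continue  # Bound: infeasible, or cannot beat the incumbent.
        if frac_item is None:
            best = bound  # LP optimum is integral: new incumbent.
            continue
        # Branch: partition the search space on the fractional variable.
        stack.append({**fixed, frac_item: 0})
        stack.append({**fixed, frac_item: 1})
    return best, nodes


best, nodes = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, nodes)
```

A node is pruned as soon as its LP bound cannot beat the incumbent, so any tightening of the relaxation, for example from added cuts, tends to reduce the number of nodes explored.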

Q3: Can RL benefit downstream applications?
• Measure progress in a B&B tree
  • Step 1: Train the cutting plane RL agent with the cutting plane objective, without knowledge of downstream applications
  • Step 2: Directly test with B&B

Q3: Can RL benefit downstream applications?

Q3: Can RL benefit downstream applications?
• We can also count the number of nodes it takes to complete B&B…

Summary
• We formulate cutting plane selection as an RL problem
• We can train RL agents that outperform human-designed heuristics
• Limitations:
  • Problems are relatively small scale…
  • Add only one cut at a time…
• Future: RL for other IP algorithmic components?

Thank you for your attention! Please check out the full paper: RL for IP: Learning to Cut, ICML 2020