Reinforcement Learning for Integer Programming Learning to Cut


























- Slides: 26
Reinforcement Learning for Integer Programming Learning to Cut Yunhao Tang
Thank you to my collaborators! Shipra Agrawal Yuri Faenza
Machine Learning (ML) for Combinatorial Optimization (CO) • How can ML help solve CO problem? • Directly predict the solutions? • Parameterization of the solutions, architecture… Pure ML based predictions • Neural architecture to mimic algorithmic computations? • ML flexibility + algorithmic inductive biases… • Combine ML with hard-coded algorithms • • Automate certain decision making of algorithms, replace heuristics Branch-and-bound [1, 2, 3] Primal heuristics [3] ML + Pure Cutting plane methods [4] classic algorithms Gasse et al, 2019; Zerpellon, et al, 2020; Nair et al, 2021; Tang et al, 2020
Why cutting plane methods? • Cutting planes are a backbone of modern commercial solvers • Agnostic to underlying problems • Very powerful: efficient algorithms for hard problems such as TSP • Challenge: no very well-established principle for cutting plane selection no good supervision • Branch-and-bound: FSB as supervised learning oracles [1, 2, 3] • Primal heuristics: solutions as supervised learning oracles [3] • Cutting plane: ? ? ? Gasse et al, 2019; Zerpellon, et al, 2020; Nair et al, 2021
Background: Integer Programming • Integer Programming (IP) formulation Graph problems Planning problems Resource allocation • Solving general IP is challenging • Discrete nature of the problem: continuous relaxations can be arbitrarily bad
Cutting planes -- background • Main idea: solve the Linear Programming (LP) relaxations!
Cutting planes – what to learn • Cutting planes are generated from the relaxations • How to generate cuts? Gomory’s cuts [1] • Very general purpose – can be computed from the LP tableau • The procedure can be shown to terminate • The number of iterations (can be exponential ) depends on the sequence of cuts added • Can we learn which cutting plane to add to minimize the total efforts? Gomory, 1960
Intuitions: A 2 -D example •
Not an one-shot decision problem •
Reinforcement Learning •
Cutting plane selection as RL problem •
Cutting plane selection as RL problem •
• Vaswani et al, 2017
• A simple example MLP
Optimize the policy • Salisman et al, 2015; Mania et al, 2018
Experiment results • Wesselmann et al, 2012; Gomory, 1960
Q 1: Can RL minimize number of cuts?
Q 2: Can RL quickly maximize the closure? •
Q 2: Can RL maximize the closure?
Q 2: Can RL maximize the closure?
Q 3: Can RL benefit downstream applications? • In practice, cutting planes are applied along with other algorithms, most commonly, Branch-and-Bound (B&B) • What is B&B? • Branch: partition the search space • Bound: use LP relaxations to bound values • Adding cuts to B&B nodes • Tighten LP relaxations better bound
Q 3: Can RL benefit downstream applications? • Measure progress in a B&B tree • Step 1: Train cutting plane RL agent with cutting plane objective – without knowledge about downstream applications • Step 2: Directly test with B&B
Q 3: Can RL benefit downstream applications?
Q 3: Can RL benefit downstream applications? • We can also count the number of nodes it takes to complete B&B…
Summary • We formulate cutting plane selection as a RL problem • We can train RL agents that out-perform human-designed heuristics • Limitation: • Problems are relatively small scale… • Add only one cut at a time… • Future: RL for other IP algorithmic components?
Thank you for your attention! Please check out the full paper RL for IP: Learning to Cut, ICML 2020