Convex Optimization for Sequential Game Solving

Overview
• Sequence-form transformation
• Bilinear saddle-point problems
• EGT/Mirror prox
• Smoothing techniques for sequential games
• Sampling techniques
• Some experimental results

Extensive-form games
[Figure: game tree with a chance node (C), decision nodes for players P1 and P2, and leaf payoffs 0, -3, 6, 0, 1.5, 0, -6, 6]

Behavioral strategies
• Specify a distribution over actions at each information set
• Strategy: [equation not preserved]
• Utility for a player:
• Probability-weighted sum over leaf nodes
• Not linear, even when the other player is held fixed
• Example: [equation not preserved]
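To make the non-linearity concrete, here is a minimal sketch on a hypothetical toy game (not the tree from the slide). P1 acts twice in a row: action `a` with probability `p` at the first information set, then action `b` with probability `q` at the second. Only the leaf reached by both actions pays 1. The function name `utility` and the payoffs are assumptions made for illustration.

```python
def utility(p, q):
    """Expected payoff for P1 in a hypothetical toy game.

    p: probability of action a at P1's first information set
    q: probability of action b at P1's second information set
    Only the leaf reached by (a, b) pays 1; every other leaf pays 0.
    """
    return p * q * 1.0


# Two behavioral strategies and their midpoint.
s0 = (0.0, 0.0)
s1 = (1.0, 1.0)
mid = tuple((u + v) / 2 for u, v in zip(s0, s1))

avg_of_utilities = (utility(*s0) + utility(*s1)) / 2
utility_of_avg = utility(*mid)

# A linear function would give the same value for both quantities.
print(avg_of_utilities, utility_of_avg)  # prints: 0.5 0.25
```

The reach probability of the paying leaf is the product `p * q`, so the utility is not linear in the behavioral strategy even though no opponent is involved.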

Sequence form
• Technique for obtaining a linear formulation.
• Exploits the perfect-recall property.
• At each information set, the sequence probabilities sum to the probability of the player's last sequence leading to that information set.
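The sketch below illustrates these points on a hypothetical toy game (not the slide's tree) in which P1 first chooses `a` or `a'` and, after `a`, chooses `b` or `b'`. The sequence names, the helper functions, and the payoff of 1 at the leaf after sequence `ab` are all assumptions for illustration.

```python
def to_sequence_form(p, q):
    """Map behavioral probabilities (p for a, q for b) to realization weights."""
    return {
        "empty": 1.0,          # the empty sequence always has weight 1
        "a": p,
        "a'": 1.0 - p,
        "ab": p * q,           # product of probabilities along the sequence
        "ab'": p * (1.0 - q),
    }


def satisfies_constraints(x, tol=1e-12):
    """Check the linear sequence-form constraints of the toy game."""
    return (
        abs(x["empty"] - 1.0) <= tol
        and abs(x["a"] + x["a'"] - x["empty"]) <= tol  # first information set
        and abs(x["ab"] + x["ab'"] - x["a"]) <= tol    # second set, parent sequence a
        and all(v >= -tol for v in x.values())
    )


def utility(x):
    """Payoff 1 at the leaf reached by sequence ab: linear in x."""
    return 1.0 * x["ab"]


x0 = to_sequence_form(0.0, 0.0)
x1 = to_sequence_form(1.0, 1.0)
x_mid = {s: (x0[s] + x1[s]) / 2 for s in x0}

print(satisfies_constraints(x_mid))                       # True
print(utility(x_mid), (utility(x0) + utility(x1)) / 2)    # 0.5 0.5
```

The midpoint of two sequence-form strategies is again a valid sequence-form strategy, and its utility equals the average of the two utilities, which is the linearity the behavioral representation lacks.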

Equilibrium-finding algorithms

Bilinear saddle-point formulation
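The slide's equations were not preserved. In the usual sequence-form notation, which is assumed here, x and y are the two players' sequence-form strategies, A is the payoff matrix over pairs of sequences, and the matrices E, F with right-hand sides e, f encode the sequence-form constraints. A zero-sum equilibrium problem can then be written as:

```latex
\min_{x \in X} \max_{y \in Y} \; x^\top A y,
\qquad
X = \{\, x \ge 0 : E x = e \,\},
\quad
Y = \{\, y \ge 0 : F y = f \,\}.
```

The objective is linear in x for fixed y and linear in y for fixed x, which is what makes the problem bilinear.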

Smoothed function approximation
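As a sketch of the standard construction (Nesterov-style smoothing, assumed here because the slide's formulas were not preserved), the non-smooth best-response value f is replaced by a smoothed version f_mu. The symbols d (a strongly convex distance-generating function on Y with d >= 0) and mu > 0 (the smoothing parameter) are notation chosen for this illustration.

```latex
f(x) = \max_{y \in Y} \; x^\top A y,
\qquad
f_\mu(x) = \max_{y \in Y} \; \bigl\{ x^\top A y - \mu\, d(y) \bigr\},
\qquad
f_\mu(x) \;\le\; f(x) \;\le\; f_\mu(x) + \mu \max_{y \in Y} d(y).
```

The last chain of inequalities shows the trade-off: a smaller mu gives a tighter approximation of f, while a larger mu gives a smoother function.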

Conditions on DGFs

Example smoothing function
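One common choice, shown here as an assumption since the slide's own example was not preserved, is the entropy function on the simplex, d(y) = log n + sum_i y_i log y_i. With this choice the smoothed maximum of a payoff vector `g` has a closed form (a scaled log-sum-exp) and the maximizer is a softmax. The function names below are hypothetical.

```python
import math


def smoothed_max(g, mu):
    """max over the simplex of <g, y> - mu * (log n + sum_i y_i log y_i)."""
    n = len(g)
    m = max(g)
    # log-sum-exp of g / mu, with the maximum subtracted for numerical stability
    lse = m / mu + math.log(sum(math.exp((gi - m) / mu) for gi in g))
    return mu * lse - mu * math.log(n)


def smoothed_argmax(g, mu):
    """The unique maximizer of the smoothed problem: softmax of g / mu."""
    m = max(g)
    w = [math.exp((gi - m) / mu) for gi in g]
    total = sum(w)
    return [wi / total for wi in w]


g = [1.0, 0.5, -2.0]
for mu in (1.0, 0.1, 0.01):
    # The smoothed value approaches max(g) from below as mu shrinks.
    print(mu, smoothed_max(g, mu), smoothed_argmax(g, mu))
```

Because 0 <= d(y) <= log n on the simplex, the smoothed value always lies between max(g) - mu * log n and max(g).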

Effect of smoothing

Algorithms
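The overview names EGT and Mirror prox. As a rough illustration of the second, the sketch below runs mirror prox with entropic prox steps on a small matrix game, min over x and max over y of x^T A y, with both strategies on a simplex rather than a sequence-form polytope. The matrix `A`, the step size `eta`, the iteration count, and all function names are assumptions for this example, not the setup used in the talk.

```python
import math


def entropy_step(p, grad, eta):
    """Entropic prox step on the simplex: p_i * exp(-eta * grad_i), renormalized."""
    w = [pi * math.exp(-eta * gi) for pi, gi in zip(p, grad)]
    total = sum(w)
    return [wi / total for wi in w]


def mat_vec(A, y):
    """Compute A y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]


def mat_t_vec(A, x):
    """Compute A^T x."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def mirror_prox(A, iters=2000, eta=0.1):
    n, m = len(A), len(A[0])
    x, y = [1.0 / n] * n, [1.0 / m] * m
    x_sum, y_sum = [0.0] * n, [0.0] * m
    for _ in range(iters):
        # Extrapolation step: use gradients at the current point.
        x_half = entropy_step(x, mat_vec(A, y), eta)
        y_half = entropy_step(y, [-g for g in mat_t_vec(A, x)], eta)
        # Update step: move from the current point using gradients at the half point.
        x_next = entropy_step(x, mat_vec(A, y_half), eta)
        y_next = entropy_step(y, [-g for g in mat_t_vec(A, x_half)], eta)
        x, y = x_next, y_next
        x_sum = [s + v for s, v in zip(x_sum, x_half)]
        y_sum = [s + v for s, v in zip(y_sum, y_half)]
    # Return the averaged half points.
    return [s / iters for s in x_sum], [s / iters for s in y_sum]


def saddle_gap(A, x, y):
    """Best-response value against x minus best-response value against y."""
    return max(mat_t_vec(A, x)) - min(mat_vec(A, y))


A = [[1.0, -0.5], [-0.5, 0.5]]  # hypothetical payoffs; x minimizes, y maximizes
x_bar, y_bar = mirror_prox(A)
print(x_bar, y_bar, saddle_gap(A, x_bar, y_bar))
```

For this matrix the unique equilibrium puts probability 0.4 on each player's first action, so the averaged iterates should approach (0.4, 0.6) and the saddle-point gap should shrink toward zero.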

Smoothing for sequential games

Strategy space polytopes

Distance-generating function for treeplexes
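A common construction for treeplexes is a dilated distance-generating function: a local simplex function is applied at each information set to the conditional distribution over actions, scaled by the weight of the parent sequence and by a per-information-set weight. The sketch below assumes this construction with entropy as the local function. The toy treeplex, the unit weights, and all names are hypothetical, and the choice of weights in particular is not taken from the slides.

```python
import math

# Hypothetical treeplex: each information set has a parent sequence and its own sequences.
INFOSETS = {
    "I1": {"parent": "empty", "sequences": ["a", "a'"]},
    "I2": {"parent": "a", "sequences": ["ab", "ab'"]},
}


def simplex_entropy(z):
    """Entropy DGF on a simplex: log n + sum_i z_i log z_i (zero at the uniform point)."""
    return math.log(len(z)) + sum(zi * math.log(zi) for zi in z if zi > 0)


def dilated_entropy(x, weights):
    """Sum over information sets of weight * parent mass * local entropy."""
    total = 0.0
    for name, info in INFOSETS.items():
        parent_mass = x[info["parent"]]
        if parent_mass == 0:
            continue  # an unreachable information set contributes nothing
        local = [x[s] / parent_mass for s in info["sequences"]]
        total += weights[name] * parent_mass * simplex_entropy(local)
    return total


weights = {"I1": 1.0, "I2": 1.0}  # assumed unit weights, for illustration only
uniform = {"empty": 1.0, "a": 0.5, "a'": 0.5, "ab": 0.25, "ab'": 0.25}
pure = {"empty": 1.0, "a": 1.0, "a'": 0.0, "ab": 1.0, "ab'": 0.0}

print(dilated_entropy(uniform, weights))
print(dilated_entropy(pure, weights))
```

With unit weights the function is zero at the uniform behavioral strategy and equals 2 log 2 at the pure strategy that plays `a` and then `b`.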

Sampling

Experiments - game
• Leduc hold’em.
  ◦ Simplified limit Texas hold’em game.
  ◦ Deck has k unique cards, with two copies of each card.
  ◦ Each player is dealt a single private card.
  ◦ One community card is dealt.
  ◦ A betting round occurs before and after the community card is dealt.
  ◦ Each betting round allows up to three raises.
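To give a feel for how the deck size k drives the size of the chance layer, the sketch below counts the possible deals under one reading of the description: P1's private card, P2's private card, and the community card are dealt in that order without replacement. The function name `leduc_deals` and this deal ordering are assumptions for illustration.

```python
from itertools import permutations


def leduc_deals(k):
    """Count deals of (P1 card, P2 card, community card) for a deck of k ranks.

    Returns a pair: the number of ordered deals of distinct physical cards,
    and the number of distinct rank triples those deals produce.
    """
    deck = [rank for rank in range(k) for _ in range(2)]  # two copies of each rank
    physical = list(permutations(range(len(deck)), 3))
    rank_triples = {tuple(deck[i] for i in deal) for deal in physical}
    return len(physical), len(rank_triples)


for k in (3, 4, 5):
    print(k, leduc_deals(k))
```

For k = 3 this gives 6 * 5 * 4 = 120 ordered physical deals and 24 distinct rank triples, since a triple of three equal ranks is impossible with only two copies of each card.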

Experiments
