Overview
• Sequence-form transformation
• Bilinear saddle-point problems
• EGT/Mirror prox
• Smoothing techniques for sequential games
• Sampling techniques
• Some experimental results
Extensive-form games
[Figure: example game tree with a chance node C, decision nodes for P1 and P2, and leaf payoffs 0, -3, 6, 0, 1.5, 0, -6, 6]
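Behavioral strategies
• A strategy specifies a distribution over actions at each information set
• Utility for a player:
◦ Probability-weighted sum over leaf nodes
◦ Not linear in the player's own strategy, even when the other player is held fixed: the probability of reaching a leaf is a product of action probabilities along its path
• Example: [Figure: the example game tree from the previous slide]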
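The non-linearity can be checked on a tiny fragment. The sketch below is a hypothetical one-player fragment, not the tree from the slides: a leaf worth 1 is reached only if the same player chooses "continue" at two consecutive information sets. The function name `utility` and the payoff are assumptions made for illustration.

```python
def utility(p_first, p_second):
    """Expected payoff in a hypothetical fragment: a leaf worth 1 is reached
    only by choosing 'continue' at two consecutive information sets."""
    # Reach probability is a product of the player's own action probabilities.
    return p_first * p_second * 1.0

# Two behavioral strategies: always continue, never continue.
always = (1.0, 1.0)
never = (0.0, 0.0)

# Average the two strategies information set by information set.
mix = tuple(0.5 * a + 0.5 * b for a, b in zip(always, never))

u_of_mix = utility(*mix)                                   # utility of the average
avg_of_u = 0.5 * utility(*always) + 0.5 * utility(*never)  # average of utilities

# A linear function would make these two numbers equal; here they differ.
print(u_of_mix, avg_of_u)
```

The utility of the averaged strategy differs from the average of the utilities, which is exactly what linearity would rule out.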
Sequence form
• Technique for obtaining a linear formulation.
• Exploits the perfect-recall property.
• At each information set, the sequence probabilities sum to the probability of the player's last sequence before that information set.
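A minimal sketch of the idea, on a hypothetical fragment (not the tree from the slides): one player acts at two consecutive information sets, choosing a1 or b1 and then, after a1, choosing a2 or b2. The sequence names, the `PAYOFF` table, and the helper names are assumptions made for illustration.

```python
def to_sequence_form(p_first, p_second):
    """Convert behavioral probabilities into sequence-form probabilities.

    p_first:  probability of a1 at the first information set
    p_second: probability of a2 at the second information set (reached via a1)
    """
    x = {"": 1.0}                               # the empty sequence has probability 1
    x["a1"] = x[""] * p_first
    x["b1"] = x[""] * (1.0 - p_first)
    x["a1,a2"] = x["a1"] * p_second             # child sequences scale by the parent
    x["a1,b2"] = x["a1"] * (1.0 - p_second)
    return x

# Hypothetical payoff: only the leaf reached by the sequence (a1, a2) pays 1.
PAYOFF = {"a1,a2": 1.0}

def utility(x):
    """Utility is a plain dot product between payoffs and sequence probabilities."""
    return sum(PAYOFF.get(seq, 0.0) * prob for seq, prob in x.items())

x1 = to_sequence_form(1.0, 1.0)
x2 = to_sequence_form(0.0, 0.0)
x_mix = {seq: 0.5 * x1[seq] + 0.5 * x2[seq] for seq in x1}

# The averaged vector still satisfies the sequence-form constraints ...
first_ok = abs(x_mix["a1"] + x_mix["b1"] - x_mix[""]) < 1e-12
second_ok = abs(x_mix["a1,a2"] + x_mix["a1,b2"] - x_mix["a1"]) < 1e-12

# ... and the utility of the average equals the average of the utilities.
u_mix = utility(x_mix)
u_avg = 0.5 * utility(x1) + 0.5 * utility(x2)
print(first_ok, second_ok, u_mix, u_avg)
```

Because each information set's probabilities are tied to the parent sequence rather than normalized to 1, the strategy set is a polytope and the utility is linear over it.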
Equilibrium-finding algorithms
Bilinear saddle-point formulation
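A sketch of the standard form, assuming the usual sequence-form notation for a two-player zero-sum game. The symbols $A$ (payoff matrix over pairs of sequences), $E, e$ and $F, f$ (sequence-form constraint matrices and right-hand sides), and the min/max orientation are assumptions, not taken from the slides:

```latex
\min_{x \in X} \; \max_{y \in Y} \; x^\top A y,
\qquad
X = \{\, x \ge 0 : E x = e \,\},
\quad
Y = \{\, y \ge 0 : F y = f \,\}
```

The objective is linear in $x$ for fixed $y$ and linear in $y$ for fixed $x$, and both feasible sets are polytopes, which is what makes the problem a bilinear saddle-point problem.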
Smoothed function approximation
Conditions on distance-generating functions (DGFs)
Example smoothing function
Effect of smoothing
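As an illustration of smoothing on the simplest strategy set, the sketch below assumes the entropy distance-generating function on a probability simplex, a common choice that is not confirmed by the slides. For a payoff vector g, the smoothed maximum max_y <g, y> - mu * d(y), with d(y) = sum_i y_i log y_i + log n, has a closed form; the function name `smoothed_max` is hypothetical.

```python
import math

def smoothed_max(g, mu):
    """Entropy-smoothed maximum of g over the probability simplex.

    Returns (value, y) where
      value = max_y <g, y> - mu * (sum_i y_i log y_i + log n)
      y     = the unique maximizer, a softmax of g / mu.
    """
    n = len(g)
    m = max(g)
    # Shift by the maximum for numerical stability (log-sum-exp trick).
    weights = [math.exp((gi - m) / mu) for gi in g]
    total = sum(weights)
    value = m + mu * (math.log(total) - math.log(n))
    y = [w / total for w in weights]
    return value, y

g = [1.0, 2.0, 4.0]
for mu in (1.0, 0.1):
    value, y = smoothed_max(g, mu)
    # The smoothed value lies between max(g) - mu * log(n) and max(g).
    print(mu, value, [round(p, 4) for p in y])
```

Under this assumption the smoothed function is differentiable and approaches the true maximum as mu shrinks, while a larger mu spreads the maximizer more evenly across the coordinates.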
Algorithms
Smoothing for sequential games
Strategy space polytopes
Distance-generating function for treeplexes
Sampling
[Figure: the example game tree from the Extensive-form games slide]
Experiments - game
• Leduc hold'em.
◦ Simplified limit Texas hold'em game.
◦ Deck has k unique cards, with two copies of each card.
◦ Each player is dealt a single private card.
◦ One community card is dealt.
◦ One betting round occurs before the community card is dealt and one after.
◦ Each betting round allows up to three raises.
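The card-dealing rules above can be sketched as a small enumeration. This assumes two players (the slide does not state the number) and uses a hypothetical helper `leduc_deals`; betting is not modeled.

```python
from itertools import permutations

def leduc_deals(k, num_players=2):
    """Enumerate chance outcomes for a Leduc-style deck.

    The deck has k ranks with two copies of each. One private card goes to
    each player, then one community card is dealt.
    Returns (number of ordered card-level deals, set of distinct rank tuples).
    """
    deck = [rank for rank in range(k) for _ in range(2)]
    cards_dealt = num_players + 1               # private cards plus the community card
    card_level = 0
    rank_level = set()
    for positions in permutations(range(len(deck)), cards_dealt):
        card_level += 1
        rank_level.add(tuple(deck[i] for i in positions))
    return card_level, rank_level

n_cards, ranks = leduc_deals(3)
print(n_cards, len(ranks))
```

With k = 3 and two players there are 6 * 5 * 4 = 120 ordered card-level deals but only 24 distinct rank tuples, since the two copies of a rank are interchangeable and no rank can appear three times.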