
Computing Best-Response Strategies in Infinite Games of Incomplete Information
Daniel Reeves and Michael Wellman
University of Michigan

Definitions
Infinite Game = infinite action spaces
Incomplete Information = payoffs depend on information that is private to the players
Type = a player’s private information
One-shot Game = players each choose a single action simultaneously and then immediately receive a payoff
Strategy = a mapping from type to action
Best-Response Strategy = optimal strategy given known strategies of the other players
Nash Equilibrium = profile of strategies such that each strategy is a best response to the others
Bayes-Nash Equilibrium = generalization of NE to the case of incomplete information, for expected-utility maximizing players

Finite Game Approximations
Finite game solvers:
  Gambit
  Gala
  Gametracer
Why not discretize?
  Introduces qualitative differences
  Computationally intractable

Our Class of Games
2-player, one-shot, infinite games of incomplete information
Piecewise uniform type distributions
Payoff functions of the form:

Games in our Class
Other games: War of Attrition, incomplete-information versions of Cournot and Bertrand games

Piecewise Linear Strategies
Specified by the vectors c, m, b
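A minimal sketch of how such a strategy could be represented and evaluated. The slide does not spell out the roles of the three vectors, so this assumes c holds the sorted type breakpoints, m the per-piece slopes, and b the per-piece intercepts, giving action m[k]*t + b[k] on piece k. The function name and the example numbers are hypothetical.

```python
from bisect import bisect_right


def piecewise_linear_strategy(c, m, b):
    """Build a strategy (type -> action) from breakpoints, slopes, intercepts.

    Assumed convention (not stated on the slide): c is a sorted list of
    K-1 breakpoints, and m, b each have K entries, one per piece.
    """
    if not (len(m) == len(b) == len(c) + 1):
        raise ValueError("need len(m) == len(b) == len(c) + 1")

    def strategy(t):
        # bisect_right sends a type equal to a breakpoint to the upper piece.
        k = bisect_right(c, t)
        return m[k] * t + b[k]

    return strategy


# Hypothetical two-piece strategy: flat at 0.25 below t = 0.5, slope 1 above.
two_piece = piecewise_linear_strategy(c=[0.5], m=[0.0, 1.0], b=[0.25, -0.25])
print(two_piece(0.2), two_piece(0.8))
```

A one-piece strategy such as a(t) = t/2 is the special case c = [], m = [0.5], b = [0.0].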

Existence and Computation of Piecewise Linear Best Responses
Theorem 1: Given a payoff function with I regions as above, an opponent type distribution with cdf F that is piecewise uniform with J pieces, and a piecewise linear strategy function with K pieces, the best response is itself a piecewise linear function with no more than 2(I-1)(J+K-2) piece boundaries.

The Proof
For arbitrary own type t, and opponent type a random variable T, find own action a maximizing E_T[u(t, a, T, s(T))]
(Numerical maximization not applicable due to parameter t)
Above works out to be a piecewise polynomial in a (parameterized by t)
For given t, finding optimal a is straightforward
Remains to find partitioning of type space such that within each type range, optimal action is a linear function of t
This can be done in polynomial time

Example: First-Price Sealed Bid Auction (FPSB)
Types (valuations) drawn from U[0, 1]
Payoff function:
Known Bayes-Nash equilibrium (McAfee & McMillan, 1987): a(t) = t/2
Found in as few as one iteration from a variety of seed strategies
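The following sketch checks the FPSB equilibrium numerically by brute-force grid search. It is not the authors' algorithm, which computes the best response in closed form. It assumes the standard first-price payoff (the winner receives valuation minus bid, the loser receives 0, ties ignored), an assumption here rather than a formula taken from the slide. It also assumes the opponent bids slope * T with T ~ U[0, 1]. All function names are hypothetical.

```python
def win_probability(bid, slope):
    """P(slope * T < bid) for opponent type T ~ U[0, 1] and slope > 0."""
    return min(max(bid / slope, 0.0), 1.0)


def expected_payoff(bid, valuation, slope):
    """Expected payoff under the assumed first-price rule: (t - a) * P(win)."""
    return (valuation - bid) * win_probability(bid, slope)


def best_response_bid(valuation, slope, steps=10000):
    """Grid search over bids in [0, valuation] for the payoff-maximizing bid."""
    best_bid, best_value = 0.0, 0.0
    for i in range(steps + 1):
        bid = valuation * i / steps
        value = expected_payoff(bid, valuation, slope)
        if value > best_value:
            best_bid, best_value = bid, value
    return best_bid


# Seed strategy: truthful bidding (slope 1). Then check the fixed point (slope 1/2).
for slope in (1.0, 0.5):
    for t in (0.2, 0.5, 0.9):
        print(slope, t, best_response_bid(t, slope))
```

Under these assumptions the best response to truthful bidding is already t/2, and the best response to t/2 is again t/2, consistent with the equilibrium being reached in one iteration from that seed.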

Example: Supply-chain Game
Producers’ costs ~ U[0, 1]
Consumer’s valuation v in [1.5, 3] (known)
Payoff function: bid - cost if bid + bid_2 <= v; 0 otherwise
Figure: Producer 1, Producer 2, Consumer

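Proving a Bayes-Nash Equilibrium
Candidate strategy (c = cost):
\[
b(c) =
\begin{cases}
\tfrac{2}{3}v - \tfrac{1}{2} & \text{if } c < \tfrac{2}{3}v - 1 \\
\tfrac{c}{2} + \tfrac{v}{3} & \text{otherwise}
\end{cases}
\]
Compute best response…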

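Computing Best Response
Expected payoff (own bid b, own cost c, opponent cost c_2, opponent bid b_2):
\[
\begin{aligned}
EP(b) &= (b-c)\,\Pr(b + b_2 \le v) \\
&= (b-c)\bigl[\Pr(c_2 \le \tfrac{2}{3}v - 1)\,\Pr(b + \tfrac{2}{3}v - \tfrac{1}{2} \le v \mid c_2 \le \tfrac{2}{3}v - 1) \\
&\qquad\quad + \Pr(c_2 > \tfrac{2}{3}v - 1)\,\Pr(b + \tfrac{c_2}{2} + \tfrac{v}{3} \le v \mid c_2 > \tfrac{2}{3}v - 1)\bigr] \\
&= (b-c)\bigl[(\tfrac{2}{3}v - 1)\,\Pr(b \le \tfrac{v}{3} + \tfrac{1}{2}) + \Pr(\tfrac{2}{3}v - 1 < c_2 < \tfrac{4}{3}v - 2b)\bigr]
\end{aligned}
\]
Case 1: b \le \tfrac{2}{3}v - \tfrac{1}{2}
\[
EP(b) = (b-c)\bigl[(\tfrac{2}{3}v - 1)\cdot 1 + (2 - \tfrac{2}{3}v)\bigr] = b - c
\;\Longrightarrow\; b^* = \tfrac{2}{3}v - \tfrac{1}{2}
\;\Longrightarrow\; EP_1(b^*) = \tfrac{2}{3}v - \tfrac{1}{2} - c
\]
Case 2: \tfrac{2}{3}v - \tfrac{1}{2} < b < \tfrac{v}{3} + \tfrac{1}{2}
\[
EP(b) = (b-c)\bigl[(\tfrac{2}{3}v - 1) + (\tfrac{2}{3}v - 2b + 1)\bigr] = (b-c)(\tfrac{4}{3}v - 2b)
\;\Longrightarrow\; b^* = \tfrac{c}{2} + \tfrac{v}{3}
\;\Longrightarrow\; EP_2(b^*) = \frac{(3c - 2v)^2}{18}
\]
Case 3: b > \tfrac{v}{3} + \tfrac{1}{2}
\[
EP_3(b) = 0
\]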

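Computing Best Response (2)
The Case 2 maximizer b* = c/2 + v/3 lies inside the Case 2 range (b > 2/3 v - 1/2) iff c > 2/3 v - 1
For c < 2/3 v - 1 the Case 2 payoff is decreasing throughout that range, so the Case 1 bid b* = 2/3 v - 1/2 is optimal
Therefore, best-response is…
\[
b(c) =
\begin{cases}
\tfrac{2}{3}v - \tfrac{1}{2} & \text{if } c < \tfrac{2}{3}v - 1 \\
\tfrac{c}{2} + \tfrac{v}{3} & \text{otherwise}
\end{cases}
\]
This matches the candidate strategy, so the candidate is a Bayes-Nash equilibrium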
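The case analysis can be cross-checked numerically. The sketch below is a brute-force grid search, not the authors' solver. It computes the exact probability that the trade goes through against an opponent playing the candidate strategy with cost c_2 ~ U[0, 1], then searches for the payoff-maximizing bid. Function names are hypothetical, and v is assumed to lie in [1.5, 3] so that 2/3 v - 1 falls in [0, 1].

```python
def candidate_bid(cost, v):
    """Candidate equilibrium strategy from the slides."""
    if cost < 2 * v / 3 - 1:
        return 2 * v / 3 - 0.5
    return cost / 2 + v / 3


def trade_probability(b, v):
    """P(b + b_2 <= v) when the opponent plays candidate_bid, c_2 ~ U[0, 1]."""
    threshold = 2 * v / 3 - 1          # opponent bids flat below this cost
    flat_bid = 2 * v / 3 - 0.5
    p_flat = threshold if b + flat_bid <= v else 0.0
    # Sloped piece: c_2/2 + v/3 <= v - b  <=>  c_2 <= 4v/3 - 2b
    upper = min(1.0, 4 * v / 3 - 2 * b)
    p_sloped = max(0.0, upper - threshold)
    return p_flat + p_sloped


def expected_payoff(b, cost, v):
    return (b - cost) * trade_probability(b, v)


def best_response_bid(cost, v, steps=20000):
    """Grid search over bids in [0, v]."""
    best_b, best_value = cost, 0.0
    for i in range(steps + 1):
        b = v * i / steps
        value = expected_payoff(b, cost, v)
        if value > best_value:
            best_b, best_value = b, value
    return best_b


for v in (2.0, 2.4):
    for cost in (0.1, 0.5, 0.9):
        print(v, cost, best_response_bid(cost, v), candidate_bid(cost, v))
```

Up to the grid resolution, the numerically found best response should coincide with the candidate strategy on both pieces, which is the fixed-point property the derivation establishes.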

Example: Bargaining Game
(aka sealed-bid k-double auction)
Buyer and seller place bids, transaction happens iff they overlap
Transaction price is some linear combination of the bids
Known equilibrium (Chatterjee & Samuelson, 1983) for seller (1) and buyer (2):
Found in several iterations from truthful bidding

Provision Point Mechanism
(aka Public Good or Voluntary Participation game)
2 agents want to jointly acquire a good costing C
Mechanism: simultaneously offer contributions; buy iff sum > C and split the excess (sum - C) evenly
Nash: 2/3 t + C/4 - 1/6

Shared-Good Auction
New mechanism, similar to the divorce-settlement game; undoes provision-point
Agents place bids for a good they currently share, valuations ~ U[A, B]
High bidder gets the good and pays half its bid to the low bidder in compensation

Equilibrium in Shared-Good Auction
Found in one iteration from truthful bidding (for any specific [A, B])

Vicious Vickrey Auction
Generalization of a Vickrey Auction (Brandt & Weiss, 2001) to allow for disutility from opponent’s utility (e.g., business competitors)
Brandt & Weiss consider only the complete information version

Equilibrium in Vicious Vickrey
a(t) = (k+t)/(k+1)
Reduces to truthful bidding for the standard Vickrey Auction (k=0)
Iterated best-response solver finds this equilibrium (for specific values of k) within several iterations from a variety of seed strategies

Conclusions
First algorithm for finding best-response strategies in a broad class of infinite games of incomplete information
Confirms known equilibria (e.g., FPSB), confirms equilibria we derive here (Supply-chain game), discovers new equilibria (Shared-good auction, Vicious Vickrey)
Goal: characterize the class of games for which iterated best-response converges