Exploring Bidding Strategies for Market-Based Scheduling
Daniel M. Reeves, Michael P. Wellman, Jeffrey K. MacKie-Mason, Anna Osepayshvili (University of Michigan, Ann Arbor)
Presented by Arman Artuc for CSC 84200, 9/20/2021
Strategies for Complex Market Games
- Allocation problem: deciding how to assign the available resources to agents.
- Centralized vs. distributed information: centralization gives superior results but is not applicable when the agents have private interests.
- Resource allocation mechanism: the communication process that determines which agents get which resources, based on the messages exchanged.
- Market games are difficult to solve, and even more complicated when resources are complements for some agents.
- There is no known optimal bidding strategy for multiple-item simultaneous ascending auctions.
Market-Based Scheduling
- Scheduling problem = resource allocation problem in which resources are distinguished by the time periods in which they are available, so a schedule is an allocation of these resources over time.
- Market-based scheduling: a configuration of markets that allocates resources over time.
- Focus: the strategic problem faced by an agent participating in a market-based scheduling mechanism.
Simultaneous Auctions
- Simultaneous ascending auctions work well when a competitive price equilibrium exists (in the multiple-goods case, when the goods are substitutes).
- But simultaneous ascending auctions can fail badly when there are complements.
- How should agents behave when faced with separate markets for complements?
Scheduling Problem Definition
- M units of a schedulable resource (time slots), numbered 1, ..., M.
- Each of N agents has a single job that can be accomplished using the resource.
- Agent j's job requires λj time slots to complete; if j acquires λj time slots by deadline t, it accrues value vj(t).
- Single-unit case: λj = 1; otherwise multiple-unit.
- Fixed deadline: every agent has the same deadline; otherwise variable deadline.
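The valuation above can be sketched as a small function: a job of length λj that completes at time t (the λj-th slot the agent holds) earns vj(t), and an incomplete job earns nothing. The list `v_by_deadline` is an assumed representation of vj, not notation from the slides.

```python
# Minimal sketch of the job valuation v_j(t), assuming a list v_by_deadline
# where v_by_deadline[t - 1] is the value of completing the job by slot t.

def job_value(slots_held, job_length, v_by_deadline):
    """slots_held: 1-based slot indices the agent holds."""
    if len(slots_held) < job_length:
        return 0                              # job cannot be completed
    t = sorted(slots_held)[job_length - 1]    # time at which the job completes
    return v_by_deadline[t - 1]
```

For example, an agent holding slots {1, 3} with λj = 2 completes at time 3 and earns the value for that deadline.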
Factory Scheduling (illustrative figure)
Auction Mechanism
- A separate ascending auction is run for each slot.
- Bid price on slot m, βm: the highest bid bjm admitted so far. Ask price on slot m: αm = βm + ε. New bids must satisfy bjm ≥ αm.
- An auction is quiescent when a round passes with no admissible bids.
- The auctions proceed concurrently; when all of them are simultaneously quiescent, all close and allocate their respective slots per the last admitted bids.
- Because no slot is committed until they all are, an agent's bidding strategy for one slot cannot be contingent on the outcome for another slot.
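The closing rule above (run all auctions concurrently, close only at simultaneous quiescence) can be sketched as a loop. This is a minimal sketch, assuming each agent exposes a `bids(bid_price, winner, asks)` method returning the bids it wants to place this round; that interface and the tie-breaking are assumptions, not part of the slides.

```python
# Sketch of concurrent ascending auctions closing at simultaneous quiescence.

EPSILON = 1  # bid increment (assumed value)

def run_auctions(num_slots, agents, epsilon=EPSILON):
    bid_price = {m: 0 for m in range(num_slots)}   # beta_m: highest admitted bid
    winner = {m: None for m in range(num_slots)}
    while True:
        quiescent = True
        for agent in agents:
            # Ask price alpha_m = beta_m + epsilon; lower bids are inadmissible.
            asks = {m: bid_price[m] + epsilon for m in range(num_slots)}
            for m, b in agent.bids(bid_price, winner, asks).items():
                if b >= asks[m]:                   # admissible bid: admit it
                    bid_price[m] = b
                    winner[m] = agent
                    quiescent = False
        if quiescent:          # a full round passed with no admissible bids:
            break              # all auctions close at once
    return bid_price, winner
```

Note that recomputing the asks inside the round lets later bidders react to bids admitted earlier in the same round; the slides do not pin down this detail.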
Straightforward Bidding (SB)
- SB takes a vector of perceived prices for the slots and bids those prices on the bundle of slots that would maximize the agent's surplus if it were to win all of its bids at those prices.
- If agent j is assigned a set of slots X, it accrues vj(X); if it obtains X at prices p, its surplus is σ(X, p) = vj(X) - Σm∈X pm.
- The perceived price for a slot m the agent is currently winning (from the previous round) is βm; for any other slot it is the ask price βm + ε.
- SB agent j bids bjm = pm for each m ∈ X*, where X* = arg maxX σ(X, p).
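The bundle selection above can be sketched by brute force over all subsets of slots (feasible for small M). `value` stands in for vj(X); the empty bundle's surplus of zero is the baseline, so an agent never bids itself into negative surplus at the perceived prices.

```python
# Sketch of SB bundle selection: maximize sigma(X, p) = v(X) - sum_{m in X} p_m
# over all subsets X, then bid the perceived price on each slot in the winner.
from itertools import combinations

def sb_bids(value, perceived_prices):
    slots = range(len(perceived_prices))
    best, best_surplus = (), 0.0            # empty bundle: surplus 0
    for r in range(1, len(perceived_prices) + 1):
        for X in combinations(slots, r):
            surplus = value(X) - sum(perceived_prices[m] for m in X)
            if surplus > best_surplus:
                best, best_surplus = X, surplus
    return {m: perceived_prices[m] for m in best}
```

For instance, with value 8 for any two slots and perceived prices (4, 3, 10), the best bundle is the two cheapest slots, with surplus 8 - 7 = 1.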
Baseline Strategy Performance
- SB involves no anticipation of other agents' strategies.
- No-regret: from the agent's perspective, no bidding policy other than the current one would have been a better response to the other agents' bids.
- SB is a Bayesian equilibrium for the single-unit, fixed-deadline problem.
- But for multiple-unit problems, SB allocations can differ from the optimum by large amounts.
Baseline Strategy Performance (2)
- One SB path leads to a state where agent 2 is winning both slots at prices (4, 3).
- In the next round, agent 1 bids 4 for slot 2 and the auction ends, with agent 1 receiving slot 2 and agent 2 receiving slot 1 at a price of 4.
- Whereas SB yields a result with total value 5, the optimal allocation would have produced 8.
- Agent 2 could have offered 5 for slot 2 and would have been better off, with a surplus of -1 rather than -4!
- SB is not reasonable as a general strategy.
Alternative Bidding Strategies
- How can we find an equilibrium strategy for the simultaneous ascending auction applied to simple scheduling?
- The space of joint preferences is very large: an agent's preference consists of its job length plus a payoff for each of M deadlines, so the joint space is (M+1) x N dimensional.
- The number of bidding rounds can be quite large (small bid increments).
- The strategy space is the set of all functions mapping the Cartesian product of the space of preferences and the space of price-quote histories into a vector of next-round bids.
- Finding the optimum by enumeration is computationally infeasible. So how can an optimal strategy be derived?
- With complementary slots there is an exposure problem: in order to obtain the combination it prefers, an agent must expose itself to the risk of paying for a far less desirable (or worthless) subset.
Sunk Awareness
- SB ignores the OPPORTUNITY COST OF NOT BIDDING: SB agents bid as if the incremental cost of slots they are currently winning were the full price, when in fact the incremental cost is zero. (The cost is sunk!)
- Sunk-aware strategies permit agents to account for the true incremental cost of slots they are currently winning: a sunk-aware agent bids as if the incremental cost of a slot it is currently winning lies between zero and the current bid price.
- Agent j's perceived price for slot m is k·βm if j is winning m, and βm + ε otherwise, where k ∈ [0, 1] is the sunk-awareness parameter.
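The perceived-price rule above is a one-liner; k = 1 treats a currently-won slot at its full bid price (plain SB's perceived price), while k = 0 treats its cost as fully sunk.

```python
# Sketch of sunk-aware perceived prices: k * beta_m for slots the agent is
# winning, beta_m + epsilon (the ask) for all other slots.

def perceived_prices(bid_prices, winning, k, epsilon):
    """bid_prices: beta_m per slot; winning: set of slots agent j is winning."""
    return [k * beta if m in winning else beta + epsilon
            for m, beta in enumerate(bid_prices)]
```

Feeding these prices into the SB bundle selection yields the sunk-aware bidding strategy studied in the experiments.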
Proposed Method
- Select a set of candidate strategies, then evaluate their performance against each other through statistical simulation based on an evolutionary game.
- Strategies are assigned population frequencies, and samples of agents compete against each other.
- Strategies that perform well are rewarded with higher population frequencies; poor strategies are weeded out.
Generating Payoff Matrices
- Estimate the payoff matrix for a restricted game.
- A strategy is a function mapping agent preferences plus auction information to bids.
- Construct agents implementing the selected strategies and calculate their expected payoffs with respect to the specified distributions from which agent preferences are drawn.
- Consider only reflex agents: they use information from the current auction round only, nothing earlier.
- Consider only a specific parameterized family of strategy functions (only sunk awareness, for this study).
Generating Payoff Matrices (2)
- An element of the matrix is an N-vector of expected payoffs associated with a particular strategy profile. An example profile for 5 agents: {0.5, 1, 1, 1, 1}.
- There is a distinct element in the payoff matrix for each possible strategy combination.
- Estimate each entry with a Monte Carlo simulator: draw preferences, assign them to agents, simulate the auction for the given strategy profile to quiescence, and average the resulting surpluses.
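The Monte Carlo estimate of one matrix entry can be sketched as follows. `simulate_auction` and `draw_preferences` are assumed stand-ins for the full auction simulator and the preference distribution; only the averaging logic is shown.

```python
# Sketch of estimating one payoff-matrix entry: repeatedly draw preferences,
# run the auction for the fixed strategy profile to quiescence, and average
# each agent's realized surplus.

def estimate_entry(simulate_auction, draw_preferences, profile, samples):
    n = len(profile)
    totals = [0.0] * n
    for _ in range(samples):
        prefs = draw_preferences(n)
        surpluses = simulate_auction(profile, prefs)  # one N-vector per run
        totals = [t + s for t, s in zip(totals, surpluses)]
    return [t / samples for t in totals]
```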
Evolutionary Search for Equilibria
- An agent population that has reached a fixed point of the replicator dynamics is a candidate (mixed-strategy) Nash equilibrium: every pure strategy with > 0 representatives in the fixed-point population does equally well in expectation against the others.
- Iterative algorithm for finding an equilibrium: increase the proportion of well-performing strategies at the expense of the others, pg(s) ∝ pg-1(s)(EP(s) - W), where EP(s) is the average payoff to s over the profiles in which it appears and W is a lower bound on payoffs.
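The update above can be sketched as one generation step: each strategy's weight is its current proportion times its payoff advantage over the lower bound W, renormalized to sum to one.

```python
# Sketch of one replicator-dynamics generation:
# p_g(s) proportional to p_{g-1}(s) * (EP(s) - W), renormalized.

def replicator_step(proportions, expected_payoff, W):
    weights = [p * (expected_payoff[s] - W) for s, p in enumerate(proportions)]
    total = sum(weights)
    return [w / total for w in weights]
```

Iterating this until the proportions stop changing yields the fixed point that serves as the candidate equilibrium.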
Game Settings
- Use Monte Carlo simulation to generate an expected payoff matrix for every combination of strategies playing against each other.
- Then find Nash equilibria with:
  - Replicator dynamics (evolutionary tournament)
  - GAMBIT (a computational game solver)
  - Amoeba (a function-minimization algorithm)
Solving Payoff Matrices with Gambit
- Gambit takes the full matrix representation of a strategic-form game, iteratively eliminates strongly dominated strategies, and, applying a subdivision algorithm, enumerates all Nash equilibria.
- It cannot take advantage of symmetry in a payoff matrix, which would reduce the number of distinct profiles drastically.
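The reduction Gambit cannot exploit is easy to quantify: in a symmetric game only the multiset of strategies in a profile matters, so the S^N ordered profiles collapse to the multiset coefficient C(N + S - 1, N). This matches the 105 profiles reported later for the two-agent experiment with 14 candidate strategies.

```python
# Number of distinct profiles in a symmetric game: multiset coefficient
# C(N + S - 1, N) for S strategies and N players.
from math import comb

def distinct_symmetric_profiles(num_strategies, num_players):
    return comb(num_players + num_strategies - 1, num_players)
```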
Searching for Equilibria with Amoeba
- A Nash equilibrium is a global minimum of a function f(p) measuring the gain available from unilateral deviation, where u(x, p) is the payoff from playing strategy x against everyone else playing strategy p.
- f is nonnegative, and for any p ∈ NE, f(p) = 0, so Amoeba can search for equilibria by minimizing f.
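The slide's equation did not survive conversion. A standard objective with the stated properties (nonnegative, and zero exactly at a Nash equilibrium) is f(p) = Σx max(0, u(x, p) - u(p, p))², the squared gains from profitable unilateral deviations; this is an assumed reconstruction, not copied from the slide.

```python
# Sketch of a Nash-equilibrium objective for a symmetric game:
# f(p) = sum over pure strategies x of max(0, u(x, p) - u(p, p))**2,
# where u(p, p) = sum_x p[x] * u(x, p).  (Assumed form; see lead-in.)

def nash_objective(payoff_vs_population, p):
    """payoff_vs_population(x, p) returns u(x, p)."""
    u = [payoff_vs_population(x, p) for x in range(len(p))]
    u_pp = sum(px * ux for px, ux in zip(p, u))          # u(p, p)
    return sum(max(0.0, ux - u_pp) ** 2 for ux in u)
```

A derivative-free minimizer such as Amoeba (Nelder-Mead) can then search the probability simplex for points where this objective reaches zero.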
Replicator Dynamics and Biased Sampling
- Combine the payoff-matrix calculation and the equilibrium search: start with an initial set of population proportions for each pure strategy.
- Sample from the preference distribution and iterate each auction to quiescence; strategies are randomly drawn to participate according to their population proportions, so successful strategies are sampled more often!
- After a number of samples, apply the replicator dynamics using the realized average payoffs, and iterate, calculating a sequence of new generations until the population proportions are stationary.
- This accumulates a statistically precise estimate of the expected payoff matrix.
Experiments with Sunk-Awareness
- k takes values in multiples of 1/20 from 0 to 1, designated 0, 1, ..., 20 for simplicity.
- Fix M, N, and ε; vary the preference distributions:
  - Uniform job lengths: ~U[1, M]
  - Constant job length (fixed for all j)
  - Exponential job lengths (drawn from an exponential distribution)
- Deadline values for each slot are initialized as integers ~U[1, 50].
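The three preference treatments can be sketched as a sampler. Only the stated facts (uniform lengths on [1, M], integer deadline values on [1, 50]) come from the slide; the exponential mean, the truncation to [1, M], and the constant length default are assumptions for illustration.

```python
# Sketch of preference sampling for the three experimental treatments.
import random

def draw_preference(M, treatment, fixed_length=2, mean_length=2.0):
    if treatment == "uniform":
        length = random.randint(1, M)              # ~U[1, M]
    elif treatment == "constant":
        length = fixed_length                      # same for all agents
    else:  # "exponential", truncated to [1, M] (assumed handling)
        length = min(M, max(1, round(random.expovariate(1.0 / mean_length))))
    deadline_values = [random.randint(1, 50) for _ in range(M)]  # ~U[1, 50]
    return length, deadline_values
```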
Uniform Job Length – Figure 1
- Representation of the payoff matrix for the restricted game with strategies {18, 18, 18}, {18, 18, 19}, ..., {20, 20, 20}.
Uniform Job Length – Figure 2
- Running the payoff matrix through the replicator dynamics, the population evolves to everyone playing 20, which is a NE.
- This can be deduced from the payoff matrix, where the all-20 profile scores highest.
- 20 is a dominant strategy for this game and the only NE.
Uniform Job Length – Figure 3
- The replicator dynamics converge to the same NE independent of the initial populations.
Constant Job Length – Figure 4
- Fix λj = 2 for all j and consider the set of strategies {16, 17, 18, 19, 20}.
- The evolutionary dynamics converge to {0.745, 0.255, 0, 0, 0}: a mixed-strategy NE.
Constant Job Length – Figure 5
- Fix λj = 2 for all j and consider the set of strategies {0, 8, 12, 16, 20}.
- The evolutionary dynamics converge to {0, 0, 0, 1, 0}, confirmed by GAMBIT.
Exponential Job Length – Figure 6
- Consider the set of strategies {16, 17, 18, 19, 20}.
- The evolutionary dynamics converge to {0, 1, 0, 0, 0}.
- GAMBIT finds no unique equilibrium.
Varying the Number of Players: Two Agents
- Exponential preferences. Investigate strategies {0, 3, 6, 8, 10, 11, ..., 17, 18, 20}: 105 profiles.
- NE: everybody plays 15. GAMBIT: this is one of three symmetric equilibria.
Discussion
- For 8 and 10 players, k = 1 is dominant.
- With exponential preferences, the equilibrium k value is monotone in the number of agents N.
Sensitivity Analysis
- Are the equilibria found robust, or would they change with further sampling?
- There is a probability distribution over each of the expected payoffs in the payoff matrix.
- Sampling from these distributions independently, generate new payoff matrices and check whether the equilibrium persists.
- Several of the results presented are impervious to sampling noise!
Best Response to SB
- We cannot derive an unrestricted characterization of equilibrium behavior in the full strategy space, but we can find restricted equilibria in selected environments by simulation.
- Relax the restriction for one agent while still constraining the others: if all agents except one play SB, what is the best response? Can it be characterized as a variant of SB?
Best Response to SB (2)
- Even relatively simple scenarios with one or two SB agents can call for rather sophisticated bidding strategies.
- Be skeptical that any simple strategy form will capture general situations where information revelation is pivotal.
Conclusion
- It is difficult to draw conclusions about strategy choices in even a relatively simple simultaneous ascending auction game; there are no analytic methods.
- Coordinating the allocation of all significantly related resources in the world through a single mechanism is infeasible. (Hence simultaneous auctions!)
- For particular environments, it is possible to derive constrained equilibria through search.
Some Pointers
- Trading Agent Competition (TAC): http://tac.eecs.umich.edu/
- Wellman, Walsh, Wurman, MacKie-Mason, "Auction Protocols for Decentralized Scheduling", Games and Economic Behavior 35, 271-303 (2001)
- Reeves, Wellman, MacKie-Mason, Osepayshvili, "Exploring Bidding Strategies for Market-Based Scheduling", ACM Conference on Electronic Commerce, 2003