Introduction to Collectives
Kagan Tumer, NASA Ames Research Center
kagan@ptolemy.arc.nasa.gov
http://ic.arc.nasa.gov/~kagan
http://ic.arc.nasa.gov/projects/COIN/index.html
(Joint work with David Wolpert)
11/01, K. Tumer

Outline (CDCS 2002, K. Tumer)
• Introduction to collectives
– Definition / Motivation
– A naturally occurring example
• Illustration of theory of collectives I
– Central equation of collectives
• Interlude 1:
– Autonomous defects problem (Johnson and Challet)
• Illustration of theory of collectives II
– Aristocrat utility
– Wonderful life utility
• Interlude 2:
– El Farol bar problem: System equilibria and global optima
– Collective of rovers: Scientific return maximization
• Final thoughts

Motivation
• Most complex systems not only can be, but need to be, viewed as collectives. Examples include:
– Control of a constellation of communication satellites
– Routing data/vehicles over a communication network/highway
– Dynamic data migration over large distributed databases
– Dynamic job scheduling across a (very) large computer grid
– Coordination of rovers/submersibles on Mars/Europa
– Control of the elements of an amorphous computer/telescope
– Construction of parallel algorithms for optimization problems
– Autonomous defects problem

Collectives
• A collective is:
– A (perhaps massive) set of agents;
– All of which have “personal” utilities they are trying to achieve;
– Together with a world utility function measuring the full system’s performance.
• Given that the agents are good at optimizing their personal utilities, the crucial problem is an inverse problem: how should one set (and potentially update) the personal utility functions of the agents so that they “cooperate unintentionally” and optimize the world utility?

Natural Example: Human Economy
• World utility is GDP
– Agents are the individual humans
– Agents try to maximize their own “personal” utilities
• Design problem is:
– How to modify personal utilities of the agents through incentives or regulations (e.g., tax breaks, SEC regulations against insider trading, antitrust laws) to achieve high GDP?
– Note: A. Greenspan does not tell each individual what to do.
• Economics is hamstrung by “pre-set agents”
– No such restrictions for an artificial collective

Nomenclature
h : an agent
z : state of all agents across all time
z_{h,t} : state of agent h at time t
z_{^h,t} : state of all agents other than h at time t
[Figure: diagram relating z_t, z_{h,t}, and z_{^h,t}]

Key Concepts for Collectives
• Factoredness: Degree to which an agent’s personal utility is aligned with the world utility (e.g., quantifies the “if you get rich, the world benefits” concept).
• Learnability: Signal-to-noise measure. Quantifies how sensitive an agent’s personal utility function is to a change in its state.
• Intelligence: Percentage of states that would have resulted in agent h having a worse utility (e.g., an SAT-like percentile concept).
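The percentile reading of intelligence can be made concrete with a small sketch. This is an illustration only: the function name `intelligence`, the use of a strict inequality for "worse", and the toy utility are assumptions, not definitions taken from the slides.

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def intelligence(utility: Callable[[S], float], actual: S, alternatives: Sequence[S]) -> float:
    """Fraction of alternative states whose utility is strictly worse than the actual state's.

    Assumption: ties are not counted as "worse"; the slides do not say how ties are handled.
    """
    actual_value = utility(actual)
    worse = sum(1 for s in alternatives if utility(s) < actual_value)
    return worse / len(alternatives)


def toy_utility(action: int) -> float:
    """Hypothetical utility for an agent choosing an integer action; it peaks at action 3."""
    return -float((action - 3) ** 2)


# The agent chose action 3; the other four candidate actions are all worse.
score = intelligence(toy_utility, 3, range(5))
print(score)  # 0.8
```

An agent at the best available state scores close to 1, and an agent at the worst scores 0, which is the sense in which the measure behaves like a percentile.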

Central Equation of Collectives
• Our ability to control the system consists of setting some parameters s (e.g., the agents' goals):
[Equation; its terms are labeled Learnability, Explore vs. Exploit, and Factoredness, with the associated fields Operations Research, Economics, and Machine Learning]
– ε_G and ε_g are the intelligences of the agents w.r.t. the world utility (G) and their personal utilities (g), respectively
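The equation itself is an image on the slide and is absent from this transcript. As a hedged reconstruction, the form given in Wolpert and Tumer's published papers on collectives is the following; the pairing of labels to terms in the comments is our reading of those papers and should be checked against the original slide.

```latex
% Reconstruction from the published literature, not from this transcript.
% Term 1: P(G | eps_G, s)      -- explore vs. exploit (operations research)
% Term 2: P(eps_G | eps_g, s)  -- factoredness (economics)
% Term 3: P(eps_g | s)         -- learnability (machine learning)
P(G \mid s) \;=\; \int d\vec{\epsilon}_G \; P(G \mid \vec{\epsilon}_G, s)
              \int d\vec{\epsilon}_g \; P(\vec{\epsilon}_G \mid \vec{\epsilon}_g, s)\,
              P(\vec{\epsilon}_g \mid s)
```

Read right to left: choosing s well makes high personal intelligences likely (term 3), makes high personal intelligence imply high intelligence with respect to G (term 2), and makes high intelligence with respect to G imply a high value of G (term 1).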

Autonomous Defects Problem
• Given a collection of faulty devices, how does one choose the subset of those devices that, when combined with each other, gives optimal performance (Johnson & Challet)?
a_j : distortion of component j
n_k : action of agent k (n_k = 0 or 1)
• Collective approach: identify each agent with a component.
• Question: what utility should each agent try to maximize?
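The slide defines a_j and n_k, but the formula that combines them is not present in this transcript. The sketch below assumes one common formulation of the problem, in which the aggregate distortion of the chosen subset is |Σ_j n_j a_j| / Σ_k n_k and the world utility is its negative. That form, the function names, and the treatment of the empty subset are all assumptions made for illustration. The brute-force search is only feasible for very small N; it is shown to make the objective concrete, not as the collective approach itself.

```python
from itertools import product
from typing import Sequence, Tuple


def world_utility(distortions: Sequence[float], actions: Sequence[int]) -> float:
    """Assumed world utility: negated magnitude of the mean distortion of chosen components."""
    chosen = [a for a, n in zip(distortions, actions) if n == 1]
    if not chosen:
        # Assumption: choosing no component at all is treated as the worst outcome.
        return float("-inf")
    return -abs(sum(chosen)) / len(chosen)


def best_subset(distortions: Sequence[float]) -> Tuple[int, ...]:
    """Exhaustively search all 2^N action vectors (only practical for small N)."""
    candidates = product((0, 1), repeat=len(distortions))
    return max(candidates, key=lambda n: world_utility(distortions, n))


# Hypothetical distortions: the first three components cancel exactly.
a = [3.0, -1.0, -2.0, 0.5]
print(best_subset(a))  # (1, 1, 1, 0)
```

With N = 100 or N = 1000 components the search space has 2^N entries, which is why each component is instead treated as a learning agent with its own utility.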

[Figure: Autonomous Defects Problem (N=100)]

[Figure: Autonomous Defects Problem (N=1000)]

[Figure: Autonomous Defects Problem: Scaling]

Personal Utility
• Recall the central equation:
[Equation; terms labeled Factoredness and Learnability]
• Solve for the personal utility g that maximizes learnability, while constrained to the set of factored utilities

Aristocrat Utility
• One can solve for a factored U with maximal learnability, i.e., a U with good terms 2 and 3 in the central equation:
[equation not recovered]
• Intuitively, AU reflects the difference between the actual G and the average G (averaged over all actions you could take).
• For simplicity, when evaluating AU here, we make the following approximation:
p_i(z_h) = 1 / (number of possible actions for h)

• Clamping parameter CL_h^v: replace h’s state (taken to be a unary vector) with the constant vector v
• Clamping creates a new “virtual” worldline
• In general, v need not be a “legal” state for h
• Example: four agents, three actions. Agent h_2 clamps to the “average action” vector a = (.33, .33, .33)
[Figure: state matrices illustrating the clamping example]

Wonderful Life Utility
• The Wonderful Life Utility (WLU) for h is given by:
[equation not recovered]
– Clamping to the “null” action (v = 0) removes the player from the system (hence the name).
– Clamping to the “average” action disturbs the overall system minimally (can be viewed as an approximation to AU).
– Theorem: WLU is factored regardless of v
– Intuitively, WLU measures the impact of agent h on the world:
• Difference between the world as it is and the world without h
• Difference between the world as it is and the world where h takes the average action
– WLU is a “virtual” operation. The system is not re-evolved.
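A minimal sketch of clamping, WLU, and the uniform-probability approximation to AU, following the verbal descriptions on the slides (world as it is, minus world with h clamped; actual G minus G averaged over h's actions). The list-of-rows state representation, the function names, and the crowding-penalty `toy_G` are assumptions chosen for illustration; the slides' own formulas are not in this transcript.

```python
from typing import Callable, List, Sequence

Worldline = List[List[float]]  # one row per agent; each row is that agent's action vector


def clamp(z: Worldline, h: int, v: Sequence[float]) -> Worldline:
    """Return a copy of z in which agent h's row is replaced by the fixed vector v."""
    clamped = [list(row) for row in z]
    clamped[h] = list(v)
    return clamped


def wlu(G: Callable[[Worldline], float], z: Worldline, h: int, v: Sequence[float]) -> float:
    """World utility as it is, minus world utility with agent h clamped to v."""
    return G(z) - G(clamp(z, h, v))


def au_uniform(G: Callable[[Worldline], float], z: Worldline, h: int) -> float:
    """Actual G minus G averaged over h's possible actions, each weighted equally."""
    n_actions = len(z[h])
    total = 0.0
    for a in range(n_actions):
        one_hot = [1.0 if i == a else 0.0 for i in range(n_actions)]
        total += G(clamp(z, h, one_hot))
    return G(z) - total / n_actions


def toy_G(z: Worldline) -> float:
    """Hypothetical world utility: negative sum of squared per-action counts (penalizes crowding)."""
    counts = [sum(col) for col in zip(*z)]
    return -sum(c * c for c in counts)


# Four agents, three actions: agents 0-2 pick action 0, agent 3 picks action 1.
z = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
null_v = [0.0, 0.0, 0.0]       # "null" action: agent removed from the system
avg_v = [1 / 3, 1 / 3, 1 / 3]  # "average" action; not a legal unary state

wlu_null = wlu(toy_G, z, 0, null_v)
wlu_avg = wlu(toy_G, z, 0, avg_v)
au = au_uniform(toy_G, z, 0)
```

Because the clamped term does not depend on what agent h actually did, any change in h's action changes WLU by exactly the same amount as it changes G, for any v. That is the intuition behind the factoredness theorem, and the computation never re-runs the system, only re-evaluates G on a modified state.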

El Farol Bar Problem
• Congestion game: A game where agents share the same action space, and world utility is a function purely of how many agents take each action.
• Illustrative example: Arthur’s El Farol bar problem:
– At each time step, each agent decides whether to attend a bar:
• If an agent attends and the bar is below capacity, the agent gets a reward
• If an agent stays home and the bar is above capacity, the agent gets a reward
– The problem is particularly interesting because rational agents cannot all correctly predict attendance:
• If most agents predict attendance will be low and therefore attend, attendance will be high
• If most agents predict high attendance and therefore do not attend …

Modified El Farol Bar Problem
• Each week agents select one of seven nights to attend a bar
[Equations; quantities defined: attendance for night k at week t; reward for night k at week t; capacity of bar; R_t: reward for week t]
• Further modifications:
– Each week each agent selects two nights to attend the bar.
– ...
– Each week each agent selects six nights to attend the bar.

Personal Utility Functions
• Two conventional utilities:
– Uniform Division (UD): Divide each night’s total reward among all agents that attended that night (the “natural” reward)
– Team Game (TG): Total world reward at time t (R_t)
• Three collective-based utilities:
– WL_0 : WL utility with clamping parameter set to a vector of 0s (world utility minus “world utility without me”)
– WL_1 : WL utility with clamping parameter set to a vector of 1s (world utility minus “world utility where I attend every night”)
– WL_a : WL utility with clamping parameter set to the vector of average action (world utility minus “world utility where I do what is ‘expected of me’”)
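The utilities above can be sketched for the attend-one-night variant. The slide's reward formula is not in this transcript, so the per-night reward x · exp(-x / c) used below is an assumption (it is the form we recall from Wolpert and Tumer's bar-problem papers), as are all function names, the attendance numbers, and the choice of a uniform 1/7 vector for the "average action".

```python
import math
from typing import Sequence


def night_reward(x: float, c: float) -> float:
    """Assumed reward for one night with attendance x and capacity parameter c."""
    return x * math.exp(-x / c)


def world_reward(attendance: Sequence[float], c: float) -> float:
    """Team Game (TG) utility: total reward summed over all nights."""
    return sum(night_reward(x, c) for x in attendance)


def uniform_division(attendance: Sequence[float], night: int, c: float) -> float:
    """UD utility: that night's reward split equally among its attendees."""
    return night_reward(attendance[night], c) / attendance[night]


def wonderful_life(attendance: Sequence[float], night: int, v: Sequence[float], c: float) -> float:
    """WL utility: world reward minus world reward with this agent's action clamped to v.

    The agent's real contribution (1 on `night`) is removed and v is added in its place.
    """
    clamped = [
        x - (1.0 if k == night else 0.0) + v_k
        for k, (x, v_k) in enumerate(zip(attendance, v))
    ]
    return world_reward(attendance, c) - world_reward(clamped, c)


# Hypothetical week: 60 agents spread over seven nights, c = 3; our agent attended night 0.
attendance = [20, 10, 5, 5, 8, 6, 6]
c = 3.0
ud = uniform_division(attendance, 0, c)
tg = world_reward(attendance, c)
wl0 = wonderful_life(attendance, 0, [0.0] * 7, c)    # "world without me"
wl1 = wonderful_life(attendance, 0, [1.0] * 7, c)    # "I attend every night"
wla = wonderful_life(attendance, 0, [1 / 7] * 7, c)  # assumed uniform average action
```

Note that WL_0 depends only on the night the agent actually attended, since every other night's reward cancels in the difference. TG, by contrast, includes the effect of all 60 agents' choices, which is the kind of noise the learnability concept is meant to capture.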

[Figure: Bar Problem: Utility Comparison (attend one night, 60 agents, c=3)]

[Figure: Typical Daily Bar Attendance (c=6; t=1000 s; number of agents = 168)]

[Figure: Scaling Properties (attend one night); c=2, 3, 4, 6, 8, 10, 15, respectively]

[Figure: Performance vs. # of Nights to Attend; 60 agents; c=3, 6, 8, 10, 12, 15, respectively]

Collectives of Rovers
• Design a collective of autonomous agents to gather scientific information (e.g., rovers on Mars, submersibles under Europa)
– Some areas have more valuable information than others
– World utility: total importance-weighted information collected
– Both the individual rovers and the collective need to be flexible so they can adapt to new circumstances
– Collective-based payoff utilities result in better performance than more “natural” approaches

World Utility
• Token value function: [equation not recovered]
– L : location matrix for all agents
– L_h : location matrix of agent h
– L_{h,t}^a : location matrix of agent h at time t, had it taken action a at t-1
– Q : initial token configuration
• World utility: [equation not recovered]
• Note: Agents’ payoff utilities reduce to figuring out what “L” to use.

Payoff Utilities
• Selfish utility: [equation not recovered]
• Team game utility: [equation not recovered]
• Collectives-based utility (theoretical): [equation not recovered]
• Collectives-based utility (practical): [equation not recovered]

[Figure: Utility Comparison in Rover Domain; 100 rovers on a 32 x 32 grid]

[Figure: Scaling Properties in Rover Domain]

Summary
• Given a world utility, deploying RL algorithms provides a solution to the distributed design problem. But what utilities does one use?
• The theory of collectives shows how to configure and/or update the personal utilities of the agents so that they “unintentionally cooperate” to optimize the world utility
• Personal utilities based on collectives have been successfully applied to many domains (e.g., autonomous rovers, constellations of communication satellites, data routing, autonomous defects)
• Performance gains due to using collectives-based utilities increase with the size of the problem
• A fully fleshed-out science of collectives would benefit from, and have applications to, many other sciences