Constant-Factor Approximation Algorithms for Identifying Dynamic Communities
Chayant Tantipathananandh with Tanya Berger-Wolf

Social Networks
These are snapshots, and networks change over time.

Dynamic Networks
• Interactions occur in the form of disjoint groups
• Groups are not communities
[Figure: groups of individuals 1 to 5 at time steps t=1, t=2, …, next to the aggregated network]
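To make the input format concrete, here is a minimal Python sketch of a dynamic network stored as one list of disjoint groups per time step, together with the aggregated network it collapses to. The toy data and the names `snapshots`, `check_disjoint`, and `aggregate` are illustrative assumptions, not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each time step is a list of disjoint groups.
snapshots = [
    [{1, 2, 3}, {4, 5}],  # t = 1
    [{1, 2}, {3, 4, 5}],  # t = 2
    [{1, 2, 3}, {5}],     # t = 3 (individual 4 is not observed)
]

def check_disjoint(groups):
    """Return True if no individual appears in two groups of one time step."""
    seen = set()
    for g in groups:
        if seen & g:
            return False
        seen |= g
    return True

def aggregate(snapshots):
    """Count, for each pair of individuals, the time steps in which they share a group."""
    weights = Counter()
    for groups in snapshots:
        for g in groups:
            for u, v in combinations(sorted(g), 2):
                weights[(u, v)] += 1
    return weights

assert all(check_disjoint(groups) for groups in snapshots)
agg = aggregate(snapshots)
print(agg[(1, 2)])  # number of time steps in which 1 and 2 share a group
```

The aggregated weights discard the order of the time steps, which is the information the dynamic view keeps.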

Communities
• What is a community? “Cohesive subgroups are subsets of actors among whom there are relatively strong, direct, intense, frequent, or positive ties.” [Wasserman & Faust 1994]
• Dynamic Community Identification
  – GraphScope [Sun et al 2005]
  – Metagroups [Berger-Wolf & Saia 2006]
  – Dynamic Communities [TBK 2007]
  – Clique Percolation [Palla et al 2007]
  – FacetNet [Lin et al 2009]
  – Bayesian approach [Yang et al 2009]

Ship of Theseus (from Wikipedia)
“The ship … was preserved by the Athenians …, for they took away the old planks as they decayed, putting in new and stronger timber in their place, insomuch that this ship became a standing example among the philosophers, for the logical question of things that grow; one side holding that the ship remained the same, and the other contending that it was not the same.” [Plutarch, Theseus]
Jeannot's knife “has had its blade changed fifteen times and its handle fifteen times, but is still the same knife.” [French story]

Ship of Theseus
• Individual parts never change identities
• Cost for changing identity

Ship of Theseus
• Identity changes to match the group
• Costs for visiting and being absent

Approach

Community = Color
Valid coloring: in each time step, different groups have different colors.
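The validity condition can be checked mechanically. A minimal sketch, assuming the group colors are given as one list per time step (the name `is_valid_coloring` and the input layout are illustrative choices):

```python
def is_valid_coloring(group_colors):
    """group_colors[t] lists the color of each group at time step t.

    The coloring is valid when no color repeats within a time step.
    """
    return all(len(colors) == len(set(colors)) for colors in group_colors)

print(is_valid_coloring([[1, 2], [2, 1], [1]]))  # colors differ within every step
print(is_valid_coloring([[1, 1], [2, 1]]))       # color 1 is used twice at the first step
```

A color may be reused across time steps; only repeats inside a single time step are ruled out.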

Interpretation
Group color: How does community c interact at time t?

Interpretation
Individual color: Who belongs to community c at time t?

Social Costs: Conservatism
• Switching cost α
• Absence cost β₁
• Visiting cost β₂
[Figure: colored individuals over time, with color switches marked α]

Social Costs: Loyalty
• Switching cost α
• Absence cost β₁
• Visiting cost β₂
[Figure: colored groups and individuals, with absences marked β₁]

Social Costs: Loyalty
• Switching cost α
• Absence cost β₁
• Visiting cost β₂
[Figure: colored groups and individuals, with visits marked β₂]
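The three costs can be combined into one objective. The sketch below is a hedged reading, not the paper's exact definition: it assumes an individual pays α whenever its color differs from its color at the previous time step, β₁ whenever a group of its color exists at that time step and it is not in that group, and β₂ whenever it is observed in a group of a different color. The function name `total_cost`, the input layout, and the toy data are all hypothetical.

```python
def total_cost(snapshots, group_colors, ind_colors, alpha, beta1, beta2):
    """Sum switching, absence, and visiting costs under the assumptions above.

    snapshots[t]    : list of disjoint groups (sets of individuals)
    group_colors[t] : color of each group at time t (valid coloring assumed)
    ind_colors[t]   : dict mapping individual -> color at time t
    """
    cost = 0
    for t, groups in enumerate(snapshots):
        index_of_color = {c: k for k, c in enumerate(group_colors[t])}
        group_of = {i: k for k, g in enumerate(groups) for i in g}
        for i, c in ind_colors[t].items():
            # Switching: color differs from the previous time step.
            if t > 0 and ind_colors[t - 1].get(i, c) != c:
                cost += alpha
            home = index_of_color.get(c)  # group wearing i's color, if any
            here = group_of.get(i)        # group i is observed in, if any
            # Absence: the home group exists but i is not in it.
            if home is not None and here != home:
                cost += beta1
            # Visiting: i is observed in a group of another color.
            if here is not None and here != home:
                cost += beta2
    return cost

# Hypothetical example: individual 3 moves from the red group to the blue group.
snapshots = [[{1, 2, 3}, {4, 5}], [{1, 2}, {3, 4, 5}]]
group_colors = [["red", "blue"], ["red", "blue"]]
stay_red = [{1: "red", 2: "red", 3: "red", 4: "blue", 5: "blue"}] * 2
switch = [stay_red[0], {1: "red", 2: "red", 3: "blue", 4: "blue", 5: "blue"}]

print(total_cost(snapshots, group_colors, stay_red, 1, 1, 1))  # absence + visiting
print(total_cost(snapshots, group_colors, switch, 1, 1, 1))    # one switch
```

With unit costs the switch is cheaper; raising α to 3 makes staying red (absence plus visiting) the cheaper explanation, which is the conservatism versus loyalty trade-off the slides describe.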

Problem Complexity
• Minimizing total cost is hard: NP-complete and APX-hard [with Berger-Wolf and Kempe 2007]
• Constant-Factor Approximation [details in paper]
• Easy special case: if there are no missing individuals and 2α ≤ β₂, then simply weighted bipartite matching [details in paper]

Greedy Approximation
No visiting or absence, and minimizing switching
[Figure: example along a time axis]

Greedy Approximation
• No visiting or absence and minimizing switching ≈ maximizing path coverage
• Greedy algorithm guarantees max{2, 2α/β₁, 4α/β₂}: a bound in α, β₁, β₂, independent of input size
• Improvement by dynamic programming
[Figure: numeric example along a time axis]
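The slide gives only the idea that the greedy step resembles maximizing path coverage. The sketch below is a simplified illustration of that idea, not the algorithm from the paper. It assumes a group graph whose edges join groups in consecutive time steps and are weighted by the number of shared members, accepts the heaviest edges first as long as every group keeps at most one successor and one predecessor, and then gives each resulting path its own color. The name `greedy_path_cover` and the toy data are hypothetical.

```python
def greedy_path_cover(snapshots):
    """Color groups by greedily chaining them into paths over time.

    Returns a dict mapping (time step, group index) -> color.
    """
    # Candidate edges between consecutive time steps, weighted by overlap.
    edges = []
    for t in range(len(snapshots) - 1):
        for a, g in enumerate(snapshots[t]):
            for b, h in enumerate(snapshots[t + 1]):
                w = len(g & h)
                if w > 0:
                    edges.append((w, (t, a), (t + 1, b)))

    # Heaviest edges first; the stable sort keeps time order among ties.
    edges.sort(key=lambda e: -e[0])

    succ, pred = {}, {}
    for _, u, v in edges:
        if u not in succ and v not in pred:
            succ[u] = v
            pred[v] = u

    # Every group without a predecessor starts a new path, that is, a new color.
    colors, next_color = {}, 0
    for t, groups in enumerate(snapshots):
        for k in range(len(groups)):
            node = (t, k)
            if node in pred:
                continue
            while node is not None:
                colors[node] = next_color
                node = succ.get(node)
            next_color += 1
    return colors

# Hypothetical toy data.
snapshots = [
    [{1, 2, 3}, {4, 5}],
    [{1, 2}, {3, 4, 5}],
    [{1, 2, 3}, {5}],
]
colors = greedy_path_cover(snapshots)
print(colors)
```

Because edges only point forward in time, each path contains at most one group per time step, so the resulting group coloring is valid in the sense defined earlier.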

Southern Women Data Set [DGG 1941]
• 18 individuals, 14 time steps
• Collected in Natchez, MS, 1935
[Figure: aggregated network]

Ethnography [DGG 1941]
[Figure: ethnography with core members labeled “Core”; note: columns not ordered by time]

Optimal Communities
• All costs equal
• White circles = unknown
[Figure: optimal coloring, individuals against time, shown next to the ethnography with core members labeled “Core”]

Approximate and Optimal
[Figure: approximate and optimal colorings over time, shown next to the ethnography with core members labeled “Core”]

Approximation Power
[Figure: three panels titled Grevys, Onagers, and Plains, each plotting total cost (k) against the switch/visit cost ratio (1/3, 1/2, 1/1, 2/1, 3/1). Series: OPT≥, Greedy, Greedy+DP, ≤Guarantee. Data set sizes shown under the panels: 28 inds, 44 times; 29 inds, 82 times; 313 inds, 758 times.]

Approximation Power
[Figure: three panels titled Haggle Infocom (41), Haggle Infocom (264), and Reality Mining, each plotting total cost (k) against the switch/visit cost ratio (1/3, 1/2, 1/1, 2/1, 3/1). Series: OPT≥, Greedy+DP, Greedy, ≤Guarantee. Data set sizes shown under the panels: 41 inds, 418 times; 264 inds, 425 times; 96 inds, 1577 times.]
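One plausible reading of the “OPT≥” series is the lower bound implied by the guarantee max{2, 2α/β₁, 4α/β₂} from the Greedy Approximation slide: if the greedy cost is at most ρ times the optimum, then the optimum is at least the greedy cost divided by ρ. The sketch below evaluates ρ for the switch/visit ratios on the x-axis. Setting β₁ = β₂ equal to the visit cost is an assumption made only for illustration, and the function names are hypothetical.

```python
def guarantee(alpha, beta1, beta2):
    """Approximation ratio stated on the slide: max{2, 2α/β₁, 4α/β₂}."""
    return max(2, 2 * alpha / beta1, 4 * alpha / beta2)

def opt_lower_bound(greedy_cost, alpha, beta1, beta2):
    """If greedy_cost <= ratio * OPT, then OPT >= greedy_cost / ratio."""
    return greedy_cost / guarantee(alpha, beta1, beta2)

# Switch/visit ratios from the plots; beta1 = beta2 = visit is an assumption.
for switch, visit in [(1, 3), (1, 2), (1, 1), (2, 1), (3, 1)]:
    rho = guarantee(alpha=switch, beta1=visit, beta2=visit)
    print(f"switch/visit = {switch}/{visit}: greedy cost is at most {rho:g} x OPT")
```

Under this assumption the bound stays at 2 while switching is cheap relative to visiting and grows linearly once α dominates, which is why a guarantee independent of α, β₁, β₂ is listed as future work.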

Conclusions
• Identity of objects that change over time (Ship of Theseus Paradox)
• Formulate an optimization problem
• Greedy approximation
  – Fast
  – Near-optimal
• Future Work
  – Algorithm with guarantee not depending on α, β₁, β₂
  – Network snapshots instead of disjoint groups

Thank You
NSF grant, KDD student travel award
David Kempe, Jared Saia, Chayant, Mayank Lahiri, Arun Maiya, Ilya Fischoff, Tanya Berger-Wolf, Habiba, Saad Sheikh, Dan Rubenstein, Siva Sundaresan, Robert Grossman, Anushka Anand, Rajmonda Sulo