Social Networks: Strong and Weak Ties. These slides are based on Chapter 3 of Networks, Crowds, and Markets, by David Easley and Jon Kleinberg. Available online at: http://www.cs.cornell.edu/home/kleinber/networks-book/

How do people find jobs? In the 1960s, Granovetter interviewed people who had recently found jobs. Many of them found their jobs through personal contacts, usually acquaintances rather than close friends. Why? We come back to this soon, after we explore social networks a bit more.

How do new links develop in a social network? Principle of Triadic Closure: if 2 people have a friend in common, then there is an increased likelihood that they will become friends in the future. [Figure: closing a triangle in an example network.] Why do you think this naturally occurs?

Measuring Triadic Closure. The clustering coefficient of a node A is the probability that 2 randomly chosen friends of A are connected, i.e., the fraction of pairs of A’s friends that are connected to each other. [Figure: example network.] What is A’s clustering coefficient here?
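
The definition above can be sketched in a few lines of Python. This is only an illustration: the adjacency-set representation, the function name `clustering_coefficient`, the convention of returning 0 when a node has fewer than two friends, and the toy graph are all assumptions (the toy graph is not the one drawn on the slide).

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Fraction of pairs of `node`'s friends that are themselves connected."""
    friends = graph[node]
    if len(friends) < 2:
        return 0.0  # assumed convention: no pairs of friends means coefficient 0
    pairs = list(combinations(friends, 2))
    linked = sum(1 for u, v in pairs if v in graph[u])
    return linked / len(pairs)

# Hypothetical toy graph: A has friends B, C, D; only B and C know each other.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
coefficient_of_a = clustering_coefficient(toy_graph, "A")  # 1 linked pair out of 3
```

Here A has three pairs of friends, (B, C), (B, D) and (C, D), and only the first pair is connected, so the coefficient is 1/3.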

Bridges. An edge from A to B is called a bridge if deleting it would cause A and B to lie in different components of the graph. Why are such social ties important in the real world? Do you think that bridges are common? [Figure: example network with nodes A through E.]

Is the edge (A, B) still a bridge? Is the edge (A, B) still important? [Figure: a larger example network containing the edge (A, B).]

Local Bridges. An edge from A to B is called a local bridge if A and B have no friends in common. [Figure: example network.] Can you find all bridges? Can you find all local bridges?
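
The two definitions (bridge and local bridge) can be checked mechanically. The sketch below is one possible way to do it, assuming an undirected graph stored as a dictionary of neighbor sets; the helper names and the toy edge list are hypothetical and do not reproduce the slide's figure.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency-set dictionary from a list of pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def is_local_bridge(graph, a, b):
    """Edge (a, b) is a local bridge if a and b have no friends in common."""
    return b in graph[a] and not (graph[a] & graph[b])

def is_bridge(graph, a, b):
    """Edge (a, b) is a bridge if deleting it disconnects a from b."""
    if b not in graph[a]:
        return False
    seen, queue = {a}, deque([a])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if {u, v} == {a, b}:
                continue  # pretend the edge (a, b) has been deleted
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return b not in seen

# Hypothetical toy network: two triangles joined by A-B and by the path D-G-E,
# plus a pendant node H attached to F.
toy = build_graph([
    ("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"),
    ("B", "E"), ("B", "F"), ("E", "F"),
    ("D", "G"), ("G", "E"), ("F", "H"),
])
```

In this toy network (A, B) is a local bridge but not a bridge, because the detour A, D, G, E, B survives its deletion, while (F, H) is both.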

Strong and Weak Ties. In life (and also on Facebook) some friendships are stronger and some are weaker. Suppose we color weak edges red and strong edges green. [Figure: example network with colored edges.]

Strong Triadic Closure. Node A violates the strong triadic closure property if it has strong ties to two nodes B and C and there is no edge at all (strong or weak) between B and C. Node A satisfies the property if it does not violate it. [Figure: example network.] Which nodes violate the strong triadic closure property?
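
A small sketch of the check, assuming strong and weak ties are kept in two separate adjacency dictionaries (a representation chosen here for illustration; the names `violates_stc`, `strong` and `weak` are hypothetical):

```python
from itertools import combinations

def violates_stc(strong, weak, node):
    """True if `node` has strong ties to two nodes with no edge between them."""
    for b, c in combinations(strong[node], 2):
        if c not in strong[b] and c not in weak[b]:
            return True
    return False

# Hypothetical example: A has strong ties to B and C, and B-C is missing.
strong = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
weak = {"A": set(), "B": set(), "C": set()}
before = violates_stc(strong, weak, "A")

# Adding even a weak B-C edge is enough to satisfy the property.
weak["B"].add("C")
weak["C"].add("B")
after = violates_stc(strong, weak, "A")
```

Note that B and C never violate the property in this example, since each has only one strong tie.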

Local Bridges and Weak Ties. Local notion: edges can be strong or weak. Global notion: edges can be local bridges or not. Claim: if node A satisfies the strong triadic closure property and has at least 2 strong ties, then any local bridge involving A is a weak tie. Proof: ?

Are Local Bridges Usually Weak Ties in Real Life? This was analyzed in a network of cell-phone calls. There is an edge between two cell-phone users if they made calls to each other in both directions over an 18-week period. Strength is defined as a function of the number of minutes spent on calls. True local bridges are not common, so consider “almost” local bridges. The neighborhood overlap of an edge (A, B) is the number of nodes that are neighbors of both A and B, divided by the number of nodes that are neighbors of at least one of A or B, not counting A and B themselves (a Jaccard coefficient).
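
As a rough illustration of the neighborhood overlap, here is a minimal sketch. It assumes an adjacency-set dictionary and returns 0 when neither endpoint has any other neighbor (an assumed convention); the toy graph is hypothetical.

```python
def neighborhood_overlap(graph, a, b):
    """Common neighbors of a and b divided by neighbors of at least one of them."""
    both = graph[a] & graph[b]
    either = (graph[a] | graph[b]) - {a, b}  # a and b themselves are not counted
    return len(both) / len(either) if either else 0.0

# Hypothetical toy graph.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"B"},
}
overlap_ab = neighborhood_overlap(toy_graph, "A", "B")  # C out of {C, D, E}
overlap_ad = neighborhood_overlap(toy_graph, "A", "D")  # no common neighbor
```

An overlap of 0 means the endpoints share no neighbor, which is exactly the local-bridge condition; small positive values correspond to the “almost” local bridges mentioned above.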

Neighborhood Overlap: Example. What is the neighborhood overlap of (A, F)? [Figure: example network.] When is an edge a local bridge?

What do you expect the relationship between the strength of an edge and its neighborhood overlap to be? [Figure: results on the cell-phone network.]

Tie Strength on Facebook. How strong are the friendships declared on Facebook? Researchers at Facebook (Marlow et al.) categorized edges. An edge represents mutual communication if both users sent messages to each other within the observation period. An edge represents one-way communication if the user sent a message to the other (regardless of whether it was reciprocated). An edge represents a maintained relationship if the user followed information about the friend. An edge can belong to several categories.

Example Facebook User. Where is triadic closure more common? Why?

Average number of edges. Even users with many friends communicate with, or maintain relationships with, only a few of them.

Roles of Nodes. So far we have discussed different types of edges. Now we consider the roles that nodes can take. What advantages does A have in this network? What advantages does B have in this network?

Regions. Can we recognize tightly knit regions (e.g., communities) in a social network? This is a graph partitioning problem. [Figure: co-authorship graph.]

Approaches to Graph Partitioning. Ideas?

Approaches to Graph Partitioning. Divisive methods: identify and remove “spanning links”; the network will fall apart into large pieces; continue on within each piece. Agglomerative methods: find closely knit regions forming chunks of the graph; combine closely related chunks; continue on. There are many algorithms. We will discuss the Girvan-Newman method.

Example

Betweenness. We discussed betweenness earlier as a measure of importance for nodes. Now we define the betweenness of an edge. For every pair of nodes A, B connected by a path, imagine one unit of “flow” travelling along the edges from A to B. The flow divides itself evenly among all shortest paths: if there are k shortest paths, then 1/k units of flow pass along each path. The betweenness of an edge is the total amount of flow it carries, summed over all pairs of nodes whose shortest paths use this edge.

Example. What is the betweenness of edge (7, 8)? What is the betweenness of edge (3, 7)? What is the betweenness of edge (1, 3)? What is the betweenness of edge (1, 2)?

The Girvan-Newman Method. Find the edge(s) with the highest betweenness value and remove them from the graph. If this causes the graph to disconnect, the resulting components are the first level of regions in the partitioning. Recalculate betweenness and again remove the edge(s) with the highest betweenness value. If this breaks components into smaller components, these are nested regions. Continue until all edges are removed.
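
The loop described on this slide can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the graph is an undirected adjacency-set dictionary, ties in betweenness are detected with a small floating-point tolerance, and the helper `edge_betweenness` uses the breadth-first flow computation that the later “Computing Betweenness” slides describe. All function names and the toy graph are hypothetical.

```python
from collections import deque

def edge_betweenness(graph):
    """Betweenness of every edge, summing BFS-based flow over all sources."""
    total = {}
    for s in graph:
        dist, paths, order = {s: 0}, {s: 1}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if v not in dist:
                    dist[v], paths[v] = dist[u] + 1, 0
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    paths[v] += paths[u]
        flow = {u: 1.0 for u in order}
        for v in reversed(order):
            for u in graph[v]:
                if dist[u] == dist[v] - 1:
                    share = flow[v] * paths[u] / paths[v]
                    edge = frozenset((u, v))
                    # Halve, since every pair of nodes is counted from both ends.
                    total[edge] = total.get(edge, 0.0) + share / 2
                    flow[u] += share
    return total

def components(graph):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in graph:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(graph[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman_levels(graph):
    """Remove highest-betweenness edges repeatedly; record each new split."""
    g = {u: set(vs) for u, vs in graph.items()}  # work on a copy
    levels, n_comps = [], len(components(g))
    while any(g.values()):
        bet = edge_betweenness(g)
        top = max(bet.values())
        for edge, value in bet.items():
            if abs(value - top) < 1e-9:
                u, v = tuple(edge)
                g[u].discard(v)
                g[v].discard(u)
        comps = components(g)
        if len(comps) > n_comps:
            n_comps = len(comps)
            levels.append(comps)
    return levels

# Hypothetical toy graph: two triangles joined by the single edge (3, 4).
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
levels = girvan_newman_levels(toy)
```

On the toy graph the edge (3, 4) carries the flow of all nine cross-triangle pairs, so it is removed first and the first level consists of the two triangles; the remaining edges then tie and are removed together.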

Example

Computing Betweenness. To compute the betweenness of an edge, we consider the graph from the perspective of a single node A. We compute how the flow from A to all other nodes is distributed over the edges. Afterwards we simply add up the flow computed for each edge from the perspective of each node. (Finally, divide by 2, since every pair of nodes is considered twice.)

Computing Betweenness (2). 3 steps to the computation: 1. Perform a breadth-first search from A to all other nodes. 2. Determine the number of shortest paths from A to each other node. 3. Based on (2), compute the amount of flow from A to all other nodes that uses each edge.
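
The three steps can be sketched for a single source node as below. The function name `flow_from` and the graph representation are assumptions. The toy edge list is also an assumption: the slide's figure is not reproduced here, so the edges were chosen only so that the path counts agree with the numbers quoted in the Step 3 slides (2/3 of the shortest paths to I come via F, and K is reached equally via I and J).

```python
from collections import deque

def flow_from(graph, source):
    """Flow carried by each edge when `source` sends one unit to every node."""
    # Steps 1 and 2: breadth-first search, counting shortest paths as we go.
    dist, paths, order = {source: 0}, {source: 1}, []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in dist:
                dist[v], paths[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]
    # Step 3: work upward from the farthest nodes, splitting the flow that
    # reaches each node among its parents in proportion to their path counts.
    node_flow = {u: 1.0 for u in order}
    edge_flow = {}
    for v in reversed(order):
        for u in graph[v]:
            if dist[u] == dist[v] - 1:
                share = node_flow[v] * paths[u] / paths[v]
                edge_flow[frozenset((u, v))] = share
                node_flow[u] += share
    return edge_flow

def build_graph(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

# Assumed toy graph (see the note above).
toy = build_graph([
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "F"), ("C", "F"), ("D", "G"), ("D", "H"), ("E", "H"),
    ("F", "I"), ("G", "I"), ("G", "J"), ("H", "J"),
    ("I", "K"), ("J", "K"),
])
flows = flow_from(toy, "A")
```

Running `flow_from` from every node and summing the per-edge results, then halving, gives the edge betweenness.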

Step 1: Breadth First Search

Step 2: Counting Number of Shortest Paths

Step 3: Computing Flow. First consider K. Only 1 unit of flow arrives at K. An equal number of shortest paths enter K from I and from J, so the flow is divided evenly: 0.5 units on each of (I, K) and (J, K).

Step 3: Computing Flow. Now consider I. There is 1 unit of flow destined for I itself, plus 0.5 units that flow through I to reach K, so in total 1.5 units of flow reach I. Since 2/3 of the shortest paths to I come via F, 2/3 * 1.5 = 1 unit of flow is on (F, I). Similarly, 1/3 * 1.5 = 0.5 units of flow are on (G, I). Continue in the same manner.

Social Networks in Their Surrounding Contexts. These slides are based on Chapter 4 of Networks, Crowds, and Markets, by David Easley and Jon Kleinberg. Available online at: http://www.cs.cornell.edu/home/kleinber/networks-book/

Beyond Nodes and Edges. So far, we have considered a social network as only a set of nodes and edges. However, these nodes represent people (or other types of entities) that have many additional properties. How do these properties affect the nature and structure of a network?

Homophily. Principle of Homophily: people tend to be similar to their friends. Think about it: does triadic closure encourage or discourage homophily?

Measuring Homophily. How can we discern whether a network exhibits homophily with respect to a given property? E.g., do people tend to be friends with others of the same race in the network? Do people tend to be friends with others of the same social status? Consider a property that has two values, where each person has exactly one of them, e.g., male or female.

Measuring Homophily. Suppose that a fraction p of the people in the network are male, while a fraction q = 1 - p are female. If nodes were randomly assigned “male” and “female”, then an edge would be cross-gender with probability 2pq. Homophily test: if the fraction of cross-gender edges is significantly less than 2pq, then there is evidence for homophily. (Significant = statistically significant.)
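
The comparison in the homophily test can be sketched as below. The function names, the labels and the toy edge list are hypothetical, and the sketch only computes the two quantities being compared; deciding whether the gap is statistically significant would need a proper test, which is not shown.

```python
def expected_cross_fraction(label, value):
    """2pq, where p is the fraction of nodes carrying `value`."""
    p = sum(1 for x in label.values() if x == value) / len(label)
    return 2 * p * (1 - p)

def observed_cross_fraction(edges, label):
    """Fraction of edges whose endpoints carry different labels."""
    cross = sum(1 for u, v in edges if label[u] != label[v])
    return cross / len(edges)

# Hypothetical network: three men, three women, one cross-gender edge.
label = {"m1": "male", "m2": "male", "m3": "male",
         "f1": "female", "f2": "female", "f3": "female"}
edges = [("m1", "m2"), ("m2", "m3"), ("m1", "m3"),
         ("f1", "f2"), ("f2", "f3"), ("m3", "f1")]

expected = expected_cross_fraction(label, "male")   # 2 * 0.5 * 0.5
observed = observed_cross_fraction(edges, label)    # 1 edge out of 6
```

Here the observed fraction (1/6) is well below 2pq (1/2), which would count as evidence for homophily if the difference is statistically significant for a network of this size.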

Example: Does this network exhibit homophily?

What Causes Homophily? Is Obesity Contagious? Selection: people choose friends similar to themselves; people live in areas of a homogeneous nature; etc. Social influence: people are influenced by their peers in their behavior; they join in their friends’ interests (e.g., sports); they join in their friends’ behavior (e.g., taking drugs, drinking). Why does it matter?

Social Networks: Positive and Negative Relationships. These slides are based on Chapter 5 of Networks, Crowds, and Markets, by David Easley and Jon Kleinberg. Available online at: http://www.cs.cornell.edu/home/kleinber/networks-book/

Structural Balance. Up until now, we assumed all relationships in the network were positive. However, we may have negative relationships in real life. Label edges with “+” and “-”. Do such networks exhibit certain characteristics? Do local effects have global consequences?

Simple Model. We will start by assuming that everyone knows everyone: the network is a clique (i.e., complete). Every edge is labeled “+” or “-”. Motivating examples: small groups of people; countries.

Relationships Between 3 People. Balanced: all three edges among A, B, C are “+”; or exactly one edge is “+” and the other two are “-”. Not balanced: exactly two edges are “+” and one is “-”; or all three edges are “-”. Which types of relationships are more likely?

Structural Balance. Having A, B, C who are all friends is natural; it also occurs due to triadic closure (all three edges “+”). Having A and B as friends who both dislike C is natural: they have a common enemy (e.g., Syria, Lebanon, Israel) (edge A-B “+”, edges A-C and B-C “-”).

Structural Balance (cont). Having A, B friends and A, C friends, while B and C dislike each other, is unstable: either A will try to get B and C to be friends, or B will try to turn A against C, or C will try to turn A against B. Having A, B, C all enemies is unstable: two will try to team up against the third.

Structural Balance (cont). What happens for larger networks? Structural Balance Property: a complete graph is structurally balanced if, for every set of 3 nodes A, B, C, either all three edges between them are labeled “+” or exactly one edge is labeled “+” (and the other two are labeled “-”).
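
The property can be checked directly by looking at every triangle and counting its “+” edges: a triangle is balanced when it has three or exactly one. The sketch below is illustrative; storing signs in a dictionary keyed by `frozenset` pairs and the names used are assumptions.

```python
from itertools import combinations

def is_balanced_complete(nodes, sign):
    """Check every triangle of a complete signed graph for balance."""
    for a, b, c in combinations(nodes, 3):
        plus = sum(sign[frozenset(pair)] == "+"
                   for pair in ((a, b), (a, c), (b, c)))
        if plus not in (1, 3):  # balanced triangles have 3 or exactly 1 "+" edge
            return False
    return True

# Hypothetical example: two camps {A, B} and {C, D} that dislike each other.
nodes = ["A", "B", "C", "D"]
camps = [{"A", "B"}, {"C", "D"}]
sign = {
    frozenset((u, v)): "+" if any(u in c and v in c for c in camps) else "-"
    for u, v in combinations(nodes, 2)
}
two_camps_balanced = is_balanced_complete(nodes, sign)

# Making A and C friends creates the triangle A, B, C with two "+" edges.
unbalanced_sign = dict(sign)
unbalanced_sign[frozenset(("A", "C"))] = "+"
after_flip_balanced = is_balanced_complete(nodes, unbalanced_sign)
```

This brute-force check looks at every triple of nodes, which is fine for small examples.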

Example: Are these Balanced? [Figure: Network 1 and Network 2.]

What do Balanced Networks Look Like? 1. All nodes are friends. 2. Nodes can be divided into 2 sets X, Y such that everyone in X likes each other, everyone in Y likes each other, and all nodes from X dislike all nodes from Y (and vice versa). 3. Any other options?

Balance Theorem. If a complete graph is balanced, then either all nodes are friends, or the nodes can be divided into two groups X, Y such that every pair of nodes in X like each other, every pair of nodes in Y like each other, and everyone in X dislikes everyone in Y. Proof: suppose that G is balanced. We will show it satisfies the required properties. If G has only “+” edges, then the claim is obvious. So suppose G has at least one “-” edge.

Proof (cont). We must find sets X, Y. Let A be a node in the graph. Since the graph is complete, there is an edge from A to every other node. Define X = A together with all friends of A. Define Y = all enemies of A. We must show that: 1. Every 2 nodes in X are friends. 2. Every 2 nodes in Y are friends. 3. Every node in X is an enemy of every node in Y.
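
The construction in the proof translates into a short procedure: pick any node, split the others by the sign of their edge to it, then verify the three required properties. The sketch below is an illustration with hypothetical names; it takes A to be the first node in the list and puts A itself in X.

```python
from itertools import combinations

def split_by_first_node(nodes, sign):
    """X = the first node plus its friends, Y = its enemies."""
    a = nodes[0]
    x = {a} | {n for n in nodes[1:] if sign[frozenset((a, n))] == "+"}
    y = set(nodes) - x
    return x, y

def division_is_consistent(x, sign):
    """Friends must be on the same side, enemies on opposite sides."""
    for pair, label in sign.items():
        u, v = tuple(pair)
        same_side = (u in x) == (v in x)
        if same_side != (label == "+"):
            return False
    return True

# Hypothetical balanced example: camps {A, B} and {C, D, E}.
nodes = ["A", "B", "C", "D", "E"]
camps = [{"A", "B"}, {"C", "D", "E"}]
sign = {
    frozenset((u, v)): "+" if any(u in c and v in c for c in camps) else "-"
    for u, v in combinations(nodes, 2)
}
x, y = split_by_first_node(nodes, sign)
ok = division_is_consistent(x, sign)
```

For a balanced complete graph the verification always succeeds, which is what the theorem asserts; for an unbalanced one, `division_is_consistent` returns `False` for the split obtained this way.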

Proof (cont) Illustration of why all 3 required properties hold

Is All Imbalance Equally Unlikely? It can be argued that while the arrangement with two “+” edges and one “-” edge (A-B “+”, A-C “+”, B-C “-”) is hard to sustain for a long time, the arrangement in which A, B, C are all mutual enemies (three “-” edges) is more sustainable.

Weak Structural Balance Property: a complete graph is weakly balanced if there is no set of three nodes A, B, C with exactly 2 positive edges and one negative edge among them. What do weakly balanced networks look like?

Any Number of Sets Possible

Weak Structural Balance Characterization. The Weak Balance Theorem: if a complete graph is weakly balanced, then its nodes can be divided into groups such that every two nodes in the same group are friends and every two nodes in different groups are enemies. Proof: pick a node A. Let X be the set containing A and all its friends. Observe that every two nodes in X must be friends with each other, and every node in X must be an enemy of every node outside X. Remove the set X from the graph and continue in the same manner to find additional groups.
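
The proof is constructive, and the peeling procedure it describes can be sketched as follows. The sketch assumes its input really is a weakly balanced complete graph (it does not verify this); the function name, the sign dictionary keyed by `frozenset` pairs and the toy example are hypothetical.

```python
from itertools import combinations

def weak_balance_groups(nodes, sign):
    """Repeatedly peel off a node together with all of its remaining friends."""
    remaining = list(nodes)
    groups = []
    while remaining:
        a = remaining[0]
        group = {a} | {n for n in remaining[1:]
                       if sign[frozenset((a, n))] == "+"}
        groups.append(group)
        remaining = [n for n in remaining if n not in group]
    return groups

# Hypothetical weakly balanced example with three mutually hostile groups.
nodes = ["A", "B", "C", "D", "E"]
camps = [{"A", "B"}, {"C"}, {"D", "E"}]
sign = {
    frozenset((u, v)): "+" if any(u in c and v in c for c in camps) else "-"
    for u, v in combinations(nodes, 2)
}
groups = weak_balance_groups(nodes, sign)
```

With three groups the triangle A, C, D has three “-” edges, which the original balance property forbids but the weak version allows.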

Other Generalizations of Balance. Graphs that are not complete: what if not everyone knows everyone? Graphs that are approximately balanced: what if most triangles are balanced, but not all? (Not discussed; see the book.) Note that our discussions will use the original definition of balance, not the weak version!

General Networks. Every edge is labeled “+” or “-”. There can be nodes A, B such that no edge exists between A and B. How should we define balance?

Local Definition. A graph is structurally balanced if it is possible to fill in the missing edges with labels and get a structurally balanced complete network.

Global Definition. A graph is structurally balanced if it is possible to divide the nodes into 2 sets X, Y such that people in X who know each other are friends, people in Y who know each other are friends, and people in X who know people in Y are enemies. Are the local and global definitions equivalent? YES!

Complexity. How can you prove that a graph is balanced? How can you prove that it is not? What makes a graph unbalanced? [Figure: attempting to place nodes 1 through 5 of a signed cycle into sets X and Y leads to a contradiction (“Oops!”).]

What Makes a Graph Unbalanced? Claim: a graph is unbalanced if and only if it contains a cycle with an odd number of negative edges. Proof: we will try to search for sets X, Y that form a balanced division of the graph. We will show that either we succeed in finding X, Y or we find a cycle with an odd number of negative edges.

Proof: Step 1. Divide the graph into components that must belong in the same set: the connected components of the graph if “-” edges were removed. Call these super-nodes. Only negative edges run between super-nodes.

Proof: Step 1 (cont). If a super-node contains a negative edge between two of its nodes, then there is a cycle with an odd number of “-” edges (exactly one: the negative edge plus a path of “+” edges connecting its endpoints inside the super-node), so the graph is unbalanced. Otherwise, we can assume that there are only “+” edges within each super-node.

Proof: Step 2. Now collapse each super-node into a single node. The derived graph has only “-” edges. Choose a collapsed node arbitrarily and put it (and all the nodes it contains) into X. Put all its neighbors into Y, put all neighbors of neighbors into X, etc. At the end, either we have divided the nodes into X, Y as required, or we have found a cycle with an odd number of “-” edges in the collapsed graph. This cycle can be expanded to a cycle with an odd number of “-” edges in the original graph.
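
The two steps of the proof double as an algorithm for testing balance in a general signed graph. Below is a minimal sketch under stated assumptions: edges are given as `(u, v, sign)` triples, each super-node is identified by one representative node, and the function returns `None` instead of reporting the odd cycle itself. The function name and the toy inputs are hypothetical.

```python
from collections import deque

def balanced_division(nodes, edges):
    """Return sets (X, Y) forming a balanced division, or None if unbalanced."""
    # Step 1: super-nodes are the connected components using only "+" edges.
    plus = {n: set() for n in nodes}
    for u, v, s in edges:
        if s == "+":
            plus[u].add(v)
            plus[v].add(u)
    super_of = {}
    for n in nodes:
        if n in super_of:
            continue
        super_of[n] = n  # n acts as the representative of its super-node
        stack = [n]
        while stack:
            u = stack.pop()
            for v in plus[u]:
                if v not in super_of:
                    super_of[v] = n
                    stack.append(v)
    # Step 2: collapse super-nodes; only "-" edges remain between them.
    minus = {rep: set() for rep in set(super_of.values())}
    for u, v, s in edges:
        if s == "-":
            a, b = super_of[u], super_of[v]
            if a == b:
                return None  # "-" edge inside a super-node: odd cycle
            minus[a].add(b)
            minus[b].add(a)
    # Alternate sides along "-" edges with a breadth-first search.
    side = {}
    for start in minus:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in minus[a]:
                if b not in side:
                    side[b] = 1 - side[a]
                    queue.append(b)
                elif side[b] == side[a]:
                    return None  # odd cycle of "-" edges in the collapsed graph
    x = {n for n in nodes if side[super_of[n]] == 0}
    return x, set(nodes) - x

# Hypothetical 5-cycle with two negative edges (balanced) ...
nodes = [1, 2, 3, 4, 5]
balanced_edges = [(1, 2, "+"), (2, 3, "-"), (3, 4, "+"), (4, 5, "-"), (5, 1, "+")]
division = balanced_division(nodes, balanced_edges)
# ... and the same cycle with a single negative edge (unbalanced).
odd_edges = [(1, 2, "+"), (2, 3, "-"), (3, 4, "+"), (4, 5, "+"), (5, 1, "+")]
no_division = balanced_division(nodes, odd_edges)
```

Which of the two sets is called X depends on where the search starts, so only the division itself is meaningful, not the labels.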

Illustration of Step 2