Social Network Analysis


What is Network Analysis?
• Social network analysis is a method for analyzing the connections among individuals, groups, or institutions. That is, it allows us to examine how political actors or institutions are interrelated.
• It focuses on interaction, not on individuals.
• It examines how the configuration of networks influences how individuals, groups, organizations, or systems function.

What is Network Analysis?
• Applied across disciplines:
  • social networks
  • political networks
  • electrical networks
  • transportation networks, and so on

History of (Social) Network Analysis
• Early research in network analysis is found in educational psychology and studies of child development.
• 1922: Almack asked children in a California elementary school to identify the classmates they wanted as playmates.
• He correlated IQs and examined the hypothesis that choices were homophilous.

More Early History
• 1926, Wellman: recorded pairs of individuals who were observed together frequently.
• Also recorded data on each student, including height, grades, IQ, score on a physical coordination test, and degree of introversion versus extraversion (based on teachers’ ratings).
• Examined whether interaction was homophilous.

More Early History
• In 1933, the New York Times reported on the new science of “psychological geography”, which “aims to chart the emotional currents, cross-currents and under-currents of human relationships in a community”.
• Jacob Moreno analyzed the interconnections among 500 girls in the State Training School for Girls, and the interconnections of students within two NYC schools.
• He found that many relationships were non-reciprocal and that many individuals were isolated.

Other Advances
• 1935, Theodore Newcomb: as Bennington College women were exposed to the relatively liberal students and faculty, they became more liberal.
• 1950, Festinger: influence of dorm room location on friendships.
• Developments in the last few decades include much attention paid to several concepts, including “the strength of weak ties” and “small worlds”.

Weak Ties
• Strong ties are edges between two people who have common friends.
• Other edges are called weak ties.
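The definition above can be sketched in a few lines. This is only an illustration, assuming an undirected friendship graph stored as a dictionary of neighbor sets; the example network and the function name `classify_ties` are made up here and do not come from the slides.

```python
def classify_ties(adj):
    """Split undirected edges into strong ties (endpoints share a friend) and weak ties."""
    strong, weak = set(), set()
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                if adj[u] & adj[v]:  # at least one common neighbor
                    strong.add((u, v))
                else:
                    weak.add((u, v))
    return strong, weak

# Hypothetical network: two triangles joined by the single edge C-D.
adj = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
strong, weak = classify_ties(adj)
print(sorted(weak))  # the bridge between the two triangles is the only weak tie
```

In this made-up network the only weak tie is the edge that connects the two groups, which is the kind of edge the next slide argues is valuable for spreading information.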

The Strength of Weak Ties
• Granovetter’s “The Strength of Weak Ties” argued that “weak ties” could actually be more advantageous than strong ties.
• The presence of weak ties often reduces path lengths (distances) between any two individuals, which leads to quicker diffusion of information.

Small world phenomenon: Milgram’s experiment
[Figure with labels MA and NE. Source: undetermined]

Small world phenomenon: Milgram’s experiment
Instructions: Given a target individual (a stockbroker in Boston), pass the message to a person you correspond with who is “closest” to the target.
Outcome: 20% of initiated chains reached the target; average chain length = 6.5.
• “Six degrees of separation”

Why is Network Analysis Useful? (Some Examples)
• Bringing a new product to market: we want to notify k people and have the information disseminate. Which k people?
• A company wants to downsize. Who should not be fired?
• Friend recommendation.
• John is looking for a dentist. Who should be recommended?
• We want to put together a committee to launch a conference. Whom should we choose?

These slides are adapted from slides given in a University of Michigan course on network analysis by Lada Adamic.
Unless otherwise noted, the content of this course material is licensed under a Creative Commons Attribution 3.0 License. http://creativecommons.org/licenses/by/3.0/
Copyright 2008, Lada Adamic
You assume all responsibility for use and potential liability associated with any use of the material. Material contains copyrighted content, used in accordance with U.S. law. Copyright holders of content included in this material should contact open.michigan@umich.edu with any questions, corrections, or clarifications regarding the use of content. The Regents of the University of Michigan do not license the use of third party content posted to this site unless such a license is specifically granted in connection with particular content objects. Users of content are responsible for their compliance with applicable law. Mention of specific products in this recording solely represents the opinion of the speaker and does not represent an endorsement by the University of Michigan. For more information about how to cite these materials visit http://michigan.educommons.net/about/terms-of-use.

Network basics

What are networks?
• Networks are collections of points joined by lines. “Network” ≡ “Graph”

  points      lines              field
  vertices    edges, arcs        math
  nodes       links              computer science
  sites       bonds              physics
  actors      ties, relations    sociology

[Figure labels: node, edge]

Network elements: edges
• Directed: A → B
  • A likes B, A gave a gift to B, etc.
• Undirected: A ↔ B or A – B
  • A and B like each other, are siblings, are coauthors, etc.
• Edge attributes
  • weight (e.g. frequency of communication)
  • ranking (best friend, second best friend…)
  • type (friend, relative, co-worker)

Edge weights can have positive or negative values
• One gene activates/inhibits another
• One person trusting/distrusting another (how to propagate?)
Source: undetermined

Characterizing networks: Who is most central?

Characterizing networks: Is everything connected?

Network metrics: size of giant component
• If the largest component encompasses a significant fraction of the graph, it is called the giant component.
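A rough way to check for a giant component is to find all connected components and see what share of the nodes the largest one holds. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the example graph and the helper names are hypothetical, and what counts as a "significant fraction" is left to the reader.

```python
from collections import deque

def connected_components(adj):
    """Return the connected components of an undirected graph as a list of node sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components

def largest_component_fraction(adj):
    """Fraction of all nodes that lie in the largest connected component."""
    largest = max(connected_components(adj), key=len)
    return len(largest) / len(adj)

# Hypothetical graph: one component of 5 nodes, one pair, and one isolated node.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4},
    6: {7}, 7: {6},
    8: set(),
}
print(largest_component_fraction(adj))  # 5 of the 8 nodes
```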

Characterizing networks: How far apart are things?

Network metrics: shortest paths
• Shortest path (also called a geodesic path)
  • The shortest sequence of links connecting two nodes
  • Not always unique, e.g. between A and C:
    • A – E – B – C
    • A – E – D – C
• Diameter: the largest geodesic distance in the graph
  • What is the diameter of this graph?
[Figure: graph on nodes A, B, C, D, E]
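Geodesic distances in an unweighted graph can be found with breadth-first search, and the diameter is the largest of them. The sketch below is an illustration only: the slide's figure is not available, so the edge list is an assumption containing just the edges implied by the two geodesics A – E – B – C and A – E – D – C, and the real figure may contain more.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest geodesic distance over all pairs; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

# Assumed edges: only those implied by the geodesics A-E-B-C and A-E-D-C.
edges = [("A", "E"), ("E", "B"), ("B", "C"), ("E", "D"), ("D", "C")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(bfs_distances(adj, "A")["C"])  # geodesic distance from A to C
print(diameter(adj))
```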

Characterizing networks: How dense are they?

Network metrics: graph density
• How many possible edges? (n = number of nodes, e = number of edges)
  • Directed graph: $e_{max} = n(n-1)$
  • Undirected graph: $e_{max} = n(n-1)/2$
• Density = $e / e_{max}$
• What is the density of this graph?
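The density formulas above translate directly into code. This is a minimal sketch for simple graphs without self-loops; the function name `density` and the example counts are made up for illustration, and the function is undefined for fewer than two nodes.

```python
def density(num_nodes, num_edges, directed=False):
    """Graph density e / e_max for a simple graph without self-loops."""
    e_max = num_nodes * (num_nodes - 1)  # ordered pairs of distinct nodes
    if not directed:
        e_max //= 2  # unordered pairs
    return num_edges / e_max

print(density(5, 5))                 # undirected: 5 / 10 = 0.5
print(density(5, 5, directed=True))  # directed:   5 / 20 = 0.25
```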

Network Centrality

Node Centrality
• Which nodes are most important (central)?
• Is there one ultimate answer?
  • Depends on context
• Measuring centrality
  • Local measure: degree
  • Relative to rest of network
• How evenly is centrality distributed among nodes?

Centrality: who’s important based on their network position
• In each of the following networks, X has higher centrality than Y according to a particular measure:
  • indegree
  • outdegree
  • betweenness
  • closeness

Degree centrality (undirected)
He who has many friends is most important.
When is the number of connections a good centrality measure?
• people who will do favors for you
• people you can talk to / have a beer with

Degree: normalized degree centrality
• Divide by the maximum possible degree, i.e. (N-1)
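As a small illustration of this normalization, the sketch below divides each node's degree by N-1. It assumes an undirected graph stored as a dictionary of neighbor sets; the star network and the function name `degree_centrality` are made up for the example.

```python
def degree_centrality(adj, normalized=True):
    """Degree of each node, optionally divided by the maximum possible degree N - 1."""
    n = len(adj)
    scale = (n - 1) if normalized else 1
    return {node: len(neighbors) / scale for node, neighbors in adj.items()}

# Hypothetical star: hub H connected to four leaves.
star = {"H": {"a", "b", "c", "d"}, "a": {"H"}, "b": {"H"}, "c": {"H"}, "d": {"H"}}
print(degree_centrality(star))
```

The hub is connected to every other node, so its normalized degree is 1.0, while each leaf scores 1/4.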

Centralization: how equal are the nodes?
How much variation is there in the centrality scores among the nodes?
Freeman’s general formula for centralization:
$$C_D = \frac{\sum_{i=1}^{N} \left[ C_D(n^*) - C_D(i) \right]}{(N-1)(N-2)}$$
where $C_D(n^*)$ is the maximum value in the network.

Degree centralization examples
[Figures: example networks with $C_D = 0.167$, $C_D = 1.0$, and $C_D = 0.167$]
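The example values can be reproduced with a short calculation. The sketch below assumes Freeman's degree centralization with raw degrees, i.e. the sum over nodes of (maximum degree minus node degree) divided by (N-1)(N-2). The slide's figures are not available, so the two graphs here (a 5-node star and a 5-node line) are assumptions chosen because they yield the values 1.0 and 0.167 quoted above.

```python
def degree_centralization(adj):
    """Freeman degree centralization with raw degrees: sum(max - d_i) / ((N-1)(N-2))."""
    n = len(adj)
    degrees = [len(neighbors) for neighbors in adj.values()]
    d_max = max(degrees)
    return sum(d_max - d for d in degrees) / ((n - 1) * (n - 2))

def build(edges):
    """Build an undirected adjacency dictionary from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Assumed example graphs (not taken from the original figures).
star = build([("H", leaf) for leaf in "abcd"])
line = build([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])

print(round(degree_centralization(star), 3))  # 12 / 12
print(round(degree_centralization(line), 3))  # 2 / 12
```

A star is as unequal as a network of its size can be, so it scores 1.0; in the line only the two end nodes fall short of the maximum degree, so the score is low.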

When degree isn’t everything
In what ways does degree fail to capture centrality in the following graphs?

In what contexts may degree be insufficient to describe centrality?
• ability to broker between groups
• likelihood that information originating anywhere in the network reaches you…

Betweenness: another centrality measure
• Intuition: how many pairs of individuals would have to go through you in order to reach one another in the minimum number of hops?
• Who has higher betweenness, X or Y?
[Figure: network with nodes X and Y marked]

Betweenness centrality: definition
$$C_B(i) = \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}$$
where $g_{jk}$ = the number of geodesics connecting j and k, and $g_{jk}(i)$ = the number of them that node i is on.
Usually normalized by the number of pairs of vertices excluding the vertex itself:
$$C'_B(i) = \frac{C_B(i)}{(N-1)(N-2)/2}$$
(adapted from a slide by James Moody)

Example
Lada’s Facebook network: nodes are sized by degree and colored by betweenness.

Betweenness example (continued)
Can you spot nodes with high betweenness but relatively low degree? Explain how this might arise.
What about high degree but relatively low betweenness?

Betweenness on toy networks
• Non-normalized version, on the path A – B – C – D – E:
  • A lies between no two other vertices
  • B lies between A and 3 other vertices: C, D, and E
  • C lies between 4 pairs of vertices: (A, D), (A, E), (B, D), (B, E)
  • Note that there are no alternate paths for these pairs to take, so C gets full credit
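The counts above can be checked with a brute-force calculation. This is a sketch rather than an efficient algorithm: it assumes a connected, undirected, unweighted graph, reads the toy network as the path A, B, C, D, E, and gives node i a credit of (number of shortest j-k paths through i) / (number of shortest j-k paths) for every pair j, k. The helper names are made up for illustration.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Distances from source and the number of shortest paths (sigma) to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Non-normalized betweenness for a connected, undirected, unweighted graph."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {i: 0.0 for i in adj}
    for j, k in combinations(adj, 2):  # each unordered pair once
        dist_j, sigma_j = info[j]
        for i in adj:
            if i in (j, k):
                continue
            dist_i, sigma_i = info[i]
            if dist_j[k] == dist_j[i] + dist_i[k]:  # i is on a shortest j-k path
                score[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return score

# The path A - B - C - D - E.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
print(betweenness(path))
```

On the path this gives 0 for A and E, 3 for B and D, and 4 for C, matching the counts listed on the slide.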


Betweenness on toy networks
• Non-normalized version [figure: network on nodes A, B, C, D, E]:
  • Why do C and D each have betweenness 1?
    • They are both on shortest paths for pairs (A, E) and (B, E), and so must share credit: ½ + ½ = 1
  • Can you figure out why E has betweenness 0.5?
  • What is the betweenness of B?

Closeness: another centrality measure
• What if it’s not so important to have many direct friends?
• Or to be “between” others?
• But one still wants to be in the “middle” of things, not too far from the center.

Closeness centrality: definition
Closeness is based on the length of the average shortest path between a vertex and all other vertices in the graph.
Closeness centrality:
$$C_C(i) = \left[ \sum_{j=1}^{N} d(i,j) \right]^{-1}$$
Normalized closeness centrality:
$$C'_C(i) = C_C(i) \cdot (N-1)$$
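As an illustration, the sketch below computes closeness as the inverse of the summed shortest-path distances from a node, and the normalized version as that value times N-1 (equivalently, the inverse of the average distance). That reading of the definition is an assumption, since the slide's formulas are not available. The example uses a 5-node path and assumes a connected, undirected, unweighted graph.

```python
from collections import deque

def closeness(adj, node, normalized=True):
    """Inverse of the summed shortest-path distances from node (connected graph assumed)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    raw = 1 / sum(dist.values())
    return raw * (len(adj) - 1) if normalized else raw

# Hypothetical example: the path A - B - C - D - E.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
for node in path:
    print(node, round(closeness(path, node), 3))
```

The middle node C has the smallest total distance to the others (1 + 1 + 2 + 2 = 6), so it has the highest closeness, 4/6; the end nodes have the lowest, 4/10.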

Closeness centrality: toy example
[Figure: network on nodes A, B, C, D, E]

Closeness centrality: more toy examples

Complexity
• Degree: degree of node / (N-1)
• Betweenness
• Closeness

Efficiently Computing Betweenness
• How do you compute $g_{jk}(i)$?
• LEMMA 1 (Bellman criterion): A vertex i lies on a shortest path between vertices j and k if and only if
  $$d(j,k) = d(j,i) + d(i,k)$$
• Therefore:
  $$g_{jk}(i) = \begin{cases} 0 & \text{if } d(j,k) < d(j,i) + d(i,k) \\ g_{ji} \cdot g_{ik} & \text{otherwise} \end{cases}$$
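The lemma can be turned into a short check. The sketch below assumes a connected, undirected, unweighted graph and uses breadth-first search to obtain both distances and shortest-path counts. The example graph (a tail A – B attached to the cycle B, C, E, D) and the function names are made up for illustration.

```python
from collections import deque

def bfs_counts(adj, source):
    """Distances from source and the number of shortest paths (sigma) to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def geodesics_through(adj, j, k, i):
    """g_jk(i): number of shortest j-k paths that pass through i."""
    dist_j, sigma_j = bfs_counts(adj, j)
    dist_i, sigma_i = bfs_counts(adj, i)
    if dist_j[k] < dist_j[i] + dist_i[k]:
        return 0  # Bellman criterion fails: i is on no shortest j-k path
    return sigma_j[i] * sigma_i[k]  # g_ji * g_ik

# Hypothetical graph: A - B, then two routes from B to E (via C or via D).
adj = {
    "A": {"B"}, "B": {"A", "C", "D"},
    "C": {"B", "E"}, "D": {"B", "E"}, "E": {"C", "D"},
}
print(geodesics_through(adj, "A", "E", "C"))  # one of the two A-E geodesics uses C
print(geodesics_through(adj, "A", "E", "B"))  # both A-E geodesics use B
print(geodesics_through(adj, "C", "D", "A"))  # A is on no C-D geodesic
```

Running a full breadth-first search for every query is wasteful; in practice the distances and counts would be computed once per source and reused.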

Efficiently Computing Betweenness
• How do you compute $g_{jk}$?
• Define $P_j(i)$ to be the set of all predecessors of i on a shortest path from j
• Then $g_{jk} = \sum_{i \in P_j(k)} g_{ji}$
• We can use BFS to count paths
  • Left as an exercise
• We do something like this later when we discuss betweenness of edges
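One possible way to approach the exercise is sketched below. It assumes an undirected, unweighted graph stored as a dictionary of neighbor sets; the example graph and the function name `count_shortest_paths` are made up for illustration. During breadth-first search from j, every time a neighbor v is found exactly one step further than u, u is recorded as a predecessor of v and its path count is added to v's.

```python
from collections import deque

def count_shortest_paths(adj, j):
    """Return (dist, preds, g) for source j; g[k] is the number of shortest j-k paths."""
    dist, g = {j: 0}, {j: 1}
    preds = {j: []}
    queue = deque([j])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                preds[v], g[v] = [], 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u is in P_j(v)
                preds[v].append(u)
                g[v] += g[u]  # g_jv = sum of g_ju over u in P_j(v)
    return dist, preds, g

# Hypothetical graph: A - B, then two routes from B to E (via C or via D).
adj = {
    "A": {"B"}, "B": {"A", "C", "D"},
    "C": {"B", "E"}, "D": {"B", "E"}, "E": {"C", "D"},
}
dist, preds, g = count_shortest_paths(adj, "A")
print(sorted(preds["E"]), g["E"])
```

Because breadth-first search removes nodes from the queue in order of distance, the count for u is already final by the time it is added to any of its successors.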