Selected Topics in Data Networking Explore Social Networks

Selected Topics in Data Networking Explore Social Networks: Cohesion

Introduction to Cohesion
• Social network analysis:
  - Investigating who is related and who is not.
  - Why are some people or organizations related, whereas others are not?
  - People who match on social characteristics will interact more often, and people who interact regularly will foster a common attitude or identity.

Introduction to Cohesion
• Social interaction is the basis for solidarity, shared norms, identity, and collective behavior.
  - People who interact intensively are likely to consider themselves a social group.
  - We expect similar people to interact a lot, at least more often than with dissimilar people.
• This phenomenon is called homophily: love of the same (the tendency of individuals to associate and bond with similar others).
  - Does the homophily principle work?

Introduction to Cohesion
• Study in the Turrialba region, which is a rural area in Costa Rica (Latin America).
• Figure: visual impression of the kin visits network and the family–friendship groupings, which are identified by the colors and numbers within the vertices.

Meaning: Cohesion
• Cohesion means that a social network contains many ties.
  - Community cohesion refers to the aspect of togetherness and bonding exhibited by members of a community, the “glue” that holds a community together.
• More ties between people yield a tighter structure, which is more cohesive.
• The density of a network captures this idea.
Source: https://digestiblepolitics.wordpress.com/2013/01/04/the-importance-of-community-cohesion-in-society-in-2013/

Meaning: Cohesion
• Review
  - Multiple lines between vertices and higher line values indicate more cohesive ties.
• Density = the number of edges divided by the number possible. In an undirected network with n vertices:
  - If self-loops are excluded, then the number possible is n(n-1)/2.
  - If self-loops are allowed, then the number possible is n(n+1)/2.
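
The density definition above can be sketched in a few lines of Python. This is a minimal illustration for an undirected network; the function name `density` and the four-vertex example are assumptions made here, not part of the slides.

```python
def density(num_vertices: int, num_edges: int, allow_loops: bool = False) -> float:
    """Density of an undirected network: edges present / edges possible."""
    n = num_vertices
    # n(n-1)/2 possible edges without self-loops, n(n+1)/2 with self-loops.
    possible = n * (n + 1) // 2 if allow_loops else n * (n - 1) // 2
    if possible == 0:
        return 0.0  # a network with no possible edges has density 0 by convention here
    return num_edges / possible


# Hypothetical example: a triangle plus one pendant vertex (4 vertices, 4 edges).
print(density(4, 4))                    # 4 of 6 possible edges
print(density(4, 4, allow_loops=True))  # 4 of 10 possible edges
```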

Cohesion: Indicated by Density
• In the kin visiting relation network, density is 0.045, which means that only 4.5 percent of all possible arcs are present.
• Density is inversely related to network size:
  - The larger the social network, the lower the density, because the number of possible lines increases rapidly with the number of vertices, whereas the number of ties that each person can maintain is limited.
  - Discussion: Why?
• Network density is therefore not very useful for comparing the cohesion of networks of different sizes.
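
One way to see why density falls with network size is to rewrite it in terms of average degree. The derivation below is a sketch for an undirected network without self-loops; the symbols m (number of lines), n (number of vertices), and \bar{d} (average degree) are notation assumed here.

```latex
\[
\text{density} = \frac{m}{n(n-1)/2},
\qquad
\bar{d} = \frac{2m}{n}
\;\Longrightarrow\;
\text{density} = \frac{\bar{d}}{n-1}.
\]
```

If each person maintains about the same number of ties regardless of network size, \bar{d} stays roughly constant while n - 1 grows, so density shrinks: with \bar{d} = 4, density is about 0.44 for n = 10 but only about 0.04 for n = 100.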

Cohesion: Indicated by Degree
• Degree of a vertex: the number of ties in which the vertex is involved.
• Vertices with high degree are more likely to be found in dense sections of the network.
• Review (undirected graph)
• Cohesion: comparing density and degree

Cohesion: Indicated by Degree
• A higher degree of vertices yields a denser network, because vertices entertain more ties.
• Average degree of all vertices: a measure of the structural cohesion of a network.
  - This is a better measure of overall cohesion than density.
  - It does not depend on network size, so average degree can be compared between networks of different sizes.
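
The contrast between average degree and density can be illustrated with a small sketch. The ring networks and the helper names `average_degree` and `ring` are hypothetical examples chosen here: every vertex in a ring has degree 2, whatever the size of the ring, while density drops as the ring grows.

```python
from collections import Counter


def average_degree(vertices, edges):
    """Mean number of ties per vertex in an undirected network."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree[v] for v in vertices) / len(vertices)


def ring(n):
    """Hypothetical test network: n vertices joined in a single cycle."""
    return list(range(n)), [(i, (i + 1) % n) for i in range(n)]


for n in (5, 50):
    vertices, edges = ring(n)
    dens = len(edges) / (n * (n - 1) / 2)
    # Average degree stays the same; density falls with size.
    print(n, average_degree(vertices, edges), round(dens, 3))
```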

Cohesion: Indicated by Degree
• NOTE
  - Directed graph: the sum of the indegree and the outdegree of a vertex does not necessarily equal the number of its neighbors.
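
A tiny example shows why the note holds: a reciprocated pair of arcs adds two to the degree sum but only one neighbor. The arc list and the function `degree_summary` are illustrative assumptions.

```python
# Hypothetical directed network: a and b choose each other, a also chooses c.
arcs = [("a", "b"), ("b", "a"), ("a", "c")]


def degree_summary(vertex, arcs):
    """Return (indegree, outdegree, number of distinct neighbors) of a vertex."""
    indegree = sum(1 for u, v in arcs if v == vertex)
    outdegree = sum(1 for u, v in arcs if u == vertex)
    neighbors = {v for u, v in arcs if u == vertex} | {u for u, v in arcs if v == vertex}
    neighbors.discard(vertex)  # a self-loop does not make a vertex its own neighbor
    return indegree, outdegree, len(neighbors)


indegree, outdegree, n_neighbors = degree_summary("a", arcs)
print(indegree + outdegree, n_neighbors)  # degree sum 3, but only 2 neighbors
```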

Cohesion: Indicated by Component
• Vertices with a degree of one or higher are connected to at least one neighbor, so they are not isolated.
• If the network is cut up in pieces:
  - Isolated sections of the network may be regarded as cohesive subgroups, because the vertices within a section are connected, whereas vertices in different sections are not.
  - The connected parts of a network are called components.

Cohesion: Indicated by Component
• “Singletons,” who have no connections and are least central.
• The “giant component,” which is the largest group of nodes tightly connected to the central nodes and to each other.
• The “middle region,” which represents isolated groups that interact amongst themselves but not with the rest of the network, forming isolated stars.
Source: http://boxesandarrows.com/socialnetworks-and-group-formation/

Cohesion: Indicated by Component [figure]

Cohesion: Indicated by Component
• A network is weakly connected if each pair of vertices is connected by a semipath.
• In a (weakly) connected network, we can “walk” from each vertex to all other vertices if we neglect the direction of the arcs.
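
Neglecting arc direction can be sketched as a breadth-first search over an undirected copy of the arcs. The function names and the small example are assumptions for illustration, not the procedure of any particular software package.

```python
from collections import defaultdict, deque


def weak_components(vertices, arcs):
    """Group vertices that are linked by semipaths (arc direction ignored)."""
    neighbors = defaultdict(set)
    for u, v in arcs:
        neighbors[u].add(v)
        neighbors[v].add(u)  # treat every arc as a two-way line
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


def is_weakly_connected(vertices, arcs):
    return len(weak_components(vertices, arcs)) == 1


# Hypothetical example: a -> b <- c is weakly connected; adding isolated d breaks that.
arcs = [("a", "b"), ("c", "b")]
print(is_weakly_connected(["a", "b", "c"], arcs))
print(is_weakly_connected(["a", "b", "c", "d"], arcs))
```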

Cohesion: Indicated by Component
• In directed networks, there is a second type of connectedness: a network is strongly connected if each pair of vertices is connected by a path.
  - In a strongly connected network, we can travel from each vertex to any other vertex obeying the direction of the arcs.

Cohesion: Indicated by Component
• Strong connectedness is more restrictive than weak connectedness:
  - Each strongly connected network is also weakly connected, but a weakly connected network is not necessarily strongly connected.
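
The difference between the two kinds of connectedness shows up on a three-vertex chain. The sketch below checks strong connectedness by brute-force reachability and weak connectedness by adding the reverse of every arc; all names and the example arcs are assumptions made here.

```python
from collections import defaultdict


def reachable(start, arcs):
    """All vertices that can be reached from `start` by following arc direction."""
    successors = defaultdict(list)
    for u, v in arcs:
        successors[u].append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in successors[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_strongly_connected(vertices, arcs):
    vertices = set(vertices)
    return all(reachable(v, arcs) >= vertices for v in vertices)


def is_weakly_connected(vertices, arcs):
    # Ignoring direction is the same as allowing travel along every arc both ways.
    both_ways = list(arcs) + [(v, u) for u, v in arcs]
    return is_strongly_connected(vertices, both_ways)


vertices = ["a", "b", "c"]
chain = [("a", "b"), ("b", "c")]   # weakly but not strongly connected
cycle = chain + [("c", "a")]       # closing the cycle makes it strongly connected
print(is_weakly_connected(vertices, chain), is_strongly_connected(vertices, chain))
print(is_weakly_connected(vertices, cycle), is_strongly_connected(vertices, cycle))
```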

Cohesion: Indicated by Component
• Vertices v1, v3, v4, and v5 constitute a (weak) component, because they are connected by semipaths and there is no other vertex in the network that is also connected to them by a semipath.

Cohesion: Indicated by Component
• A (weak) component is a maximal (weakly) connected subnetwork.
• A strong component is a maximal strongly connected subnetwork.

Cohesion: Indicated by Component
• The example network contains three strong components.
  - The largest strong component is composed of vertices v3, v4, and v5, which are connected by paths in both directions.

Cohesion: Indicated by Component
• There are two strong components consisting of one vertex each, namely vertices v1 and v2.
  - Vertex v2 is isolated.
  - There are only paths from vertex v1 but no paths to it, so v1 is not strongly connected to any other vertex.
  - Vertex v1 is asymmetrically linked to the larger strong component.
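
The strong components described above can be reproduced with a brute-force sketch: two vertices share a strong component when each can reach the other. The arc list is a hypothetical network built only to match the description (v2 isolated, v1 sending but not receiving, v3 to v5 on a cycle); the arcs of the original figure are not recoverable from the slides.

```python
from collections import defaultdict


def reachable(start, arcs):
    """All vertices reachable from `start` along arc direction (including itself)."""
    successors = defaultdict(list)
    for u, v in arcs:
        successors[u].append(v)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in successors[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def strong_components(vertices, arcs):
    """Maximal sets of vertices that can all reach one another."""
    reach = {v: reachable(v, arcs) for v in vertices}
    components, assigned = [], set()
    for v in vertices:
        if v in assigned:
            continue
        component = {w for w in reach[v] if v in reach[w]}
        components.append(component)
        assigned |= component
    return components


# Assumed arcs, consistent with the description but not taken from the figure.
vertices = ["v1", "v2", "v3", "v4", "v5"]
arcs = [("v1", "v3"), ("v3", "v4"), ("v4", "v5"), ("v5", "v3")]
print(strong_components(vertices, arcs))
```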

Cohesion: Indicated by Component
• In an undirected network, lines have no direction.
  - Each semiwalk is also a walk and each semipath is also a path.
• Components are isolated from one another: there are no lines between vertices of different components.
  - This is similar to weak components in directed networks.

Cohesion: Indicated by Component
• Components can be split up further into denser parts by considering the number of distinct, that is, noncrossing, paths or semipaths that connect the vertices.
  - Within a weak component, one semipath between each pair of vertices suffices, but there must be at least two different semipaths in a bi-component.
  - k-connected components: maximal subnetworks in which each pair of vertices is connected by at least k distinct semipaths or paths.
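
For k = 2, one hedged way to test the idea is through cut-vertices: a connected undirected network with at least three vertices has two noncrossing semipaths between every pair exactly when removing any single vertex leaves it connected. The brute-force sketch below assumes a connected input; the names and the two-triangle example are illustrative.

```python
def is_connected(vertices, edges):
    """True if the vertices form one connected piece (edges to other vertices ignored)."""
    vertices = set(vertices)
    if not vertices:
        return True
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            neighbors[u].add(v)
            neighbors[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in neighbors[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices


def cut_vertices(vertices, edges):
    """Vertices whose removal disconnects a connected network."""
    vertices = set(vertices)
    return {v for v in vertices if not is_connected(vertices - {v}, edges)}


# Hypothetical example: two triangles that share vertex c.
triangle = [("a", "b"), ("b", "c"), ("c", "a")]
bowtie = triangle + [("c", "d"), ("d", "e"), ("e", "c")]
print(cut_vertices("abc", triangle))   # no cut-vertex: the triangle is a bi-component
print(cut_vertices("abcde", bowtie))   # c separates the two triangles
```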

Cohesion: Indicated by Core
• The distribution of degree reveals local concentrations of ties around individual vertices, but it does not tell us whether vertices with a high degree are clustered or scattered all over the network.
• Degree can be used to identify clusters of vertices that are tightly connected, because each vertex has a particular minimum degree within the cluster.

Cohesion: Indicated by Core
• We pay attention not to the degree of one vertex but to the degree of all vertices within a cluster.
• These clusters are called k-cores.
  - k indicates the minimum degree of each vertex within the core.
  - Example: a 2-core contains all vertices that are connected by degree two or more to other vertices within the core.

Cohesion: Indicated by Core
• A k-core identifies relatively dense subnetworks, so k-cores help to find cohesive subgroups.

Cohesion: Indicated by Core
• Undirected network:
  - The degree of a vertex is equal to the number of its neighbors.
  - A k-core contains the vertices that have at least k neighbors within the core.
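
A common way to find a k-core in an undirected network is to keep deleting vertices that have fewer than k neighbors among the vertices that remain. The sketch below follows that pruning idea; the function name and the ten-vertex example (two groups of four mutually connected vertices, a bridging vertex v6, and a pendant vertex v5) are assumptions made for illustration.

```python
from itertools import combinations


def k_core(vertices, edges, k):
    """Vertices that each keep at least k neighbors inside the returned set."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    core = set(vertices)
    changed = True
    while changed:  # repeat until no vertex falls below the threshold
        changed = False
        for v in list(core):
            if len(neighbors[v] & core) < k:
                core.discard(v)
                changed = True
    return core


# Hypothetical network, not the one drawn on the slides.
side_a = ["a1", "a2", "a3", "a4"]
side_b = ["b1", "b2", "b3", "b4"]
vertices = side_a + side_b + ["v5", "v6"]
edges = (list(combinations(side_a, 2)) + list(combinations(side_b, 2))
         + [("v6", "a1"), ("v6", "b1"), ("v5", "a2")])

for k in (1, 2, 3):
    print(k, sorted(k_core(vertices, edges, k)))
```

In this example the cores are nested: v5 drops out of the 2-core, and v6 drops out of the 3-core.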

Cohesion: Indicated by Core
• All vertices belong to the 1-core, which is drawn in black.
• One vertex, v5, has only one neighbor, so it is not part of the 2-core.
• Vertex v6 has a degree of 2, so it does not belong to the 3-core.
  - k-cores are nested: a vertex in a 3-core is also part of a 2-core, but not all members of a 2-core belong to a 3-core.

Cohesion: Indicated by Core
• Different cohesive subgroups within a k-core are usually connected by vertices that belong to lower cores.
  - Vertex v6, which is part of the 2-core, connects the two segments of the 3-core.
  - Eliminating the vertices belonging to cores below the 3-core yields a network consisting of two components, which identify the cohesive subgroups within the 3-core.

Cohesion: Indicated by Core
• How do k-cores help to detect cohesive subgroups?
  - Remove the lowest k-cores from the network until the network breaks up into relatively dense components.
  - Each component is considered to be a cohesive subgroup, because its vertices have at least k neighbors within the component.
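
The procedure can be sketched as two steps: prune to the k-core, then split what is left into components. The helper names and the example network (two groups of four mutually connected vertices bridged by v6, plus a pendant v5) are hypothetical and chosen only to mirror the situation described on the slides.

```python
from itertools import combinations


def k_core(vertices, edges, k):
    """Vertices that each keep at least k neighbors inside the returned set."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    core = set(vertices)
    changed = True
    while changed:
        changed = False
        for v in list(core):
            if len(neighbors[v] & core) < k:
                core.discard(v)
                changed = True
    return core


def components(vertices, edges):
    """Connected pieces of the subnetwork induced by `vertices`."""
    vertices = set(vertices)
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            neighbors[u].add(v)
            neighbors[v].add(u)
    seen, result = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        group, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    group.add(w)
                    stack.append(w)
        result.append(group)
    return result


def cohesive_subgroups(vertices, edges, k):
    """Components that remain after all vertices below the k-core are removed."""
    return components(k_core(vertices, edges, k), edges)


side_a = ["a1", "a2", "a3", "a4"]
side_b = ["b1", "b2", "b3", "b4"]
vertices = side_a + side_b + ["v5", "v6"]
edges = (list(combinations(side_a, 2)) + list(combinations(side_b, 2))
         + [("v6", "a1"), ("v6", "b1"), ("v5", "a2")])

print(len(cohesive_subgroups(vertices, edges, 2)))  # still one piece, held together by v6
print(cohesive_subgroups(vertices, edges, 3))       # two separate subgroups
```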

Selected Topics in Data Networking Explore Social Networks: Prestige and Ranking

Introduction
• Prestige is conceptualized as a particular pattern of social ties.
• In directed networks, people who receive many positive choices are considered to be prestigious.
  - Example: everybody likes to play with the most popular girl or boy in a group, but he or she does not play with all of them.
Source: http://pursuitist.com/lady-gaga-rules-twitter/

Introduction
• Popularity and indegree: prestige
  - When ties are associated with some positive aspect such as friendship or collaboration, indegree is often interpreted as a form of popularity and outdegree as gregariousness.
  - A prestigious art museum receives more attention from art critics than less prestigious ones.
Source: http://en.wikipedia.org/wiki/Centrality

Introduction
• The simplest measure of structural prestige is called popularity, and it is measured by the number of choices a vertex receives: its indegree.
• In undirected networks, we cannot measure prestige; instead, we use degree as a simple measure of centrality.
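
Popularity as indegree amounts to counting how often each vertex appears as the receiver of an arc. The names in the example choice list are made up for illustration.

```python
from collections import Counter


def popularity(vertices, arcs):
    """Indegree of every vertex: the number of choices it receives."""
    indegree = Counter(receiver for _, receiver in arcs)
    return {v: indegree[v] for v in vertices}


# Hypothetical "likes to play with" choices (chooser, chosen).
vertices = ["ann", "bob", "cat", "eva"]
choices = [("ann", "eva"), ("bob", "eva"), ("cat", "eva"), ("eva", "ann")]
print(popularity(vertices, choices))
```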

Introduction
• Where prestige shows in the choices a vertex makes rather than receives (for example, a relation such as “lend money to”), indegree does reflect prestige if we transpose the arcs, that is, if we reverse the direction of the arcs.
  - Several structural properties of a network do not change when the arcs are transposed.
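
Transposing a network only swaps the two ends of every arc, so indegree in the transposed network equals outdegree in the original. The sketch uses a made-up “lend money to” relation; all names are assumptions.

```python
def transpose(arcs):
    """Reverse the direction of every arc."""
    return [(v, u) for u, v in arcs]


def indegree(vertex, arcs):
    return sum(1 for _, v in arcs if v == vertex)


def outdegree(vertex, arcs):
    return sum(1 for u, _ in arcs if u == vertex)


# Hypothetical arcs (lender, borrower).
lends = [("bank", "ann"), ("bank", "bob"), ("ann", "bob")]
borrows_from = transpose(lends)

print(outdegree("bank", lends), indegree("bank", borrows_from))  # the same number
print(len(lends) == len(borrows_from))  # the number of arcs is unchanged
```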

Correlation
• Structural prestige scores: correlation coefficients range from -1 to 1.
  - A positive coefficient indicates that a high score on one feature is associated with a high score on the other (e.g., high structural prestige occurs in families with high social status).
  - A negative coefficient points toward a negative or inverse relation: a high score on one characteristic combines with a low score on the other (e.g., high structural prestige is found predominantly with low social status families).

Correlation
• There is no correlation if the absolute value of the coefficient is less than 0.05.
  - Coefficients from 0.05 to 0.25 (and from -0.05 to -0.25) indicate weak association, positive or negative.
  - Coefficients from 0.25 to 0.60 (and from -0.25 to -0.60) indicate moderate association, positive or negative.
  - Coefficients from 0.60 to 1.00 (and from -0.60 to -1.00) are interpreted as strong association, positive or negative.
  - A coefficient of 1 or -1 is said to display perfect association, positive or negative.
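
The rules of thumb above can be written as a small lookup. The slides do not say which category a value exactly on a boundary belongs to, so the sketch assumes the lower bound of each range is inclusive; the function name and labels are also assumptions.

```python
def association_strength(r: float) -> str:
    """Label a correlation coefficient using the thresholds 0.05, 0.25, 0.60, and 1."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)
    if size < 0.05:
        return "none"
    if size == 1.0:
        label = "perfect"
    elif size >= 0.60:   # assumption: boundary values fall in the higher category
        label = "strong"
    elif size >= 0.25:
        label = "moderate"
    else:
        label = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


for r in (0.03, 0.2, -0.4, 0.75, -1.0):
    print(r, association_strength(r))
```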

Domains
• Popularity is a very restricted measure of prestige because it takes only direct choices into account.
• Indirect choices can be included as well: the input domain of an actor consists of all other vertices that are connected to it by a path. It has been called the influence domain because structurally prestigious people are thought to influence people who regard them as their leaders.
• The larger the input domain of a person, the higher his or her structural prestige.
• The output domain is more likely to reflect prestige in the case of a relation such as “lend money to”.
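
An input domain can be collected by walking the arcs backwards from a vertex. The sketch below does this with a breadth-first search; the function names and the four-vertex example are illustrative assumptions.

```python
from collections import defaultdict, deque


def input_domain(vertex, arcs):
    """All other vertices from which `vertex` can be reached by a path."""
    predecessors = defaultdict(set)
    for u, v in arcs:
        predecessors[v].add(u)
    seen, queue = {vertex}, deque([vertex])
    while queue:
        x = queue.popleft()
        for p in predecessors[x]:
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return seen - {vertex}


def output_domain(vertex, arcs):
    """All other vertices that `vertex` can reach: the input domain after transposing."""
    return input_domain(vertex, [(v, u) for u, v in arcs])


# Hypothetical choices: c chooses b, b chooses a, d chooses a.
arcs = [("c", "b"), ("b", "a"), ("d", "a")]
print(sorted(input_domain("a", arcs)))   # includes c, who chooses a only indirectly
print(sorted(output_domain("c", arcs)))
```

Here a has an indegree of 2 but an input domain of three vertices, because c's indirect choice is counted as well.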

Proximity Prestige
• Limit the input domain to direct neighbors or to neighbors at maximum distance two, on the assumption that nominations by close neighbors are more important than nominations by distant neighbors.
• An indirect choice contributes less to prestige if it is mediated by a longer chain of intermediaries.

Proximity Prestige
• Proximity prestige: this index of prestige considers all vertices within the input domain of a vertex, but it attaches more importance to a nomination if it is expressed by a closer neighbor.
• A nomination by a close neighbor contributes more to the proximity prestige of an actor than a nomination by a distant neighbor, but many “distant nominations” may contribute as much as one “close nomination.”

Proximity Prestige
• To allow direct choices to contribute more to the prestige of a vertex than indirect choices, proximity prestige weights each choice by its path distance to the vertex.
• A higher distance yields a lower contribution to the proximity prestige of a vertex, but each choice contributes something.

Proximity Prestige
• Proximity prestige is the proportion of all other vertices that are in the input domain of a vertex, divided by the mean distance from those vertices.
• A larger input domain (larger numerator) yields a higher proximity prestige because more vertices are choosing an actor directly or indirectly.
• A smaller average distance (smaller denominator) yields a higher proximity prestige score because there are more nominations by close neighbors.
• Maximum proximity prestige is achieved if a vertex is directly chosen by all other vertices.
  - The proportion of vertices in the input domain is 1 and the mean distance from these vertices is 1, so proximity prestige is 1 divided by 1.
• Vertices without input domain get minimum proximity prestige by definition, which is zero.
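
Written out as a formula, the numerator and denominator described above look as follows. The notation is assumed here: I_v is the input domain of vertex v, n is the number of vertices in the network, and d(u, v) is the path distance from u to v.

```latex
\[
PP(v) \;=\;
\frac{\;|I_v| \,/\, (n-1)\;}
     {\;\sum_{u \in I_v} d(u,v) \,/\, |I_v|\;},
\qquad
PP(v) = 0 \ \text{if } I_v = \emptyset .
\]
```

When every other vertex chooses v directly, |I_v| = n - 1 and every distance is 1, so both the numerator and the denominator equal 1 and PP(v) = 1.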

Proximity Prestige
• All vertices at the extremes of the network (v2, v4, v5, v6, and v10) have empty input domains, hence they have a proximity prestige score of zero.
• The input domain of vertex v9 contains vertex v10 only, so its input domain size is 1 out of 9 (0.11).
  - Average distance within the input domain of vertex v9 is 1, so the proximity prestige of vertex v9 is 0.11 divided by 1 = 0.11.
• Vertex v1 has a maximal input domain (9 out of 9 = 1).
  - Average distance is 2.0, so proximity prestige amounts to 1.00 divided by 2.0, which is 0.5.
  - Avg. dist. = (4+3+2+1+1+1+2+2+2)/9 = 2
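
The worked figures above can be checked with a short sketch. The arc list is a hypothetical ten-vertex network constructed here so that it matches the numbers quoted on the slide (the same empty input domains, the same distances to v1); it is not necessarily the network in the original figure.

```python
from collections import defaultdict, deque


def distances_to(vertex, arcs):
    """Shortest path distance to `vertex` from every vertex in its input domain."""
    predecessors = defaultdict(set)
    for u, v in arcs:
        predecessors[v].add(u)
    dist = {vertex: 0}
    queue = deque([vertex])
    while queue:
        x = queue.popleft()
        for p in predecessors[x]:
            if p not in dist:
                dist[p] = dist[x] + 1
                queue.append(p)
    del dist[vertex]  # the vertex itself is not part of its own input domain
    return dist


def proximity_prestige(vertex, vertices, arcs):
    """Proportion of other vertices in the input domain divided by their mean distance."""
    dist = distances_to(vertex, arcs)
    if not dist:
        return 0.0  # empty input domain: zero by definition
    proportion = len(dist) / (len(vertices) - 1)
    mean_distance = sum(dist.values()) / len(dist)
    return proportion / mean_distance


# Assumed arcs (chooser, chosen), built to reproduce the distances 4+3+2+1+1+1+2+2+2.
vertices = [f"v{i}" for i in range(1, 11)]
arcs = [("v2", "v1"), ("v3", "v1"), ("v7", "v1"),
        ("v4", "v3"), ("v5", "v3"), ("v6", "v3"),
        ("v8", "v7"), ("v9", "v8"), ("v10", "v9")]

for v in ("v1", "v9", "v2"):
    print(v, round(proximity_prestige(v, vertices, arcs), 2))
```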

References
• Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj, Exploratory Social Network Analysis with Pajek, Cambridge.