SimRank: A Measure of Structural-Context Similarity

Advisor: Dr. Hsu
Graduate: Sheng-Hsuan Wang
Authors: Glen Jeh, Jennifer Widom

Outline
• Motivation
• Objective
• Introduction
• Basic Graph Model
• SimRank
• Random Surfer-Pairs Model
• Future Work
• Personal Opinion

Motivation
• The problem of measuring the “similarity” of objects arises in many applications.

Objective
• An approach applicable in any domain with object-to-object relationships.
• Two objects are similar if they are related to similar objects.

Introduction

Basic Graph Model
• We model objects and relationships as a directed graph G = (V, E).
• For a node v in the graph, we denote by I(v) and O(v) the sets of in-neighbors and out-neighbors of v, respectively.
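The graph model above can be sketched in a few lines of Python. This is only an illustration: the function name `build_neighbors`, the variable names, and the example edge list are assumptions made here, not part of the slides.

```python
def build_neighbors(edges):
    """Return (in_nbrs, out_nbrs) for a directed graph given as (p, q) edges.

    in_nbrs[v] plays the role of I(v) and out_nbrs[v] the role of O(v).
    """
    in_nbrs, out_nbrs = {}, {}
    for p, q in edges:  # each pair is a directed edge p -> q
        for v in (p, q):
            # Make sure every node has an entry, even if one set stays empty.
            in_nbrs.setdefault(v, set())
            out_nbrs.setdefault(v, set())
        out_nbrs[p].add(q)
        in_nbrs[q].add(p)
    return in_nbrs, out_nbrs


# Hypothetical example graph; the node names are illustrative only.
EDGES = [("u", "a"), ("u", "b"), ("a", "c"), ("b", "c"), ("c", "u")]
in_nbrs, out_nbrs = build_neighbors(EDGES)
```

Storing both maps up front keeps later lookups of I(v) and O(v) constant-time, at the cost of holding each edge twice.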

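SimRank
• Basic SimRank equation
• If a = b, then s(a, b) is defined to be 1. Otherwise,
  $$ s(a, b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} s(I_i(a), I_j(b)) \tag{1} $$
• Here C is a constant between 0 and 1, and I_i(a) denotes the i-th in-neighbor of a.
• Set s(a, b) = 0 when I(a) = ∅ or I(b) = ∅.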

SimRank
• Bipartite SimRank
• Two types of objects.
• Example: shopping graph G.

SimRank

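SimRank
• Let s(A, B) denote the similarity between persons A and B. For A ≠ B,
  $$ s(A, B) = \frac{C_1}{|O(A)|\,|O(B)|} \sum_{i=1}^{|O(A)|} \sum_{j=1}^{|O(B)|} s(O_i(A), O_j(B)) \tag{2} $$
• Let s(c, d) denote the similarity between items c and d. For c ≠ d,
  $$ s(c, d) = \frac{C_2}{|I(c)|\,|I(d)|} \sum_{i=1}^{|I(c)|} \sum_{j=1}^{|I(d)|} s(I_i(c), I_j(d)) \tag{3} $$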
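The mutual recursion between person similarity and item similarity can be illustrated with a small Python sketch. It assumes a fixed-point iteration that starts from the identity (1 for identical objects, 0 otherwise) and updates both score tables in lockstep. The function names, the decay constants C1 = C2 = 0.8, the iteration count K = 5, and the purchase data are all assumptions chosen for illustration.

```python
def _update(keys, nbrs, other_scores, C):
    """One iteration: score each pair from the scores of its neighbors."""
    new = {}
    for x in keys:
        for y in keys:
            if x == y:
                new[(x, y)] = 1.0
            else:
                total = sum(other_scores[(i, j)]
                            for i in nbrs[x] for j in nbrs[y])
                new[(x, y)] = C * total / (len(nbrs[x]) * len(nbrs[y]))
    return new


def bipartite_simrank(purchases, C1=0.8, C2=0.8, K=5):
    """purchases: iterable of (person, item) pairs, i.e. edges person -> item."""
    bought = {}  # O(A): items bought by person A
    buyers = {}  # I(c): persons who bought item c
    for person, item in purchases:
        bought.setdefault(person, set()).add(item)
        buyers.setdefault(item, set()).add(person)
    persons, items = sorted(bought), sorted(buyers)
    s_person = {(A, B): float(A == B) for A in persons for B in persons}
    s_item = {(c, d): float(c == d) for c in items for d in items}
    for _ in range(K):
        # The right-hand side is evaluated first, so both updates
        # read the scores from the previous iteration.
        s_person, s_item = (_update(persons, bought, s_item, C1),
                            _update(items, buyers, s_person, C2))
    return s_person, s_item


# Hypothetical shopping data: two people who bought the same two items.
PURCHASES = [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y")]
s_person, s_item = bipartite_simrank(PURCHASES)
```

Because every person and item in this sketch comes from at least one purchase, no neighbor set is empty and the division is always defined.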

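SimRank
• Computing SimRank: naive method
• R_k(a, b) is a lower bound on the actual SimRank score s(a, b).
• Initialize R_0(a, b) = 0 (if a ≠ b) and R_0(a, b) = 1 (if a = b).
• To compute R_{k+1}(a, b) from R_k:
  $$ R_{k+1}(a, b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b)) \tag{4} $$
  for a ≠ b, and R_{k+1}(a, b) = 1 for a = b.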
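A minimal Python sketch of the naive iteration follows. It keeps a score for every node pair and recomputes all of them K times from the previous iteration's scores. The function name `naive_simrank`, the defaults C = 0.8 and K = 5, and the example graph are assumptions made for illustration.

```python
def naive_simrank(edges, C=0.8, K=5):
    """Iterate R_0, R_1, ..., R_K over all node pairs; return R_K as a dict."""
    in_nbrs = {}
    for p, q in edges:  # directed edge p -> q
        in_nbrs.setdefault(p, set())
        in_nbrs.setdefault(q, set()).add(p)
    nodes = sorted(in_nbrs)

    # R_0: 1 on the diagonal, 0 everywhere else.
    R = {(a, b): float(a == b) for a in nodes for b in nodes}
    for _ in range(K):
        nxt = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    nxt[(a, b)] = 1.0
                elif not in_nbrs[a] or not in_nbrs[b]:
                    # No in-neighbors on one side: similarity is 0.
                    nxt[(a, b)] = 0.0
                else:
                    total = sum(R[(i, j)]
                                for i in in_nbrs[a] for j in in_nbrs[b])
                    nxt[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        R = nxt
    return R


# Hypothetical graph: u points to both a and b, so a and b share their
# only in-neighbor and end up with similarity C.
scores = naive_simrank([("u", "a"), ("u", "b")])
```

Building `nxt` separately from `R` matters: each R_{k+1} value must be computed only from R_k values, never from scores already updated in the same pass.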

SimRank
• The space required is simply $O(n^2)$ to store the results R_k, where n is the number of nodes.
• The time required is $O(K n^2 d_2)$.
• K: the number of iterations.
• $d_2$: the average of |I(a)||I(b)| over all node pairs (a, b).

SimRank
• Computing SimRank: pruning
• Set the similarity between two nodes that are far apart to 0.
• Consider node pairs only for nodes that are near each other.

SimRank
• Consider only node pairs within a radius r of each other. If a node has on average $d_r$ such neighbors, then there will be $n d_r$ node pairs.
• The time and space complexities become $O(K n d_r d_2)$ and $O(n d_r)$, respectively.
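The pruning idea can be sketched in Python as follows. The sketch assumes that "near" means within r hops when edge directions are ignored, and it treats any pair outside that radius as having similarity 0. That reading of the radius, the function name `pruned_simrank`, the defaults, and the example graph are assumptions made here.

```python
from collections import deque


def pruned_simrank(edges, r=2, C=0.8, K=5):
    """SimRank iteration restricted to node pairs within undirected radius r."""
    in_nbrs, undirected = {}, {}
    for p, q in edges:  # directed edge p -> q
        in_nbrs.setdefault(p, set())
        in_nbrs.setdefault(q, set()).add(p)
        undirected.setdefault(p, set()).add(q)
        undirected.setdefault(q, set()).add(p)

    def within_radius(src):
        # Breadth-first search that stops expanding at depth r.
        depth = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            if depth[v] == r:
                continue
            for w in undirected[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
        return set(depth)

    pairs = [(a, b) for a in in_nbrs for b in within_radius(a)]
    R = {(a, b): float(a == b) for a, b in pairs}
    for _ in range(K):
        nxt = {}
        for a, b in pairs:
            if a == b:
                nxt[(a, b)] = 1.0
            elif not in_nbrs[a] or not in_nbrs[b]:
                nxt[(a, b)] = 0.0
            else:
                # Pairs that were pruned away contribute 0 via .get().
                total = sum(R.get((i, j), 0.0)
                            for i in in_nbrs[a] for j in in_nbrs[b])
                nxt[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        R = nxt
    return R


# Hypothetical graph: a and b are 2 hops apart (via u); a and d are 4 hops apart.
pruned = pruned_simrank([("u", "a"), ("u", "b"), ("b", "c"), ("c", "d")], r=2)
```

Only the retained pairs are stored and updated, which is where the reduced time and space costs come from; the price is that scores for distant pairs are approximated as 0.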

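Random Surfer-Pairs Model
• Expected distance
• Let H be any strongly connected graph.
• Let u, v be any two nodes in H.
• We define the expected distance d(u, v) from u to v as
  $$ d(u, v) = \sum_{t : u \rightsquigarrow v} P[t]\, l(t) \tag{5} $$
  where the sum ranges over tours t that start at u and end at v without touching v earlier, P[t] is the probability of traveling t, and l(t) is its length.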

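Random Surfer-Pairs Model
• Expected meeting distance (EMD)
  $$ m(a, b) = \sum_{t : (a, b) \rightsquigarrow (x, x)} P[t]\, l(t) \tag{6} $$
  where the sum ranges over tours of a pair of surfers that start at (a, b) and end the first time both surfers are at the same node x.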

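Random Surfer-Pairs Model
• Expected-f meeting distance
• Circumvents the “infinite EMD” problem.
• Maps all distances to a finite interval.
• Uses the exponential function $f(z) = c^z$, where $c \in (0, 1)$ is a constant.
  $$ s'(a, b) = \sum_{t : (a, b) \rightsquigarrow (x, x)} P[t]\, c^{l(t)} \tag{7} $$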

Random Surfer-Pairs Model
• Equivalence to SimRank

Random Surfer-Pairs Model
• Theorem
• The SimRank score, with parameter C, between two nodes is their expected-f meeting distance traveling back-edges, for $f(z) = C^z$.
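The theorem suggests a simple way to estimate a SimRank score by simulation, sketched below in Python. Two surfers start at a and b and repeatedly step to a uniformly random in-neighbor (that is, they travel back-edges). A walk that first meets after l steps contributes C^l, and a walk that never meets contributes 0. This is a Monte Carlo illustration under stated assumptions, not the computation method from the slides: the function name `mc_simrank`, the sample count, the step cap, the seed, and the example graph are all chosen here, and the edge list is assumed to contain no duplicates.

```python
import random


def mc_simrank(edges, a, b, C=0.8, samples=2000, max_steps=50, seed=0):
    """Estimate s(a, b) as the average of C**l over sampled backward walks."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    in_nbrs = {}
    for p, q in edges:  # directed edge p -> q
        in_nbrs.setdefault(p, [])
        in_nbrs.setdefault(q, []).append(p)

    total = 0.0
    for _ in range(samples):
        x, y, steps = a, b, 0
        # Walks longer than max_steps are cut off; their contribution
        # would be at most C**max_steps, which is negligible here.
        while x != y and steps < max_steps:
            if not in_nbrs[x] or not in_nbrs[y]:
                break  # a surfer is stuck, so the pair can never meet
            x = rng.choice(in_nbrs[x])
            y = rng.choice(in_nbrs[y])
            steps += 1
        if x == y:
            total += C ** steps
    return total / samples


# Hypothetical graph: both surfers are forced to u after one backward step.
EDGES = [("u", "a"), ("u", "b")]
estimate = mc_simrank(EDGES, "a", "b")
```

In this example every sampled walk meets at u after exactly one step, so the estimate equals C up to floating-point rounding. On larger graphs the estimate is noisy and improves as `samples` grows.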

Future Work
• Divide, conquer, and merge: divide a corpus into chunks…
• Ternary (or more) relationships.

Personal Opinion
• We believe that the intuition behind SimRank can be used in many domains that are based on object-to-object relationships.