Association Rules with Graph Patterns Wenfei Fan Univ. of Edinburgh Xin Wang Beihang Univ. Yinghui Wu Southwest Jiaotong Univ. Jingbo XU Washington State Univ. Tomer Baron 1

• BACKGROUND AND MOTIVATION
• ASSOCIATION VIA GRAPH PATTERNS
• SUPPORT AND CONFIDENCE
• DIVERSIFIED RULE DISCOVERY
• IDENTIFYING CUSTOMERS
• EXPERIMENTAL STUDY
March 16, 2015, Lecture 3 2

Background and motivation:
• A social graph is "the global mapping of everybody and how they're related".
• Associations between entities in social graphs are useful in social marketing: "90% of customers trust peer recommendations versus 14% who trust advertising". 5

Background and motivation: Some differences between association rules on itemsets and graph-pattern association rules (GPARs):
• Conventional support and confidence metrics no longer work for GPARs.
• Mining algorithms for traditional rules and frequent graph patterns cannot be used to discover practical diversified GPARs.
• Identifying potential customers in social graphs may be costly: for example, Facebook has 1.3 billion nodes and 1 trillion links. Furthermore, graph pattern matching by subgraph isomorphism is intractable.
GPARs extend association rules from relations to graphs. 6

Graphs: • 8

Patterns: • 9

Graph pattern matching: • 10

Match: • 11

Denotes: • 12

Example: [Figure, built up over slides 13-17: a graph pattern over the nodes Cust x, Cust x', city, and French restaurant y, with edges labeled Friends, Live_in, In, like, and visit.] 13-17

Example: [Figure: a data graph with customers Cust 1 to Cust 6, the French restaurants Le Bernardin, Per se, and Patina, and a node labeled T, connected by Friends, Live_in, In, like, and visit edges.] 18

Example for a match: [Figure, built up over slides 19-22: the pattern (Cust x, Cust x', city, French restaurant y) shown next to the data-graph nodes Cust 1, Cust 2, and French restaurant Le Bernardin, with Friends, Live_in, In, like, and visit edges.] 19-22

[Figure: the pattern (Cust x, Cust x', city, French restaurant y) shown together with the full data graph (Cust 1 to Cust 6; French restaurants Le Bernardin, Per se, and Patina; node T).] 23
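The matching idea behind these example slides can be sketched in a few lines of Python. This is only a minimal brute-force illustration of matching by subgraph isomorphism (an injective mapping that preserves node labels and pattern edges), not the algorithm from the talk. The toy graph, the simplified pattern, and the names `node_label`, `edges`, and `find_matches` are all assumptions made for this sketch; the edge directions in particular are guesses, since the figures are not preserved.

```python
from itertools import permutations

# Hypothetical toy data graph: node -> label, plus labeled directed edges.
node_label = {
    "cust1": "cust", "cust2": "cust", "cust3": "cust",
    "NY": "city", "LA": "city",
    "LeBernardin": "French restaurant",
}
edges = {
    ("cust1", "friend", "cust2"), ("cust2", "friend", "cust1"),
    ("cust1", "live_in", "NY"), ("cust2", "live_in", "NY"),
    ("cust3", "live_in", "LA"),
    ("LeBernardin", "in", "NY"),
    ("cust2", "visit", "LeBernardin"),
}

# Simplified pattern (an assumption): x and x2 are friends living in the
# same city c, and x2 visits a French restaurant y located in c.
pattern_label = {"x": "cust", "x2": "cust", "c": "city", "y": "French restaurant"}
pattern_edges = {
    ("x", "friend", "x2"),
    ("x", "live_in", "c"), ("x2", "live_in", "c"),
    ("y", "in", "c"),
    ("x2", "visit", "y"),
}


def find_matches(node_label, edges, pattern_label, pattern_edges):
    """Enumerate injective mappings that preserve labels and pattern edges."""
    p_nodes = sorted(pattern_label)
    found = []
    # permutations() yields tuples of distinct nodes, so h is injective.
    for image in permutations(sorted(node_label), len(p_nodes)):
        h = dict(zip(p_nodes, image))
        if any(node_label[h[u]] != pattern_label[u] for u in p_nodes):
            continue
        if all((h[u], lab, h[v]) in edges for (u, lab, v) in pattern_edges):
            found.append(h)
    return found


matches = find_matches(node_label, edges, pattern_label, pattern_edges)
# Distinct data-graph nodes that play the role of the designated node x.
x_matches = {h["x"] for h in matches}
print(matches, x_matches)
```

The exhaustive enumeration is exponential in the pattern size, which is consistent with the earlier remark that matching by subgraph isomorphism is intractable; it is usable only on toy inputs like this one.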

A few more definitions: • 24

We now define graph-pattern association rules: • 25

Back to the example: [Figure: the pattern over Cust x, Cust x', city, and French restaurant y, with Friends, Live_in, In, like, and visit edges.] 27

SUPPORT AND CONFIDENCE: • 29-31

SUPPORT AND CONFIDENCE: [Figure: the example data graph (Cust 1 to Cust 6; French restaurants Le Bernardin, Per se, and Patina; node T).] 32-33

SUPPORT AND CONFIDENCE: • 34

[Figure: the pattern (Cust x, Cust x', city, French restaurant y; Friends, Live_in, In, like, and visit edges) shown together with the example data graph.] 35-36

SUPPORT AND CONFIDENCE: • 37

SUPPORT AND CONFIDENCE: Many graphs are incomplete, so not all of the required information appears in them; we would like the confidence measure to take this into account. 38
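To make the concern about incomplete graphs concrete, here is a small Python sketch. It is an illustration only and not the confidence measure defined in the talk, whose formula is not preserved in this transcript. It assumes a rule whose consequent is visit(x, y) and compares a naive confidence, which counts every antecedent match without the consequent edge as a counterexample, with a variant that sets aside nodes having no visit edges at all, since their information may simply be missing. The function name `confidences` and the sample data are hypothetical.

```python
def confidences(antecedent_matches, visits, target):
    """Compare a naive confidence with one that skips 'unknown' cases.

    antecedent_matches: nodes x that match the antecedent pattern.
    visits: dict mapping a node to the set of restaurants it is known to visit.
    target: the restaurant y named in the consequent visit(x, y).
    """
    if not antecedent_matches:
        return 0.0, 0.0
    positive = negative = 0
    for x in antecedent_matches:
        known = visits.get(x, set())
        if target in known:
            positive += 1
        elif known:
            negative += 1  # x has visit edges, but none to the target
        # otherwise: no visit edges at all, so the case is left undecided
    naive = positive / len(antecedent_matches)
    decided = positive + negative
    local = positive / decided if decided else 0.0
    return naive, local


# Hypothetical data: cust4 has no recorded visits at all.
matches = ["cust1", "cust2", "cust3", "cust4"]
visits = {
    "cust1": {"Le Bernardin"},
    "cust2": {"Patina"},
    "cust3": {"Le Bernardin"},
}
naive, local = confidences(matches, visits, "Le Bernardin")
print(naive, local)
```

In this example the naive value is pulled down by cust4, for whom nothing is known, while the second value only counts customers with at least one recorded visit.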

SUPPORT AND CONFIDENCE: • 39-40

The Diversified Mining Problem: • 42-44

The Diversified Mining Problem: • Let’s remember: 45

The Diversified Mining Problem: • 46

Discovery algorithm: • 47-73

Example: [Figure: the example data graph (Cust 1 to Cust 6; French restaurants Le Bernardin, Per se, and Patina; node T).] 74

Example: [Figure: two patterns, each over Cust x, Cust x', city, and French restaurant y.] 75

Example: [Table headed "site, message, GPAR, flag"; surviving cell values: T, T, T, M, T.] 76

Example: [Figure: two patterns, each over Cust x, Cust x', city, and French restaurant y.] 77

Example: [Table headed "site, message, GPAR, flag"; surviving cell values: F, F, F.] 78

Discovery algorithm: • 79-87

The Entity Identification Problem: • 89-95

EXPERIMENTAL STUDY The authors used real-life and synthetic graphs. Three types of experiments:
• The scalability of the DMine algorithm.
• The effectiveness of DMine for discovering interesting GPARs. 97

EXPERIMENTAL SETTINGS • 98

EXPERIMENTAL SETTINGS The implementation was in Java. Algorithm DMine was compared with:
• DMine_NO, its counterpart without optimizations (incremental computation, reductions, and bisimilarity checking).
• GRAMI, an open-source frequent subgraph mining tool. Since GRAMI runs on a single machine, only the interestingness of the patterns found by GRAMI was compared with that of the GPARs found by DMine. 99

EXPERIMENTAL SETTINGS • 100

EXPERIMENTAL RESULTS • 101

Exp-1 Scalability of DMine: DMine scales well as the number of processors n increases. The improvement is 3.7 (resp. 2.69) times when n increases from 4 to 20. It is on average 1.67 (resp. 1.37) times faster than DMine_NO. The optimization strategies effectively reduce confidence checking time. 103
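As a rough reading aid for these numbers, the short Python sketch below relates each reported speedup to the fivefold growth in processors (from 4 to 20). The helper name `relative_efficiency` is an assumption introduced here, and the ratio it computes is a standard back-of-the-envelope measure, not a figure reported on the slides.

```python
def relative_efficiency(speedup, n_from, n_to):
    """Speedup divided by the factor by which the processor count grew."""
    return speedup / (n_to / n_from)


# Reported improvements when n increases from 4 to 20.
for speedup in (3.7, 2.69):
    eff = relative_efficiency(speedup, 4, 20)
    print(f"speedup {speedup}: relative efficiency {eff:.2f}")
```

A value of 1.0 would mean perfectly linear scaling over this range; the two reported speedups correspond to roughly 0.74 and 0.54.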

• No graph is given: increasing d increases the computation time of both algorithms, and DMine is less sensitive to the variation, but there is no comparison between the two algorithms. 109

EXPERIMENTAL RESULTS Exp-2 Effectiveness of DMine: GPARs discovered by DMine from Pokec and Google+, with support larger than 100: 110-111

EXPERIMENTAL RESULTS • 112-114

THE END 115