CMU SCS
Graph and Tensor Mining for fun and profit
Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
• Part#2: Tensors and Knowledge Bases
• Conclusions
  – Future research
KDD, 2018 Dong+

Over-arching conclusion
• MANY time-tested algorithms for graph & tensor mining
• (more are needed)
[Figure: graph with edge labels directed, dates, Acted_in, born_in, produced]

Over-arching conclusion
• Problems [figure: graph with edge labels directed, dates, Acted_in, born_in, produced]
• (some) solutions [figure: decomposition sketch, "= + +"]

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – P1.1: properties/patterns in graphs
  – P1.2: node importance
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation

Problem definition
• Are real graphs random?
  – S*: what do static graphs look like?
  – T*: how do graphs evolve over time?

Short answer(s)
• Are real graphs random?
  – S*: what do static graphs look like?
    • S.0: ‘six degrees’
    • S.1: skewed degree distribution
    • S.2: skewed eigenvalues
    • S.3: triangle power-laws
    • S.4: GCC; and skewed distr. of conn. comp.
  – T*: how do graphs evolve over time?
    • T.1: diameters
    • T.2: densification

Short answer(s)
• Are real graphs random?
  – S*: what do static graphs look like?
    • S.0: ‘six degrees’
    • S.1: skewed degree distribution
    • S.2: skewed eigenvalues
    • S.3: triangle power-laws
    • S.4: GCC; and skewed distr. of conn. comp.
  – T*: how do graphs evolve over time?
    • T.1: diameters
    • T.2: densification
[Overlay: NOT Gaussians. Power laws: y ~ x^a. Take logarithms: plot x (log scale) against y (log scale).]
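The "take logarithms" advice can be sketched in a few lines: if y ~ x^a, then log y is linear in log x with slope a. The snippet below is only an illustration under stated assumptions. The degree histogram is synthetic (hypothetical data, not from the tutorial), and a least-squares fit on log-log axes is a quick visual check rather than a rigorous power-law estimator.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var

# Hypothetical degree histogram: the number of nodes with degree d
# is set to 1000 * d**-2, i.e. an exact power law with exponent -2.
degrees = [1, 2, 4, 8, 16, 32]
counts = [1000.0 * d ** -2.0 for d in degrees]

slope = loglog_slope(degrees, counts)
print("estimated exponent:", slope)
```

On real graphs the points scatter around the line, so the fitted exponent is best read as a rough summary of how skewed the distribution is.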

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – P1.1: properties/patterns in graphs
  – P1.2: node importance
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation

Node importance - Motivation:
• Given a graph (e.g., web pages containing the desired query word)
• Q: Which node is the most important?

Node importance - Motivation:
• Given a graph (e.g., web pages containing the desired query word)
• Q: Which node is the most important?
• A1: PageRank (PR)
• A2: HITS
• A3: SALSA
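As a rough illustration of answer A1, here is a minimal PageRank sketch using plain power iteration. The toy graph, the damping factor of 0.85, the iteration count, and the uniform handling of dangling nodes are all assumptions made for this example; the tutorial itself does not prescribe them.

```python
def pagerank(out_links, damping=0.85, iters=100):
    """Power-iteration PageRank.

    out_links maps each node to the list of nodes it points to;
    every target must also appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node receives the "random jump" share first.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank uniformly (an assumption).
                share = damping * rank[v] / n
                for t in nodes:
                    new[t] += share
        rank = new
    return rank

# Hypothetical four-page web graph.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
top = max(scores, key=scores.get)
print(top, scores)
```

In this toy graph page "c" collects links from three other pages, so it ends up with the highest score.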

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – P1.1: properties/patterns in graphs
  – P1.2: node importance
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation

Problem Definition
• Given a graph, and k
• Break it into k (disjoint) communities

Short answer
• METIS [Karypis, Kumar]
• (but: maybe NO good cuts exist!)

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – P1.1: properties/patterns in graphs
  – P1.2: node importance
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation

Problem
• Given: [figure]
• Find:
  1) Outliers
  2) Lock-step

Solution
• Given: [figure]
• Find:
  1) Outliers: OddBall
  2) Lock-step: SVD
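To make the "lock-step via SVD" idea concrete, the sketch below finds the leading singular vectors of a small user-by-product matrix with plain power iteration. Users who all rate the same products form a dense block, and that block dominates the top singular vector. The matrix, the 0.3 threshold, and the function name are hypothetical choices for this example; a real pipeline would use a library SVD and inspect several singular vectors.

```python
import math

def top_singular_pair(matrix, iters=200):
    """Power iteration for the leading left/right singular vectors."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    for _ in range(iters):
        # u = A v, then v = A^T u, then renormalize v.
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
    u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    norm_u = math.sqrt(sum(x * x for x in u))
    u = [x / norm_u for x in u]
    return u, v

# Hypothetical data: users 0-3 all rate products 0-2 (lock-step block),
# users 4-7 each rate a single other product.
ratings = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0],
]

u, v = top_singular_pair(ratings)
suspect_users = [i for i, x in enumerate(u) if x > 0.3]
suspect_products = [j for j, x in enumerate(v) if x > 0.3]
print(suspect_users, suspect_products)
```

The toy matrix is deliberately clean, so the block separates perfectly. With noisy or camouflaged data, the scores would need a more careful threshold.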

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – P1.1: properties/patterns in graphs
  – P1.2: node importance
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation

Problem
• What color, for the rest?
  – Given homophily (/heterophily etc)?

Short answer:
• What color, for the rest?
• A: Belief Propagation (‘zooBP’)
www.cs.cmu.edu/~deswaran/code/zoobp.zip
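The "what color, for the rest?" question can be illustrated with a much simpler stand-in than belief propagation: label propagation by repeated neighbor averaging. This is not zooBP and not full belief propagation. It assumes homophily only (neighbors tend to share a color), and the toy graph and seed labels below are hypothetical.

```python
def propagate_labels(neighbors, seeds, iters=50):
    """Neighbor-averaging label propagation.

    neighbors: dict node -> list of adjacent nodes (undirected graph).
    seeds: dict node -> +1.0 or -1.0 for the nodes whose color is known.
    Returns a score per node; the sign is the guessed color.
    """
    belief = {v: seeds.get(v, 0.0) for v in neighbors}
    for _ in range(iters):
        new = {}
        for v, nbrs in neighbors.items():
            if v in seeds:
                new[v] = seeds[v]  # known labels stay clamped
            elif nbrs:
                new[v] = sum(belief[u] for u in nbrs) / len(nbrs)
            else:
                new[v] = 0.0
        belief = new
    return belief

# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by edge c-d.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}
known = {"a": 1.0, "f": -1.0}  # +1 = "red", -1 = "blue"

belief = propagate_labels(graph, known)
colors = {v: ("red" if s > 0 else "blue") for v, s in belief.items()}
print(colors)
```

Under heterophily (neighbors tend to differ), plain averaging would give the wrong answer, which is one reason the slide points to belief propagation instead.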

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
• Part#2: Tensors
  – P2.1: Basics (dfn, PARAFAC, HAR)
  – P2.2: Embeddings & mining
  – P2.3: Inference
• Conclusions

Problem dfn
• What is ‘normal’? suspicious? Groups?
[Figure residue: time slices 3 am, 4/1 … 10 pm, 4/3, 11 pm, 4/3]

Recipe(s)
• What is ‘normal’? suspicious? Groups?
• A: tensor factorization (~ high-mode SVD)
  – PARAFAC (shown here)
  – Tucker
[Figure: customer × product × timestamp tensor (3 am, 4/1 … 11 pm, 4/3) written as a sum of components, "= + +", labeled ‘Apple fans’, ‘jewelry’, ‘shoes’]
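The PARAFAC picture (tensor = sum of components) corresponds to the model X[i][j][k] = sum over r of A[i][r] * B[j][r] * C[k][r]. The sketch below only rebuilds a tensor from given factor matrices; it does not fit them, which in practice is done by a solver such as alternating least squares. The factor values and names are hypothetical.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-mode tensor from PARAFAC factor matrices.

    A, B, C are lists of rows; column r of each holds component r, so
    X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]

# Hypothetical rank-2 factors: 3 customers x 2 products x 2 time slots.
# Component 0: customers 0 and 1 buy product 0 in time slot 0.
# Component 1: customer 2 buys product 1 in time slot 1 (twice as much).
customers = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
products = [[1.0, 0.0], [0.0, 1.0]]
times = [[1.0, 0.0], [0.0, 2.0]]

X = cp_reconstruct(customers, products, times)
print(X)
```

Each component reads as one group: a set of customers, the products they favor, and the times they are active, which is how the factors answer "Groups?".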

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
• Part#2: Tensors
  – P2.1: Basics (dfn, PARAFAC)
  – P2.2: Embeddings & mining
  – P2.3: Inference
• Conclusions

P2.2: Problem: embedding
• Given entities & predicates, find mappings
[Figure residue: edge labels directed, dates, Acted_in, born_in, produced; group labels locations, artists, relations+, Artistic-pred.]

Short answer
• S1. Two potential relationships among sub (h), pred (r), and obj (t)?
  – Addition: h + r =?= t
  – Multiplication: h ∘ r =?= t
• S2. Two loss function options
  – Closed-world assumption
  – Open-world assumption
[Overlay: Multiplication model + Open-world assumption: state-of-the-art results]
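The two candidate relationships in S1 can be written as residuals that a good embedding should drive toward zero for true triples. The sketch below is illustrative only. The two-dimensional vectors are made up, the element-wise reading of "multiplication" is an assumption, and no training loop or loss function is shown.

```python
def additive_residual(h, r, t):
    """Squared distance between h + r and t (0 means h + r == t)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def multiplicative_residual(h, r, t):
    """Squared distance between the element-wise product h * r and t."""
    return sum((hi * ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

# Hypothetical 2-d embeddings for one (subject, predicate, object) triple.
h = [1.0, 2.0]          # subject
t = [2.0, 1.0]          # object
r_add = [1.0, -1.0]     # predicate under the additive model
r_mul = [2.0, 0.5]      # predicate under the multiplicative model

wrong_t = [0.0, 0.0]    # a corrupted object, for contrast

print(additive_residual(h, r_add, t), additive_residual(h, r_add, wrong_t))
print(multiplicative_residual(h, r_mul, t), multiplicative_residual(h, r_mul, wrong_t))
```

A true triple scores a residual of 0 under its model, while the corrupted triple scores 5 in both cases. Training pushes embeddings toward that kind of separation.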

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
• Part#2: Tensors
  – P2.1: Basics (dfn, PARAFAC)
  – P2.2: Embeddings & mining
  – P2.3: Inference
• Conclusions

Problem Definition
• Given existing triples
• Q: Is a given triple correct?

Conclusion/Short answer
• Infer from other connecting paths
[Figure residue: Path 1 and Path 2, annotated with Prec 0.03, Rec 0.01, Rec 0.33, F1 0.04, Weight 2.62, Weight 2.19]

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
• Part#2: Tensors and Knowledge Bases
• Conclusions
  – Future directions
  – F1: forecasting
  – F2: unifying factual info and behavior info

Future work – F1
• Forecasting:
  – Who will buy what, and when?
"All of us every single year, we're a different person."
[Figure residue: time slices 3 am, 4/1 … 10 pm, 4/3, 11 pm, 4/3]

Future work – F2
• Customer info + behavior + Product Graph
[Figure residue: time slices 3 am, 4/1 … 10 pm, 4/3, 11 pm, 4/3; edge labels directed, dates, born_in, Acted_in, produced]

Thanks to
• Danai Koutra (U. Michigan)
• Namyong Park (CMU)
• Dhivya Eswaran (CMU)
• Hyun Ah Song (CMU)
• Vagelis Papalexakis (UCR)
• Rakshit Trivedi (Georgia Tech)

P1 – Graphs - More references
• Danai Koutra and Christos Faloutsos, Individual and Collective Graph Mining: Principles, Algorithms, and Applications. Morgan & Claypool, October 2017.

P1 – Graphs - More references
• Deepayan Chakrabarti and Christos Faloutsos, Graph Mining: Laws, Tools, and Case Studies. Morgan & Claypool, Oct. 2012.

P1 – Graphs - More references
Anomaly detection
• Leman Akoglu, Hanghang Tong, & Danai Koutra, Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery (2015) 29:626.
• Arxiv version: https://arxiv.org/abs/1404.4679

P2 – Tensors - More references
Tensor survey
• Tamara G. Kolda and Brett W. Bader, Tensor Decompositions and Applications. SIAM Rev., 51(3), pp. 455–500, 2009.

P2 – Tensors - More references
Tensor survey #2
• Nicholas D. Sidiropoulos, Lieven De Lathauwer, Xiao Fu, Kejun Huang, Evangelos E. Papalexakis, and Christos Faloutsos, Tensor Decomposition for Signal Processing and Machine Learning. IEEE TSP, 65(13), July 1, 2017.

THANK YOU!
• Problems [figure: graph with edge labels directed, dates, Acted_in, born_in, produced]
• (some) solutions [figure: decomposition sketch, "= + +"]