CMU SCS
Graph and Tensor Mining for fun and profit
Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
[break]
• Part#2: Tensors
• Conclusions
KDD 2018, Dong+

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection ?
    • Outliers
    • Lock-step behavior
  – P1.5: belief propagation

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation
[Side labels: un-supervised ? semi-supervised]

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection ? [side label: un-supervised]
    • P1.4.1. Outliers
    • P1.4.2. Lock-step behavior
  – P1.5: belief propagation

‘Recipe’ Structure:
• Problem definition
• Short answer/solution
• LONG answer – details
• Conclusion/short-answer

Problem
Given: [figure]
Find:
1) Outliers
2) Lock-step

Solution
Given: [figure]
Find:
1) Outliers: OddBall
2) Lock-step: SVD

P1.4.1. Outliers
• Which node(s) are strange?
  – Q: How to start?
  – A1: egonet; and extract node features

Ego-net Patterns: Which is strange?
[Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010]

Ego-net Patterns: Which is strange?
• Near-star: telemarketer, port scanner, people adding friends indiscriminately, etc.
• Near-clique: tightly connected people, terrorist groups?, discussion group, etc.
[Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010]

P1.4.1. Outliers
• Which node(s) are strange?
  – Q: How to start?
  – A: egonet; and extract node features
  – Q’: which features?
  – A’: ART! Infinite! Pick a few, e.g.:

Ego-net Patterns
• $N_i$: number of neighbors (degree) of ego i
• $E_i$: number of edges in egonet i
• $W_i$: total weight of egonet i
• $\lambda_{w,i}$: principal eigenvalue of the weighted adjacency matrix of egonet i
[Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010]
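
The four features above can be computed directly from a weighted adjacency matrix. What follows is a minimal sketch, assuming an undirected graph stored as a symmetric NumPy array. The function name `egonet_features` and the two toy graphs are hypothetical illustrations, not code from the OddBall paper.

```python
import numpy as np

def egonet_features(W, i):
    """Return (N_i, E_i, W_i, lambda_wi) for ego i of a symmetric weighted adjacency matrix W."""
    neighbors = np.flatnonzero(W[i])
    neighbors = neighbors[neighbors != i]        # ignore a self-loop, if any
    nodes = np.concatenate(([i], neighbors))     # ego plus its direct neighbors
    sub = W[np.ix_(nodes, nodes)]                # induced subgraph = egonet
    n_i = len(nodes) - 1                         # degree of the ego
    upper = np.triu(sub, k=1)                    # count each undirected edge once
    e_i = int(np.count_nonzero(upper))
    w_i = float(upper.sum())
    lam = float(np.linalg.eigvalsh(sub)[-1])     # largest eigenvalue (sub is symmetric)
    return n_i, e_i, w_i, lam

# Toy graphs (hypothetical): a 5-node star centered at node 0, and a 4-node clique.
star = np.zeros((5, 5))
star[0, 1:] = star[1:, 0] = 1.0
clique = np.ones((4, 4)) - np.eye(4)

star_feats = egonet_features(star, 0)
clique_feats = egonet_features(clique, 0)
```

With unit weights, the star egonet has exactly as many edges as neighbors, while the clique egonet has N(N+1)/2 edges for N neighbors. These are the two extremes (near-star and near-clique) that the feature pairs are meant to separate.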

Pattern: Ego-net Power Law Density
$E_i \propto N_i^{\alpha}$, $1 \le \alpha \le 2$
[Figure annotation: Enron CEO]
[Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010]
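
One way to use the power law is to fit it in log-log space and flag egonets that sit far from the fitted line. The sketch below assumes a least-squares fit of E = C · N^α. The score (ratio to the expected value times a log-distance term) is an assumption made for illustration rather than something stated on the slides, and the data are synthetic.

```python
import numpy as np

def powerlaw_outlier_scores(n, e):
    """Fit E = C * N**alpha by least squares in log-log space and score deviations from the fit."""
    n = np.asarray(n, dtype=float)
    e = np.asarray(e, dtype=float)
    alpha, log_c = np.polyfit(np.log(n), np.log(e), deg=1)
    expected = np.exp(log_c) * n ** alpha
    # Assumed score: how many times off the fit, weighted by the log of the absolute gap.
    ratio = np.maximum(e, expected) / np.minimum(e, expected)
    scores = ratio * np.log(np.abs(e - expected) + 1.0)
    return alpha, scores

# Synthetic egonets: seven follow E = N**1.5; the last one is a pure star (E = N).
n_values = [4, 9, 16, 25, 36, 49, 64, 30]
e_values = [8, 27, 64, 125, 216, 343, 512, 30]

alpha, scores = powerlaw_outlier_scores(n_values, e_values)
most_suspicious = int(np.argmax(scores))
```

The fitted exponent stays inside the 1 ≤ α ≤ 2 range, and the star-like egonet, which has far fewer edges than the fit predicts for its degree, receives the highest score.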

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection ?
    • Outliers
    • Lock-step behavior
  – P1.5: belief propagation

Problem
Given: [figure]
Find:
1) Outliers
2) Lock-step

P1.4.2. How to find ‘suspicious’ groups?
• ‘blocks’ are normal, right?
[Figure axes: idols, fans]

Except that:
• ‘blocks’ are normal, right?
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]

Except that:
• ‘blocks’ are usually suspicious
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]
Q: Can we spot blocks, easily?
A: Silver bullet: SVD!

Why (From: SALSA)
• HITS fixates on dense blocks (‘Tightly Knit Community’, TKC; often link farms)
[Figure annotation: Should win, but doesn’t under HITS]
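
The TKC effect can be reproduced on a tiny graph. The sketch below uses a hypothetical 8-hub, 4-authority example and relies on the standard fact that HITS authority scores converge to the principal right singular vector of the adjacency matrix. Authority 0 has the highest in-degree, yet the dense 3x3 block takes all of the score.

```python
import numpy as np

# Rows = hubs (pages that link), columns = authorities (pages linked to).
A = np.zeros((8, 4))
A[0:5, 0] = 1      # authority 0: in-degree 5, from five otherwise unconnected hubs
A[5:8, 1:4] = 1    # authorities 1-3: in-degree 3 each, inside a dense 3x3 block

# Principal right singular vector = converged HITS authority scores (up to sign).
_, sigma, vt = np.linalg.svd(A)
authority = np.abs(vt[0])
```

The block has singular value 3, while the star around authority 0 has singular value sqrt(5), about 2.24, so the leading singular vector lives entirely on the block and authority 0 scores zero.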

Crash intro to SVD (From: HITS)
• Recall: (SVD) matrix factorization: finds blocks
[Figure: N fans × M idols matrix ~ (‘music lovers’ × ‘singers’) + (‘sports lovers’ × ‘athletes’) + (‘citizens’ × ‘politicians’)]

Crash intro to SVD
• (SVD) matrix factorization: finds blocks
A) Even if shuffled!
[Figure: N fans × M idols matrix ~ (‘music lovers’ × ‘singers’) + (‘sports lovers’ × ‘athletes’) + (‘citizens’ × ‘politicians’)]

Crash intro to SVD
• (SVD) matrix factorization: finds blocks
B) Even if ‘salt+pepper’ noise
[Figure: N fans × M idols matrix ~ (‘music lovers’ × ‘singers’) + (‘sports lovers’ × ‘athletes’) + (‘citizens’ × ‘politicians’)]
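
Both claims, A) shuffling and B) ‘salt+pepper’ noise, can be checked on synthetic data. The sketch below plants three fan/idol blocks of different sizes (so their singular values are distinct), flips about 2% of the entries, shuffles rows and columns, and then assigns each fan to the singular vector on which it loads most. The block sizes, noise rate, seed, and assignment rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three planted communities, given as (number of fans, number of idols).
block_sizes = [(30, 12), (20, 10), (10, 8)]
n_fans = sum(f for f, _ in block_sizes)
n_idols = sum(d for _, d in block_sizes)
A = np.zeros((n_fans, n_idols))
true_labels = np.zeros(n_fans, dtype=int)
r = c = 0
for k, (f, d) in enumerate(block_sizes):
    A[r:r + f, c:c + d] = 1
    true_labels[r:r + f] = k
    r, c = r + f, c + d

# B) 'salt+pepper' noise: flip roughly 2% of the entries.
flips = rng.random(A.shape) < 0.02
A = np.where(flips, 1 - A, A)

# A) shuffle rows and columns so the blocks are no longer visible.
row_perm = rng.permutation(n_fans)
col_perm = rng.permutation(n_idols)
A_shuffled = A[np.ix_(row_perm, col_perm)]

# Rank-3 SVD: label each fan by the singular vector where it has the largest loading.
U, s, Vt = np.linalg.svd(A_shuffled, full_matrices=False)
fan_labels = np.argmax(np.abs(U[:, :3]), axis=1)
accuracy = float(np.mean(fan_labels == true_labels[row_perm]))
```

Because singular values are sorted in decreasing order, the first singular vector corresponds to the largest planted block, the second to the middle one, and the third to the smallest, which is why the recovered labels can be compared to the planted ones directly.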

Toy example – 5 blocks (From: HITS)
[Figure: ‘fans’ × ‘idols’ matrix with 5 blocks; EigenPlots with axes u0, u1 and v0, v1]

Inferring Strange Behavior from Connectivity Pattern in Social Networks
PAKDD’14
Meng Jiang, Peng Cui, Shiqiang Yang (Tsinghua)
Alex Beutel, Christos Faloutsos (CMU)

Real Data
• Spikes on the out-degree distribution

MLG’18 Workshop
08/20/2018 (Tomorrow 4:00 pm), ICC Capital Suite Room 8
GraphRAD: A Graph-based Risky Account Detection System
Jun Ma, Danqing Zhang, Yun Wang, Yan Zhang, Alexey Pozdnoukhov

• Input: Gigantic account link graph
• Community Detection
• Semi-supervised Suspicious Detection
• Output: small candidate graphs for manual check

Solution
Given: [figure]
Find:
1) Outliers: OddBall
2) Lock-step: SVD

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation
[Side labels: un-supervised ? semi-supervised]