CMU SCS
Graph and Tensor Mining for fun and profit
Luna Dong, Christos Faloutsos, Andrey Kan, Jun Ma, Subho Mukherjee

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
[break]
• Part#2: Tensors
• Conclusions
KDD 2018 Dong+

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection ?
    • Outliers
    • Lock-step behavior
  – P1.5: belief propagation

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation
[side labels on the slide: unsupervised, ?, semi-supervised]

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection (unsupervised?)
    • P1.4.1. Outliers
    • P1.4.2. Lock-step behavior
  – P1.5: belief propagation

‘Recipe’ Structure:
• Problem definition
• Short answer/solution
• LONG answer: details
• Conclusion/short answer

Problem
• Given: [figure]
• Find:
  1) Outliers
  2) Lock-step

Solution
• Given: [figure]
• Find:
  1) Outliers: OddBall
  2) Lock-step: SVD

P1.4.1. Outliers
• Which node(s) are strange?
  – Q: How to start?

P1.4.1. Outliers
• Which node(s) are strange?
  – Q: How to start?
  – A1: egonet; and extract node features

Ego-net Patterns: Which is strange?
[figure]
Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010

Ego-net Patterns: Which is strange?
• Near-star: telemarketer, port scanner, people adding friends indiscriminately, etc.
• Near-clique: tightly connected people, terrorist groups?, discussion groups, etc.
Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010

P1.4.1. Outliers
• Which node(s) are strange?
  – Q: How to start?
  – A: egonet; and extract node features
  – Q’: which features?
  – A’: ART! Infinite! Pick a few, e.g.:

Ego-net Patterns
• $N_i$: number of neighbors (degree) of ego $i$
• $E_i$: number of edges in egonet $i$
• $W_i$: total weight of egonet $i$
• $\lambda_{w,i}$: principal eigenvalue of the weighted adjacency matrix of egonet $i$
Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010

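
The four egonet features listed above can be computed directly from an adjacency structure. The following is a minimal sketch, assuming an undirected weighted graph stored as a dict of dicts. The names `add_edge` and `egonet_features` and the toy graph are illustrative assumptions, not taken from the slides or from the OddBall paper. The principal eigenvalue is approximated with a shifted power iteration rather than a library eigensolver.

```python
import math

def add_edge(graph, u, v, weight=1.0):
    """Insert an undirected weighted edge into a dict-of-dicts graph."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight

def egonet_features(graph, ego, iters=200):
    """Return (N_i, E_i, W_i, lambda_w_i) for the egonet of `ego`.

    The egonet is the ego, its neighbors, and every edge among them.
    """
    nodes = [ego] + sorted(graph[ego])
    index = {name: k for k, name in enumerate(nodes)}
    size = len(nodes)

    # Weighted adjacency matrix restricted to the egonet.
    A = [[0.0] * size for _ in range(size)]
    for u in nodes:
        for v, w in graph[u].items():
            if v in index:
                A[index[u]][index[v]] = w

    n_i = len(graph[ego])
    upper = [(r, c) for r in range(size) for c in range(r + 1, size)]
    e_i = sum(1 for r, c in upper if A[r][c] != 0)
    w_i = sum(A[r][c] for r, c in upper)

    # Power iteration on (A + I); the shift avoids the oscillation that
    # plain power iteration shows on bipartite egonets such as stars.
    x = [1.0 / math.sqrt(size)] * size
    for _ in range(iters):
        y = [x[r] + sum(A[r][c] * x[c] for c in range(size)) for r in range(size)]
        norm = math.sqrt(sum(t * t for t in y))
        x = [t / norm for t in y]
    # Rayleigh quotient of the unit vector x gives the eigenvalue of A.
    lam = sum(x[r] * A[r][c] * x[c] for r in range(size) for c in range(size))
    return n_i, e_i, w_i, lam

# Toy graph (made up): one star centered on "s" and one 4-clique.
graph = {}
for leaf in ("l1", "l2", "l3", "l4"):
    add_edge(graph, "s", leaf)
clique_nodes = ["k0", "k1", "k2", "k3"]
for a in range(4):
    for b in range(a + 1, 4):
        add_edge(graph, clique_nodes[a], clique_nodes[b])

star = egonet_features(graph, "s")
clique = egonet_features(graph, "k0")
print("star  :", star)
print("clique:", clique)
```

For the star, the edge count equals the neighbor count and the eigenvalue is close to 2 (a star with k leaves has principal eigenvalue √k). For the clique, the edge count grows quadratically with the neighbor count and the eigenvalue is close to 3.
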
Pattern: Ego-net Power Law Density
$E_i \propto N_i^{\alpha}$, $1 \le \alpha \le 2$
[plot annotation: Enron CEO]
Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010

Pattern: Ego-net Power Law Density
[figure]
Oddball: Spotting anomalies in weighted graphs, Leman Akoglu, Mary McGlohon, Christos Faloutsos, PAKDD 2010

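
One way to use the power-law pattern for outlier spotting is to fit the line in log-log space and rank nodes by how far they fall from it. The sketch below assumes ordinary least squares on (log N, log E) and scores each node by its absolute log-residual. This scoring is a simplification chosen for illustration and is not claimed to be the score used in the OddBall paper. The function names and the (N, E) values are made up.

```python
import math

def fit_power_law(points):
    """Least-squares fit of log E = log C + alpha * log N.

    `points` is a list of (N, E) pairs with N, E > 0.
    Returns (alpha, log_c).
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(e) for _, e in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    log_c = mean_y - alpha * mean_x
    return alpha, log_c

def deviation_scores(features):
    """Score each node by its distance from the fitted line (log scale)."""
    alpha, log_c = fit_power_law(list(features.values()))
    scores = {}
    for name, (n, e) in features.items():
        predicted = log_c + alpha * math.log(n)
        scores[name] = abs(math.log(e) - predicted)
    return alpha, scores

# Hypothetical egonet features: name -> (N_i, E_i).
# Most nodes follow roughly E = N^1.5; "near_star" has E = N.
features = {
    "a": (2, 3),
    "b": (4, 8),
    "c": (8, 23),
    "d": (16, 64),
    "e": (32, 181),
    "near_star": (30, 30),
}

alpha, scores = deviation_scores(features)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print("fitted alpha:", round(alpha, 3))
for name, score in ranked:
    print(name, round(score, 3))
```

Because the outlier is included in the fit, it pulls the line toward itself. A robust fit (for example, fitting on medians per degree bucket) would be a reasonable refinement.
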
Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection ?
    • Outliers
    • Lock-step behavior
  – P1.5: belief propagation

Problem
• Given: [figure]
• Find:
  1) Outliers
  2) Lock-step

P1.4.2. How to find ‘suspicious’ groups?
• ‘blocks’ are normal, right?
[figure: matrix with axes ‘fans’ and ‘idols’]

Except that:
• ‘blocks’ are normal, right?
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]

Except that:
• ‘blocks’ are usually suspicious
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]
Q: Can we spot blocks, easily?

Except that:
• ‘blocks’ are usually suspicious
• ‘hyperbolic’ communities are more realistic [Araujo+, PKDD’14]
Q: Can we spot blocks, easily?
A: Silver bullet: SVD!

Why [From: SALSA]
• HITS fixates on dense blocks (‘Tightly Knit Community’, TKC; often link farms)
[figure annotation: Should win, but doesn’t under HITS]

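
The tightly-knit-community effect can be reproduced on a tiny graph. The sketch below is a plain HITS power iteration written for illustration. The function name `hits`, the node names, and the edge list are assumptions and do not come from the slides. The toy graph has a small dense block (3 hubs all pointing to 3 authorities) and one page `p` that has more in-links (5) but from hubs that link nowhere else.

```python
import math

def hits(edges, iters=100):
    """Plain HITS iteration on a list of directed (source, target) edges.

    Returns (hub, auth) dicts with L2-normalized scores.
    """
    nodes = sorted({u for u, _ in edges} | {v for _, v in edges})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # Authority score: sum of hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_auth[v] += hub[u]
        norm = math.sqrt(sum(s * s for s in new_auth.values()))
        auth = {n: s / norm for n, s in new_auth.items()}
        # Hub score: sum of authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += auth[v]
        norm = math.sqrt(sum(s * s for s in new_hub.values()))
        hub = {n: s / norm for n, s in new_hub.items()}
    return hub, auth

# Dense block: every hub h0..h2 links to every authority a0..a2.
block = [(f"h{i}", f"a{j}") for i in range(3) for j in range(3)]
# Popular page "p": five independent in-links, more than any block page.
popular = [(f"q{i}", "p") for i in range(5)]
edges = block + popular

in_degree = {}
for _, target in edges:
    in_degree[target] = in_degree.get(target, 0) + 1

hub, auth = hits(edges)
print("in-degree of p:", in_degree["p"], " authority:", auth["p"])
print("in-degree of a0:", in_degree["a0"], " authority:", auth["a0"])
```

Although `p` has the highest in-degree, its authority score decays toward zero, because the block has the larger singular value (3 versus √5) and the iteration converges to the block's singular vectors.
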
Crush intro to SVD [From: HITS]
• Recall: (SVD) matrix factorization: finds blocks
[figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]

Crush intro to SVD
• (SVD) matrix factorization: finds blocks
  A) Even if shuffled!
[figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]

Crush intro to SVD
• (SVD) matrix factorization: finds blocks
  B) Even if ‘salt+pepper’ noise
[figure: N fans × M idols matrix ~ ‘music lovers’ × ‘singers’ + ‘sports lovers’ × ‘athletes’ + ‘citizens’ × ‘politicians’]

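
The claim that SVD finds a block even when rows and columns are shuffled and a little noise is present can be checked on a small fans × idols matrix. The sketch below uses a hand-written power iteration for the top singular triplet instead of a library SVD, so it runs with the standard library only. The function name, the 7 × 6 matrix, the noise entries, and the 50%-of-maximum threshold are all illustrative assumptions.

```python
import math

def top_singular_triplet(A, iters=100):
    """Approximate the leading singular triplet (u, sigma, v) of A
    by alternating multiplications with A and its transpose."""
    n_rows, n_cols = len(A), len(A[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    u, sigma = [0.0] * n_rows, 0.0
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(A[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return u, sigma, v

def heavy_entries(vec, fraction=0.5):
    """Indices whose magnitude exceeds `fraction` of the largest entry."""
    cutoff = fraction * max(abs(x) for x in vec)
    return [k for k, x in enumerate(vec) if abs(x) > cutoff]

# 7 fans (rows) x 6 idols (columns). The block is deliberately
# non-contiguous, which is what a shuffled matrix looks like.
A = [[0] * 6 for _ in range(7)]
for fan in (1, 4, 6):
    for idol in (0, 3, 5):
        A[fan][idol] = 1
# 'Salt' noise: a few stray follows outside the block.
for fan, idol in [(1, 2), (0, 3), (5, 4)]:
    A[fan][idol] = 1

u, sigma, v = top_singular_triplet(A)
block_fans = heavy_entries(u)
block_idols = heavy_entries(v)
print("sigma:", round(sigma, 3))
print("fans in block :", block_fans)
print("idols in block:", block_idols)
```

The heavy entries of the first left and right singular vectors pick out the fans and idols of the dense block, while the fans and idols touched only by noise receive small or near-zero weights.
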
Toy example – 5 blocks [From: HITS]
[figure: ‘fans’ × ‘idols’ matrix and EigenPlots; axis labels u0, u1, v0, v1]

Toy example – 5 blocks [From: HITS]
[figure: ‘fans’ × ‘idols’ matrix; axis labels u0, u1, v0, v1]

Inferring Strange Behavior from Connectivity Pattern in Social Networks
PAKDD’14
Meng Jiang, Peng Cui, Shiqiang Yang (Tsinghua); Alex Beutel, Christos Faloutsos (CMU)

Real Data
• Spikes on the out-degree distribution

MLG’18 Workshop, 08/20/2018 (Tomorrow 4:00 pm), ICC Capital Suite Room 8
GraphRAD: A Graph-based Risky Account Detection System
Jun Ma, Danqing Zhang, Yun Wang, Yan Zhang, Alexey Pozdnoukhov

• Input: gigantic account link graph
• Community detection
• Semi-supervised suspicious detection
• Output: small candidate graphs for manual check

Solution
• Given: [figure]
• Find:
  1) Outliers: OddBall
  2) Lock-step: SVD

Roadmap
• Introduction
  – Motivation
• Part#1: Graphs
  – …
  – P1.3: community detection
  – P1.4: fraud/anomaly detection
  – P1.5: belief propagation
[side labels on the slide: unsupervised, ?, semi-supervised]