Paper ID 1948: StructInf: Mining Structural Influence from Social Streams


Paper ID: 1948
StructInf: Mining Structural Influence from Social Streams
Jing Zhang, Jie Tang, Yuanyi Zhong, Yuchen Mo, Juanzi Li, Guojie Song, Wendy Hall, and Jimeng Sun
Department of Computer Science, Tsinghua University

Question: In which structures are the target nodes most likely to be activated?
[Figure: example structures; legend: active neighbor, target node]

Problem Formulation
Input: a network and streaming actions.
[Figure: network over nodes v0, v1, v2, v3, v5; streaming actions (v1, a1, t1), (v0, a1, t2), (v3, a1, t2), (v2, a1, t3), (v3, a2, t3), (v2, a2, t4), (v5, a2, t5), …; action diffusion graph with active actions and inactive actions]
Output: Structural influence: the influence probability of each structure Ck that can be found in the action diffusion graph.

Structural Influence Measurement
Basic method
  • Maintain a queue and a map to record the diffusion edges within the recent time interval.
  • To calculate xk, the active actions are the newly arrived actions.
  • To calculate yk, the inactive actions are the actions that have become outdated.
  • Enumerate structures by extending the neighboring actions of active or inactive actions.
  • To avoid duplicate enumeration, assign each action an incremental (unique) label when it arrives, and require the labels of the selected actions to be smaller than those of the candidate actions.
Sampling-based methods
  • Sampling 1: Randomly sample nodes when enumerating influence patterns.
  • Sampling 2: Randomly retain edges when building the diffusion graph.
  • Sampling 3: Combine Sampling 1 and Sampling 2.
All three are unbiased sampling methods.

Results
  • Trade-off between error and time when varying the sampling probabilities: Sampling 3 is the least sensitive to the parameters.
  • Convergence of relative error [Figure: panels C1 and C4, showing the approximate influence probability and the relative error of the approximate values]
  • Retweet prediction, feature groups: Basic (#friends, gender, status, etc.); C1 (the number of active neighbors); Weak, Moderate, Strong [shown in figure].
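The queue-and-map bookkeeping listed under the basic method might look roughly like the sketch below. It is not the authors' implementation. It assumes that a triple (v, a, t) means user v performed action a at time t, that timestamps arrive in non-decreasing order, and that a diffusion edge links an earlier action to a later action with the same action id whenever the later user follows the earlier one. The class `DiffusionWindow`, the `followers` mapping, the toy network, and the window length are all hypothetical.

```python
from collections import deque, defaultdict


class DiffusionWindow:
    """Keeps the actions of the last `window` time units and the diffusion edges among them."""

    def __init__(self, followers, window):
        self.followers = followers        # followers[u] = users who follow u (assumed input)
        self.window = window
        self.queue = deque()              # (label, user, action, time) in arrival order
        self.recent = defaultdict(dict)   # action id -> {user: label} for actions in the window
        self.edges = defaultdict(set)     # label of earlier action -> labels it may have influenced
        self.next_label = 0               # incremental (unique) label per arriving action

    def add(self, user, action, time):
        """Insert a newly arrived action; return its label and the labels that just expired."""
        expired = self._expire(time)
        label = self.next_label
        self.next_label += 1
        # Link from every in-window action with the same id whose user is followed by `user`.
        for src_user, src_label in self.recent[action].items():
            if user in self.followers.get(src_user, ()):
                self.edges[src_label].add(label)
        self.recent[action][user] = label
        self.queue.append((label, user, action, time))
        return label, expired

    def _expire(self, now):
        """Pop outdated actions from the front of the queue and drop their outgoing edges."""
        expired = []
        while self.queue and now - self.queue[0][3] > self.window:
            label, user, action, _ = self.queue.popleft()
            if self.recent[action].get(user) == label:
                del self.recent[action][user]
            self.edges.pop(label, None)
            expired.append(label)
        return expired


# Toy follower network (an assumption; the poster's network figure is not recoverable).
followers = {"v1": {"v0", "v3"}, "v0": {"v2"}, "v3": {"v2", "v5"}, "v2": {"v5"}}
# The action triples from the poster, with t1..t5 encoded as the integers 1..5.
stream = [("v1", "a1", 1), ("v0", "a1", 2), ("v3", "a1", 2), ("v2", "a1", 3),
          ("v3", "a2", 3), ("v2", "a2", 4), ("v5", "a2", 5)]

win = DiffusionWindow(followers, window=2)
for user, action, time in stream:
    win.add(user, action, time)
```

Because edges always point from an older label to a newer one, dropping the outgoing edges of an expired action is enough to keep the edge map consistent with the queue. In this sketch the labels returned by `add` would play the role of the newly arrived (active) and outdated (inactive) actions in the bullets above.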
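The duplicate-avoidance rule in the basic method (incremental labels, with selected actions carrying smaller labels than the candidates) resembles the standard ESU scheme for enumerating connected subgraphs. The sketch below uses ESU as a stand-in, so it illustrates the idea rather than the poster's exact procedure. It treats the diffusion graph as undirected, and the function name `enumerate_connected` and the toy adjacency map are hypothetical.

```python
def enumerate_connected(adj, k):
    """Yield every connected k-node subgraph of `adj` exactly once (ESU-style).

    `adj` maps an integer label to the set of neighbouring labels (undirected).
    """

    def extend(sub, ext, root):
        if len(sub) == k:
            yield frozenset(sub)
            return
        ext = set(ext)  # local copy: candidates removed here stay removed for later siblings
        while ext:
            w = ext.pop()
            # Nodes already selected or already adjacent to the selection.
            closed = sub | {n for s in sub for n in adj[s]}
            # New candidates: neighbours of w with a label larger than the root's
            # that were not reachable from the current selection before adding w.
            new_ext = ext | {u for u in adj[w] if u > root and u not in closed}
            yield from extend(sub | {w}, new_ext, root)

    for v in adj:
        # The root is the smallest label of any subgraph grown from it.
        yield from extend({v}, {u for u in adj[v] if u > v}, v)


# Toy graph over action labels 0..4 (an assumption for illustration).
adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2, 4}, 4: {3}}
found = list(enumerate_connected(adj, 3))
```

Each subgraph is reached along a single path of the recursion, which is what makes per-step node sampling (in the spirit of Sampling 1) straightforward to rescale; whether the poster rescales in this way is not stated.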
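To see why retaining edges at random (Sampling 2) can stay unbiased, consider one simple structure: a target action with two distinct parent actions. The sketch below assumes each diffusion edge is kept independently with probability p and that the sampled count is rescaled by 1 / p², since a two-edge structure survives with probability p². The rescaling rule, the function names, and the toy edge list are assumptions, not details taken from the poster.

```python
import random
from itertools import combinations
from math import comb


def count_two_parent_structures(edges):
    """Count structures in which one target action has two distinct parent actions."""
    indegree = {}
    for _, dst in edges:
        indegree[dst] = indegree.get(dst, 0) + 1
    return sum(comb(d, 2) for d in indegree.values())


def sampled_estimate(edges, p, rng):
    """Keep each edge with probability p, count, then rescale by 1 / p**2."""
    kept = [e for e in edges if rng.random() < p]
    return count_two_parent_structures(kept) / p ** 2


def exact_expectation(edges, p):
    """Average the estimator over every possible kept-edge subset, weighted by its probability."""
    total = 0.0
    for r in range(len(edges) + 1):
        for kept in combinations(edges, r):
            prob = p ** r * (1 - p) ** (len(edges) - r)
            total += prob * count_two_parent_structures(kept) / p ** 2
    return total


# Toy diffusion edges between action labels (distinct edges; an assumption for illustration).
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)]
true_count = count_two_parent_structures(edges)
expected_value = exact_expectation(edges, 0.5)
estimate = sampled_estimate(edges, 0.5, random.Random(0))
```

Here `expected_value` matches `true_count` up to floating-point error, which is the unbiasedness property, while a single `estimate` can deviate from it; the deviation is the error that the poster trades off against running time.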