Fast Random Walk with Restart and Its Applications
- Slides: 46
Hanghang Tong, Christos Faloutsos and Jia-Yu (Tim) Pan. ICDM 2006, Dec. 18-22, Hong Kong
Motivating Questions • Q: How to measure the relevance? • A: Random walk with restart • Q: How to do it efficiently? • A: This talk tries to answer!
Random walk with restart [Figure: example graph with 12 nodes]
Random walk with restart [Figure: the example graph colored by relevance to the query node, Node 4] • Nearby nodes, higher scores • More red, more relevant • Ranking vector over Node 1 ... Node 12 (scores as listed): 0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02
Automatic Image Caption • Q: Given images captioned {Sea, Sun, Sky, Wave} and {Cat, Forest, Grass, Tiger}, which keywords {?, ?, ?} caption a test image? • A: RWR! [Pan KDD 2004]
[Figure: three-layer graph linking region nodes, image nodes (including the test image), and keyword nodes (Sea, Sun, Sky, Wave, Cat, Forest, Tiger, Grass)]
[Figure: the same graph; the test image is captioned {Grass, Forest, Cat, Tiger}]
Neighborhood Formulation • Q: What is the most related conference to ICDM? • A: RWR! [Sun ICDM 2005] [Figure: bipartite conference-author graph]
NF: example
Center-Piece Subgraph (CePS) • Q: Original graph with query nodes (black): which CePS? • A: RWR! [Tong KDD 2006]
CePS: Example
Other Applications • Content-based Image Retrieval [He] • Personalized PageRank [Jeh], [Widom], [Haveliwala] • Anomaly Detection (for node; link) [Sun] • Link Prediction [Getoor], [Jensen] • Semi-supervised Learning [Zhu], [Zhou] • …
Roadmap • Background – RWR: Definitions – RWR: Algorithms • Basic Idea • FastRWR – Pre-Compute Stage – On-Line Stage • Experimental Results • Conclusion
Computing RWR [Equation figure relating the ranking vector (n x 1), the adjacency matrix (n x n), the restart probability, and the starting vector (n x 1), shown next to the 12-node example graph]
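The labels on this slide match the usual fixed-point form of RWR. The following is a sketch of that form, not a transcription of the slide. It assumes that $\tilde{W}$ is the normalized adjacency matrix, $\vec{e}_i$ is the starting (indicator) vector for query node $i$, and $1-c$ is the restart probability; the symbol names and the $c$ versus $1-c$ convention are assumptions.

```latex
\vec{r}_i = c\,\tilde{W}\,\vec{r}_i + (1-c)\,\vec{e}_i
\quad\Longrightarrow\quad
\vec{r}_i = (1-c)\,(I - c\,\tilde{W})^{-1}\,\vec{e}_i = (1-c)\,Q^{-1}\,\vec{e}_i,
\qquad Q = I - c\,\tilde{W}.
```

Under this reading, the ranking vector for query $i$ is, up to the factor $1-c$, the $i$-th column of $Q^{-1}$, which is why later slides talk about computing, storing, and recovering columns of $Q^{-1}$.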
Beyond RWR [Diagram relating RWR [Pan, Sun] to: Maxwell Equation for Web [Chakrabarti], SM Learning [Zhou, Zhu], RL in CBIR [He], P-PageRank [Haveliwala], PageRank [Haveliwala]] • Fast RWR finds the root solution!
• Q: Given query i, how to solve it?
OnTheFly: [Figure: example graph annotated with RWR scores] • No pre-computation / light storage • Slow on-line response: O(mE)
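The OnTheFly strategy can be read as plain power iteration on the RWR fixed-point equation r = c W r + (1 - c) e, repeated until the scores stop changing. The sketch below is a minimal pure-Python illustration under stated assumptions: column normalization of the adjacency matrix, c = 0.9, and the function names are all hypothetical choices, not taken from the slides. It uses dense lists for brevity, so each iteration costs O(n^2) here; a sparse edge-list version would give the O(mE) cost quoted on the slide (m iterations over E edges).

```python
def normalize_columns(adj):
    """Column-normalize a dense adjacency matrix (list of lists)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]


def rwr_on_the_fly(w, query, c=0.9, max_iters=1000, tol=1e-10):
    """Iterate r <- c * W r + (1 - c) * e until the L1 change is below tol."""
    n = len(w)
    e = [1.0 if k == query else 0.0 for k in range(n)]  # starting vector
    r = e[:]
    for _ in range(max_iters):
        new_r = [c * sum(w[i][j] * r[j] for j in range(n)) + (1 - c) * e[i]
                 for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new_r, r))
        r = new_r
        if delta < tol:
            break
    return r
```

Nothing is stored between queries, which is the "light storage" side of the trade-off; every new query pays for the full iteration again.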
PreCompute: [Figure: example graph annotated with RWR scores, next to the pre-computed ranking matrix R] [Haveliwala]
PreCompute: • Fast on-line response • Heavy pre-computation/storage cost: O(n^3) pre-computation, O(n^2) storage
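The PreCompute strategy amounts to inverting the full n x n system once and then answering each query by reading off one column. The NumPy sketch below illustrates this, assuming the convention r = (1 - c)(I - cW)^{-1} e with a column-normalized W; the function names and c = 0.9 are hypothetical. The dense inverse is where the O(n^3) time and O(n^2) storage come from.

```python
import numpy as np


def precompute_all(w, c=0.9):
    """Off-line: one big inversion, O(n^3) time and O(n^2) storage."""
    n = w.shape[0]
    return (1 - c) * np.linalg.inv(np.eye(n) - c * w)


def lookup(r_all, query):
    """On-line: the ranking vector is just one stored column."""
    return r_all[:, query]
```

The on-line step is a column read, which is as fast as a query can get; the cost has simply moved to the off-line stage.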
Q: How to balance off-line and on-line cost?
Basic Idea [Figure: (1) find communities in the example graph, (2) fix the remaining links, (3) combine to obtain the ranking scores]
Pre-computational stage • Q: Efficiently compute and store Q^{-1} • A: A few small, instead of ONE BIG, matrix inversions
On-Line Query Stage • Q: Efficiently recover one column of Q^{-1} • A: A few, instead of MANY, matrix-vector multiplications
Pre-compute Stage • p1: B_Lin Decomposition – P1.1 partition – P1.2 low-rank approximation • p2: Q matrices – P2.1 computing (for each partition) – P2.2 computing (for concept space)
P1.1: partition [Figure: example graph split into partitions, distinguishing within-partition links from cross-partition links]
P1.1: block-diagonal [Figure: the within-partition links of the example graph form a block-diagonal matrix]
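Step P1.1 can be pictured as splitting the normalized matrix into a within-partition part, which is block-diagonal once nodes are ordered by partition, and a cross-partition remainder. The NumPy sketch below assumes the partition labels are already given by some graph partitioner; the names `split_by_partition`, `w1`, and `w2` are hypothetical.

```python
import numpy as np


def split_by_partition(w, labels):
    """Split W into within-partition links (w1) and cross-partition links (w2)."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]  # True where both ends share a partition
    w1 = np.where(same, w, 0.0)                # block-diagonal after reordering
    w2 = w - w1                                # the remaining cross-partition links
    return w1, w2
```

By construction `w1 + w2` reproduces the input exactly, so no information is lost at this step; the approximation only enters when the cross-partition part is compressed.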
P1.2: LRA for W2 [Figure: the cross-partition links of the example graph approximated (~) by a low-rank product, with |S| << |W2|]
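One way to obtain a low-rank approximation of the cross-partition matrix is a truncated SVD, sketched below with NumPy. This is a stand-in chosen for illustration: the slide text does not say which factorization is used, and the names `low_rank_approx`, `u`, `s`, `v`, and the rank parameter `t` are hypothetical.

```python
import numpy as np


def low_rank_approx(w2, t):
    """Return (U, S, V) with W2 ~ U @ S @ V, keeping the top-t singular triples."""
    u, sing, vt = np.linalg.svd(w2, full_matrices=False)
    return u[:, :t], np.diag(sing[:t]), vt[:t, :]
```

With `t` much smaller than n, the t x t core `S` is tiny compared with `w2`, which is the point of the |S| << |W2| remark on the slide.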
p2.1 Computing (for each partition)
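Because the within-partition matrix is block-diagonal, its contribution to the inverse can be computed one partition at a time. The NumPy sketch below inverts each diagonal block of I - c * W1 separately; the names `blockwise_inverse` and `q1_inv` are hypothetical, and the result is assembled into a dense array only for readability (in practice only the small blocks would be kept).

```python
import numpy as np


def blockwise_inverse(w1, labels, c=0.9):
    """Invert I - c*W1 by inverting each partition's diagonal block on its own."""
    labels = np.asarray(labels)
    n = w1.shape[0]
    q1_inv = np.zeros((n, n))
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)              # nodes in this partition
        block = np.eye(len(idx)) - c * w1[np.ix_(idx, idx)]
        q1_inv[np.ix_(idx, idx)] = np.linalg.inv(block)  # one small inversion
    return q1_inv
```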
Comparing Q_1^{-1} and Q^{-1} • Computing Time – 100,000 nodes; 100 partitions – Computing Q_1^{-1} is 10,000x faster! • Storage Cost – 100x saving! [Figure: Q_1^{-1} = diag(Q_{1,1}^{-1}, Q_{1,2}^{-1}, ..., Q_{1,k}^{-1})]
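The time and storage figures on this slide are consistent with a simple cost model. As a back-of-the-envelope check, assume $k$ equal-size partitions, dense inversion with cubic cost, and dense storage with quadratic cost (all three are simplifying assumptions, not statements from the slides):

```latex
\frac{\text{one big inversion}}{\text{$k$ small inversions}}
  = \frac{n^3}{k\,(n/k)^3} = k^2 = 100^2 = 10{,}000,
\qquad
\frac{\text{full storage}}{\text{block storage}}
  = \frac{n^2}{k\,(n/k)^2} = k = 100 .
```

With $n = 100{,}000$ and $k = 100$, each block has 1,000 nodes, so the block-wise inverse is roughly four orders of magnitude cheaper to compute and two orders cheaper to store than the full inverse.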
[Figure: the matrix approximated (~) as a block-diagonal part plus the remaining (green) portions] • Q: How to fix the green portions?
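p2.2 Computing (for concept space) [Equation figure: a small matrix inverse built from V, the block-diagonal diag(Q_{1,1}, Q_{1,2}, ..., Q_{1,k}) term, and U, shown next to the example graph]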
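One reading of this step, consistent with applying the Sherman-Morrison (Woodbury) identity to I - c*W1 - c*U S V, is that the concept-space matrix is the t x t inverse (S^{-1} - c V Q1^{-1} U)^{-1}. The NumPy sketch below computes it under that assumption; the symbol name `lam` and the function name are hypothetical, since the slide's own symbol is not preserved in the text.

```python
import numpy as np


def concept_space_matrix(q1_inv, u, s, v, c=0.9):
    """Compute (S^{-1} - c * V @ Q1^{-1} @ U)^{-1}, a small t x t matrix."""
    core = np.linalg.inv(s) - c * (v @ q1_inv @ u)  # t x t, cheap to invert
    return np.linalg.inv(core)
```

Only a t x t system is inverted here, so the cost depends on the rank kept in the low-rank approximation rather than on the number of nodes.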
We have: • Communities • Bridges • SM (Sherman-Morrison) Lemma says: [Equation figure]
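For reference, here is a sketch of how the Sherman-Morrison (Woodbury) identity would combine the two pieces. It assumes the decomposition $\tilde{W} \approx W_1 + U S V$ with $Q_1 = I - c\,W_1$; the symbol $\Lambda$ is a label introduced here for the small inner inverse and is an assumption about notation.

```latex
\bigl(I - c\,W_1 - c\,U S V\bigr)^{-1}
  = Q_1^{-1} + c\,Q_1^{-1}\,U\,\Lambda\,V\,Q_1^{-1},
\qquad
\Lambda = \bigl(S^{-1} - c\,V\,Q_1^{-1}\,U\bigr)^{-1}.
```

The communities enter through the block-diagonal $Q_1^{-1}$, the bridges through the low-rank factors $U$, $S$, $V$, and the only new inversion needed is the small matrix $\Lambda$.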
On-Line Stage • Q: Given the query and the pre-computation, how to get the result? • A: SM lemma
On-Line Query Stage: steps q1 to q6 [Equation figure]
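The on-line stage can be sketched as a short chain of matrix-vector products that applies the Sherman-Morrison form to the query's indicator vector. The ordering below is one plausible reading and is not claimed to match the slide's steps q1 to q6 one for one. The inputs `q1_inv`, `u`, `lam`, and `v` are assumed to come from the pre-compute stage (block-wise inverse of I - c*W1, low-rank factors of the cross-partition links, and the small inner inverse); all names are hypothetical.

```python
import numpy as np


def fastrwr_query(q1_inv, u, lam, v, query, c=0.9):
    """Approximate r = (1 - c) * (I - c*W)^{-1} e using pre-computed pieces."""
    e = np.zeros(q1_inv.shape[0])
    e[query] = 1.0                  # starting vector for the query node
    r0 = q1_inv @ e                 # within-community scores
    t = v @ r0                      # project into the low-rank concept space
    t = lam @ t                     # apply the small t x t matrix
    t = u @ t                       # map back to node space
    t = q1_inv @ t                  # propagate within communities again
    return (1 - c) * (r0 + c * t)   # combine both contributions
```

Each query costs a handful of matrix-vector multiplications. When the low-rank factors reproduce the cross-partition links exactly, the result coincides with the exact RWR scores; otherwise it is an approximation whose quality depends on the rank kept.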
Experimental Setup • Dataset – DBLP/authorship – Author-Paper – 315k nodes – 1,800k edges • Approx. Quality: Relative Accuracy • Application: Center-Piece Subgraph
Query Time vs. Pre-Compute Time [Plot: log query time vs. log pre-compute time] • Quality: 90%+ • On-line: up to 150x speedup • Pre-computation: two orders of magnitude saving
Query Time vs. Pre-Storage [Plot: log query time vs. log storage] • Quality: 90%+ • On-line: up to 150x speedup • Pre-storage: three orders of magnitude saving
Conclusion • FastRWR – Reasonable quality preservation (90%+) – 150x speed-up: query time – Orders of magnitude saving: pre-compute & storage • More in the paper – The variant of FastRWR and theoretical justification – Implementation details • normalization, low-rank approximation, sparse – More experiments • Other datasets, other applications
Q&A Thank you! htong@cs.cmu.edu www.cs.cmu.edu/~htong