On Computing Top-t Most Influential Spatial Sites
Tian Xia, Donghui Zhang, Evangelos Kanoulas, Yang Du
Northeastern University, Boston, USA
9/2/2005, VLDB 2005, Trondheim, Norway
Outline
- Problem Definition
- Related Work
- The New Metric: minExistDNN
- Data Structures and Algorithm
- Experimental Results
- Conclusions
Problem Definition
- Given:
  - a set of sites S
  - a set of weighted objects O
  - a spatial region Q
  - an integer t
- Top-t most influential sites query: find t sites in Q with the largest influences.
- Influence of a site s = total weight of objects that consider s as the nearest site.
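The definition above can be read as a brute-force procedure. The following is a minimal sketch of that reading only, not the indexed algorithm presented later in the talk; the function name, the dictionary layout, and the coordinates are assumptions made for illustration.

```python
import math
from collections import defaultdict

def top_t_influential(sites, objects, in_region, t):
    """Brute-force top-t query (illustrative only).

    sites:     {name: (x, y)}            -- hypothetical layout
    objects:   list of ((x, y), weight)
    in_region: predicate telling whether a location lies in Q
    """
    influence = defaultdict(float)
    for pos, weight in objects:
        # An object counts toward its nearest site among ALL sites,
        # not only those inside Q.
        nearest = min(sites, key=lambda s: math.dist(pos, sites[s]))
        influence[nearest] += weight
    # Only sites inside Q may be reported.
    candidates = [s for s in sites if in_region(sites[s])]
    candidates.sort(key=lambda s: influence[s], reverse=True)
    return [(s, influence[s]) for s in candidates[:t]]

# Hypothetical data: two sites, three unit-weight objects, Q = whole space.
sites = {"s1": (0.0, 0.0), "s2": (10.0, 0.0)}
objects = [((1.0, 0.0), 1.0), ((2.0, 1.0), 1.0), ((9.0, 0.0), 1.0)]
result = top_t_influential(sites, objects, lambda p: True, 1)
```

Every object is compared with every site here, which is the cost the index-based approach in the remaining slides aims to avoid.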
Motivation
- Which supermarket in Boston is the most influential among residential buildings?
  - Sites: supermarkets
  - Objects: residential buildings
  - Weight: # people in a building
  - Query region: Boston
- Which wireless station in Boston is the most influential among mobile users?
Example
[Figure: sites s1 to s4 and objects o1 to o6 in the plane]
- Suppose all objects have weight = 1, Q is the whole space, and t = 1.
- The most influential site is s1, with influence = 3.
Example
[Figure: sites s1 to s4 and objects o1 to o6, with Q drawn as a shaded rectangle]
- Now Q is the shaded rectangle and t = 2.
- Top-2 most influential sites: s4 and s2.
Related Work
- Bi-chromatic RNN query: considers two datasets, sites and objects.
- The RNNs of a site s ∈ S are the objects that consider s as the nearest site.
[Figure: sites s1 to s4 and objects o1 to o6]
Related Work
- Solutions to the RNN query based on precomputation [KM00, YL01].
[Figure: sites s1 to s4 and objects o1 to o6]
Related Work
- Solution to the RNN query based on the Voronoi diagram [SRAE01].
  - Compute the Voronoi cell of s: a region enclosing the locations closer to s than to any other site.
  - Query the object R-tree using the Voronoi cell.
Related Work [SRAE01]
[Figure: sites s1 to s4 and objects o1 to o6]
Our Problem vs. RNN Query
- RNN query:
  - A single site as an input.
  - Interested in the actual set of the RNNs.
- Top-t most influential sites query:
  - A spatial region as an input.
  - Interested in the aggregate weight of RNNs.
Straightforward Solution 1
- For each site, pre-compute its influence.
- At query time, find the sites in Q and return the t sites with max influences.
- Drawback 1: Costly maintenance upon updates.
- Drawback 2: Binds a set of sites closely to a set of objects.
Straightforward Solution 2
- An extension of the Voronoi diagram based solution to the RNN query.
  1. Find all sites in Q.
  2. For each such site, find its RNNs by using the Voronoi cell, and compute its influence.
  3. Return the t sites with max influences.
Straightforward Solution 2
- Drawback 1: All sites in Q need to be retrieved from the leaf nodes.
- Drawback 2: The object R-tree and the site R-tree are browsed multiple times.
  - For each site in Q, browse the site R-tree to compute the Voronoi cell.
  - For each such Voronoi cell, browse the object R-tree to compute the influence.
Features of Our Solution
- Systematically browse both trees once.
- Pruning techniques are provided based on a new metric, minExistDNN.
- No need to compute the influences for all sites in Q, or even to locate all sites in Q.
Motivation
- Intuitively, if some object in Oi may consider some site in Sj as an NN, Oi affects Sj.
- To estimate the influences of all sites in a site MBR Sj, we need to know whether an object MBR Oi will affect Sj.
[Figure: object MBRs O1, O2 and site MBRs S1, S2]
- O1 only affects S1, while O2 affects both S1 and S2.
maxDist: A Loose Estimation
- If maxDist(O1, S1) < minDist(O1, S2), O1 does not affect S2.
- Why not good enough?
[Figure: minDist(O1, S2) = 8, maxDist(O1, S1) = 10]
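For reference, the two MBR distances used in this test can be sketched as follows. This is an illustrative sketch assuming 2-D axis-aligned rectangles stored as (xlo, ylo, xhi, yhi); the function names and the sample rectangles are hypothetical.

```python
import math

def min_dist(a, b):
    """Smallest distance between any location in a and any location in b."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)  # horizontal gap, 0 if overlapping
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)  # vertical gap, 0 if overlapping
    return math.hypot(dx, dy)

def max_dist(a, b):
    """Largest distance between any location in a and any location in b."""
    dx = max(a[2] - b[0], b[2] - a[0])
    dy = max(a[3] - b[1], b[3] - a[1])
    return math.hypot(dx, dy)

def loosely_unaffected(o, s_near, s_far):
    """Loose test: every location in o is closer to all of s_near
    than to any location of s_far."""
    return max_dist(o, s_near) < min_dist(o, s_far)

# Hypothetical rectangles.
o = (0.0, 0.0, 1.0, 1.0)
s_near = (1.0, 0.0, 2.0, 1.0)
s_far = (4.0, 0.0, 5.0, 1.0)
unaffected = loosely_unaffected(o, s_near, s_far)
```

With the values in the figure above (maxDist = 10, minDist = 8) this test fails, so it cannot rule out that O1 affects S2.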
minMaxDist: A Tight Estimation?
[Figure: object o1 and site MBRs S1, S2, with minDist(o1, S2) = 6 and minMaxDist(o1, S1) = 5]
- An object o1 does not affect S2 if there exists S1 such that minMaxDist(o1, S1) < minDist(o1, S2).
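A minimal sketch of this point-based test is given below. It uses the standard minMaxDist formulation from the R-tree nearest-neighbor literature (for each dimension, take the nearer face in that dimension and the farther face in all others); the function names and the (lo, hi) MBR layout are assumptions for illustration.

```python
import math

def min_dist_point(p, lo, hi):
    """Distance from point p to the nearest location of the MBR [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(p, lo, hi)))

def min_max_dist(p, lo, hi):
    """Smallest distance within which the MBR [lo, hi] must contain a site,
    given that every face of an MBR touches at least one site."""
    n = len(p)
    mid = [(lo[i] + hi[i]) / 2 for i in range(n)]
    near = [lo[i] if p[i] <= mid[i] else hi[i] for i in range(n)]
    far = [lo[i] if p[i] >= mid[i] else hi[i] for i in range(n)]
    far_sq = [(p[i] - far[i]) ** 2 for i in range(n)]
    total = sum(far_sq)
    best = min(total - far_sq[k] + (p[k] - near[k]) ** 2 for k in range(n))
    return math.sqrt(max(best, 0.0))

def cannot_affect(o, s1, s2):
    """True if point object o cannot have its nearest site in MBR s2,
    because MBR s1 is guaranteed to hold a closer site."""
    return min_max_dist(o, *s1) < min_dist_point(o, *s2)

# Hypothetical example: o at the origin, S1 = [1,3] x [1,2].
d = min_max_dist((0.0, 0.0), (1.0, 1.0), (3.0, 2.0))
```

In two dimensions this value equals the distance from the point to the second-closest corner of the MBR.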
minMaxDist: A Tight Estimation?
[Figure: minDist(O1, S2) = 6, minMaxDist(O1, S1) = 5; labels s1, s2, o1 and distances 7 and 6]
- Not true for an object MBR O1.
A Tight Estimation?
- A metric m(O1, S1) should:
  1) guarantee that each location in O1 is within m(O1, S1) of a site in S1, and
  2) be the smallest distance with this property.
New Metric: minExistDNN_S1(O1)
- Definition: minExistDNN_S1(O1) = max { minMaxDist(l, S1) | location l ∈ O1 }
- O1 does not affect S2 if there exists S1 s.t. minExistDNN_S1(O1) < minDist(O1, S2).
Examples of minExistDNN_S1(O1)
[Figure: two examples of an object MBR O1 and a site MBR S1]
- How to calculate it?
Calculating minExistDNN_S1(O1)
- Step 1: Space partitioning
[Figure: S1 with corners a, b, c, d; the space is split into eight partitions labeled P1: b, P2: c, P3: a, P4: d, P5: c, P6: b, P7: d, P8: a]
- Every location l in the same partition is associated with the same second closest corner of S1; the distance to that corner is minMaxDist(l, S1).
Space Partitioning
- O1 is divided into multiple sub-regions, one in each partition.
[Figure: O1 overlapping partitions P1: b and P2: c of S1, whose corners are a, b, c, d]
Calculating minExistDNN_S1(O1)
- Step 2: Choose up to 8 locations on O1's border and compute their minMaxDist values to S1.
- minExistDNN is the largest one.
[Figure: O1 in partitions P1: b and P2: c of S1, with minExistDNN_S1(O1) marked]
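The two steps can be sketched in code as follows. This is an illustrative reading, not the authors' exact procedure: it assumes 2-D MBRs stored as (xlo, ylo, xhi, yhi), takes the partition boundaries to be the four bisector lines of S1's corner pairs (all passing through S1's center), and conservatively evaluates every corner of O1 and every crossing of those lines with O1's border, which may be more candidates than the "up to 8" selected on the slide. All function names are hypothetical.

```python
import math
from itertools import product

def min_max_dist(p, s):
    """2-D minMaxDist from point p to MBR s: the distance from p
    to the second-closest corner of s."""
    corners = product((s[0], s[2]), (s[1], s[3]))
    return sorted(math.dist(p, c) for c in corners)[1]

def min_exist_dnn(o, s):
    """Largest minMaxDist(l, s) over all locations l in MBR o."""
    oxlo, oylo, oxhi, oyhi = o
    cx, cy = (s[0] + s[2]) / 2, (s[1] + s[3]) / 2
    w, h = s[2] - s[0], s[3] - s[1]
    # Inside one partition the target corner is fixed, so the distance is
    # convex and peaks at a vertex of (o intersected with the partition).
    cands = list(product((oxlo, oxhi), (oylo, oyhi)))
    if oxlo <= cx <= oxhi and oylo <= cy <= oyhi:
        cands.append((cx, cy))
    # Directions of the four bisector lines through the center of s.
    for dx, dy in ((1, 0), (0, 1), (-h, w), (h, w)):
        for x in (oxlo, oxhi):      # crossings with o's vertical edges
            if dx != 0:
                y = cy + (x - cx) / dx * dy
                if oylo <= y <= oyhi:
                    cands.append((x, y))
        for y in (oylo, oyhi):      # crossings with o's horizontal edges
            if dy != 0:
                x = cx + (y - cy) / dy * dx
                if oxlo <= x <= oxhi:
                    cands.append((x, y))
    return max(min_max_dist(p, s) for p in cands)

# Hypothetical MBRs.
o, s = (0.0, 0.0, 2.0, 1.0), (3.0, 0.5, 6.0, 4.0)
exact = min_exist_dnn(o, s)
```

A dense grid scan over O1 gives a simple cross-check of the result, since the grid maximum can never exceed the true value.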
Data Structure
- Two R-trees: S of sites, O of objects.
- Three queues:
  - queue_SIN: entries of S inside Q.
  - queue_SOUT: entries of S outside Q.
  - queue_O: entries of O.
Data Structure
[Figure: query region Q with site entries S1 to S4 and object entries O1 to O4]
- queue_SIN: S1 S2
- queue_O: O1
- queue_SOUT: S3
maxInfluence and minInfluence
- For each entry Sj in queue_SIN:
  - maxInfluence: total weight of entries in queue_O that affect Sj.
  - minInfluence: total weight of entries in queue_O that ONLY affect Sj, divided by the number of objects in Sj.
- queue_SIN is sorted in decreasing order of maxInfluence.
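The bookkeeping described above can be sketched as follows, assuming the "affects" relation has already been determined. The dictionary layout, the function name, and the sample weights and counts are hypothetical; `count` stands for the divisor named on the slide.

```python
def influence_bounds(affects, weight, count):
    """affects: {object entry: set of site entries it affects}
    weight:  {object entry: total weight of the entry}
    count:   {site entry in queue_SIN: divisor used for minInfluence}
    Returns (maxInfluence, minInfluence) dictionaries."""
    max_inf = {s: 0.0 for s in count}
    only = {s: 0.0 for s in count}
    for o, sites in affects.items():
        for s in sites:
            if s in max_inf:
                max_inf[s] += weight[o]
        if len(sites) == 1:
            (s,) = sites
            if s in only:
                # o affects s and nothing else, so its weight is guaranteed
                # to go to sites under s.
                only[s] += weight[o]
    min_inf = {s: only[s] / count[s] for s in count}
    return max_inf, min_inf

# O1 only affects S1, O2 affects both S1 and S2 (weights/counts are made up).
affects = {"O1": {"S1"}, "O2": {"S1", "S2"}}
max_inf, min_inf = influence_bounds(
    affects, weight={"O1": 3.0, "O2": 2.0}, count={"S1": 3, "S2": 4})
queue_sin = sorted(max_inf, key=max_inf.get, reverse=True)
```

Sorting by maxInfluence puts the entries that could still reach the top-t at the front of the queue.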
Algorithm Overview
- Expand an entry from one of the three queues.
  - Remove the entry from the queue.
  - Retrieve the referenced node, and insert the (unpruned) entries into the same queue.
  - Update maxInfluence and minInfluence if necessary.
- If the top-t entries in queue_SIN are sites, with minInfluences ≥ maxInfluences of all remaining entries, return.
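The termination condition in the last bullet might be checked as in the sketch below. The entry layout (dictionaries with `is_site`, `min_influence`, and `max_influence` keys) and the function name are assumptions for illustration, and the queue is assumed to be already sorted by decreasing maxInfluence.

```python
def can_stop(queue_sin, t):
    """True when the first t entries are actual sites whose minInfluence
    is at least the maxInfluence of every remaining entry."""
    top, rest = queue_sin[:t], queue_sin[t:]
    if not all(e["is_site"] for e in top):
        return False  # an index node in the top-t still needs expanding
    threshold = max((e["max_influence"] for e in rest), default=0.0)
    return all(e["min_influence"] >= threshold for e in top)

# Hypothetical queue state.
queue_sin = [
    {"is_site": True, "min_influence": 5.0, "max_influence": 5.0},
    {"is_site": True, "min_influence": 4.0, "max_influence": 4.0},
    {"is_site": False, "min_influence": 0.0, "max_influence": 3.0},
]
done = can_stop(queue_sin, 2)
```

When the check fails, the loop would pick another entry to expand and re-evaluate the bounds.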
Example
[Figure: query region Q with site entries S1, S3, S5 to S9 and object entries O1, O5, O6]
- queue_SIN: S1; queue_O: O1; queue_SOUT: S3
- queue_SIN: S5, S7; queue_O: O6; queue_SOUT: S9
- S6 is not affected by O1, prune S6.
- O5 does not affect S5 and S7, prune O5.
A Pruning Case
[Figure: expanding S1, with entries S1 to S4 and O1; minExistDNN_S1(O1) = 7, minExistDNN_S3(O1) = 4, minDist(S2, O1) = 5]
- S2 is pruned because minExistDNN_S3(O1) < minDist(S2, O1).
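The arithmetic of this case can be replayed with a tiny helper. The function name is hypothetical, and reading S3 as an entry obtained by expanding S1 is an interpretation of the figure; the three numbers are the ones shown on the slide.

```python
def is_pruned(min_dist_to_target, min_exist_dnn_of_others):
    """A site entry is pruned for an object entry when some other site
    entry has a minExistDNN strictly below the target's minDist."""
    return any(m < min_dist_to_target for m in min_exist_dnn_of_others)

# With only S1 available: minExistDNN_S1(O1) = 7 is not below
# minDist(S2, O1) = 5, so S2 cannot be pruned yet.
before = is_pruned(5, [7])
# With S3 available: minExistDNN_S3(O1) = 4 < 5, so S2 is pruned.
after = is_pruned(5, [7, 4])
```

This illustrates why expanding a site entry can enable pruning that the coarser entry could not justify.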
Choosing an Entry to Expand
- Expand top entries in queue_SIN.
- Expand the most important Oi.
  - Importance: |Oi| * #affected entries * area(Oi)
- Expand Sj that contains the most important Oi.
Choosing an Entry to Expand
- Estimate the probability of pruning Oi using some Sj in queue_SOUT.
[Figure: two panels showing Q, S1, O1 and S2 (S'2 in the second panel), with minDist(S1, O1) = 5 and minExistDNN_S2(O1) = 6]
- After expanding S2, O1 is likely not to affect S1.
Experimental Setup
- Data sets:
  - 24,493 populated places in North America
  - 9,203 cultural landmarks in North America
- R-tree page size: 1 KB.
- LRU buffer: 128 disk pages.
- t = 4.
- Compared against the solution using the Voronoi diagram.
Selected Experimental Results
- #sites : #objects = 1 : 2.5
Selected Experimental Results
- #sites : #objects = 2.5 : 1
Conclusions
- We addressed a new problem: Top-t most influential sites query.
- We proposed a new metric: minExistDNN. It can be used to prune search space in NN/RNN related problems.
- We carefully designed an algorithm which systematically browses both R-trees once.
- Experiments showed more than an order of magnitude improvement.
Thank you! Q&A