Diversified Spatial Keyword Search On Road Networks

Computer Science and Engineering
Chengyuan Zhang (1), Ying Zhang (2, 1), Wenjie Zhang (1), Xuemin Lin (3, 1), Muhammad Aamir Cheema (4, 1), Xiaoyang Wang (1)
(1) The University of New South Wales, Australia
(2) QCIS, University of Technology, Sydney
(3) East China Normal University
(4) Monash University

Outline
• Motivation
• Problem Statement
• SK Search on Road Network
• Diversified SK Search on Road Network
• Experiments
• Conclusion

Motivation
• Massive amounts of spatio-textual objects have emerged in many applications
• Road network distance is employed in many key applications, e.g., location-based services
• Strong preference for spatially diversified results, e.g., the dissimilarity between results should be reasonably large
• Hence: diversified spatial keyword search on road networks

Motivation Example (K = 2, q.T = {pancake, lobster})
• Tourist's aim
  - A nice dinner
  - Visit nearby attractions or shops
  - No idea which attractions or shops to visit until some restaurants are suggested
• Preferred
  - K close restaurants that satisfy the dinner requirements
  - Restaurants that are well distributed
• Result
  - P1 and P4 might be a better choice
  - They offer more attractions or shops with a slight sacrifice in relevance

Problem Statement
• SK Query
  - Given a road network G, a set of spatio-textual objects, a query point q (which is also a spatio-textual object), and a network distance δmax, a spatial keyword query retrieves the objects that each contain all query keywords of q and are within network distance δmax of q.
• Example: q.T = {t1, t2}, δmax = 20
  - Result: O1, O2, O8
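The SK query definition above can be read as a simple predicate over objects. The following is a minimal sketch, assuming the network distance from q to each object has already been computed; the class name, field names, and the sample data (including objects O3 and O5 and all distances) are hypothetical and chosen only so that the output mirrors the slide's result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatioTextualObject:
    name: str
    keywords: frozenset
    network_distance: float  # assumed: precomputed network distance from q


def sk_query(objects, query_keywords, delta_max):
    """Keep objects that contain every query keyword and lie within delta_max of q."""
    wanted = frozenset(query_keywords)
    return [
        o for o in objects
        if wanted <= o.keywords and o.network_distance <= delta_max
    ]


# Hypothetical data: only the names O1, O2, O8 come from the slide.
objects = [
    SpatioTextualObject("O1", frozenset({"t1", "t2"}), 8.0),
    SpatioTextualObject("O2", frozenset({"t1", "t2", "t3"}), 12.0),
    SpatioTextualObject("O3", frozenset({"t1"}), 5.0),         # misses t2
    SpatioTextualObject("O5", frozenset({"t1", "t2"}), 27.0),  # beyond delta_max
    SpatioTextualObject("O8", frozenset({"t1", "t2"}), 19.0),
]

result = [o.name for o in sk_query(objects, {"t1", "t2"}, 20)]
print(result)
```

Both conditions are conjunctive: an object that is close but lacks a keyword (O3 here) is rejected just like one that matches but is too far (O5).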

Problem Statement (continued)

Example (q.T = {t1, t2}, K = 2, δmax = 20, λ = 0.6)
• S1 = {O1, O2}
• S2 = {O1, O8}
• S3 = {O2, O8}
• Scores shown on the slide: 0.29, 0.475, 0.465
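The example compares candidate sets of size K under a trade-off parameter λ. The slide that defines the scoring function did not survive extraction, so the sketch below assumes a generic bi-criteria form (λ times mean relevance plus (1 − λ) times mean normalized pairwise network distance); it is not the deck's formula and does not reproduce the slide's scores. All distances are hypothetical.

```python
from itertools import combinations


def score(subset, rel, pair_dist, lam, delta_max):
    """Assumed objective: lam * mean relevance + (1 - lam) * mean normalized diversity."""
    members = list(subset)
    relevance = sum(rel[o] for o in members) / len(members)
    pairs = list(combinations(members, 2))
    if not pairs:
        return lam * relevance
    # Two objects within delta_max of q are at most 2 * delta_max apart (via q).
    diversity = sum(pair_dist[frozenset(p)] for p in pairs) / (len(pairs) * 2 * delta_max)
    return lam * relevance + (1 - lam) * diversity


def best_subset(candidates, k, rel, pair_dist, lam, delta_max):
    """Exhaustive baseline: score every k-subset and keep the best one."""
    return max(
        combinations(candidates, k),
        key=lambda s: score(s, rel, pair_dist, lam, delta_max),
    )


delta_max = 20
dist_to_q = {"O1": 8, "O2": 12, "O8": 19}  # hypothetical
rel = {o: 1 - d / delta_max for o, d in dist_to_q.items()}  # assumed relevance
pair_dist = {  # hypothetical pairwise network distances
    frozenset({"O1", "O2"}): 6,
    frozenset({"O1", "O8"}): 27,
    frozenset({"O2", "O8"}): 30,
}

diversified = best_subset(["O1", "O2", "O8"], 2, rel, pair_dist, 0.6, delta_max)
closest_only = best_subset(["O1", "O2", "O8"], 2, rel, pair_dist, 1.0, delta_max)
print(diversified, closest_only)
```

With λ = 1 the objective ignores diversity and the two closest objects win; lowering λ lets a farther but better-spread pair overtake them, which is the behaviour the motivation example describes.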

SK Search On Road Network

Example (q.T = {t1, t2}, δmax = 20)
• The slide animates the search state: priority queue of nodes, marked nodes, passed objects, and marked objects
• Marked objects: O1, O2, O8
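The example tracks a priority queue, marked nodes, and marked objects, which suggests a Dijkstra-style network expansion from q. The deck's own algorithm slide is not recoverable, so this is only a sketch of a distance-bounded expansion; it simplifies by placing objects at nodes rather than on edges, and the graph, node names, and object placement are hypothetical.

```python
import heapq


def network_expansion(graph, objects_at, source, query_keywords, delta_max):
    """Expand from source in distance order; collect matching objects within delta_max."""
    wanted = set(query_keywords)
    best = {source: 0.0}
    marked_nodes = set()
    marked_objects = {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node in marked_nodes:
            continue  # stale queue entry
        marked_nodes.add(node)
        for name, keywords in objects_at.get(node, []):
            if wanted <= set(keywords):
                marked_objects[name] = d
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            # Never enqueue nodes beyond the distance bound.
            if nd <= delta_max and nd < best.get(neighbor, float("inf")):
                best[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return marked_objects


# Hypothetical undirected road network as adjacency lists.
graph = {
    "n1": [("n2", 5), ("n3", 8)],
    "n2": [("n1", 5), ("n4", 7)],
    "n3": [("n1", 8), ("n5", 11)],
    "n4": [("n2", 7), ("n5", 10)],
    "n5": [("n3", 11), ("n4", 10), ("n6", 4)],
    "n6": [("n5", 4)],
}
objects_at = {
    "n2": [("O1", {"t1", "t2"})],
    "n3": [("O3", {"t1"})],
    "n4": [("O2", {"t1", "t2", "t3"})],
    "n5": [("O8", {"t1", "t2"})],
    "n6": [("O9", {"t1", "t2"})],  # matches keywords but lies beyond delta_max
}

hits = network_expansion(graph, objects_at, "n1", {"t1", "t2"}, 20)
print(hits)
```

Because nodes are popped in non-decreasing distance order, the expansion never touches the part of the network farther than δmax from q.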

Enhancement of Signature Technique
• Observation
  - Avoid loading objects that result from false hits
• Aim
  - Find a partition of edge e with c cuts that has the minimal false hit cost
  - Propose a dynamic programming based technique to partition the objects lying on an edge
  - Cost-prohibitive in practice
  - Greedy heuristic: at each iteration, find the cutting position that minimizes the cost of the refined partition
• Example (q.T = {t2, t4})
  - I(e, t2) = 1, I(e, t4) = 1: e passes the test, but it is a false hit
  - I(e1, t2) = 1, I(e1, t4) = 0: e1 fails the test
  - I(e2, t2) = 0, I(e2, t4) = 1: e2 fails the test
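The signature test and the greedy cut selection can be illustrated as follows. This is a sketch under stated assumptions: I(e, t) is taken to be 1 when some object on e contains t, and the false hit cost is assumed to be the number of objects loaded needlessly over a query workload (the deck does not define the cost on this slide). The object keyword sets and function names are hypothetical.

```python
def signature(objects):
    """Keywords present on the edge: I(e, t) = 1 iff t is in this set."""
    sig = set()
    for keywords in objects:
        sig |= keywords
    return sig


def passes(objects, query):
    """Signature test: every query keyword appears somewhere on the edge."""
    return query <= signature(objects)


def is_false_hit(objects, query):
    """The test passes, yet no single object contains all query keywords."""
    return passes(objects, query) and not any(query <= kw for kw in objects)


def split(objects, cuts):
    """Split the ordered objects of an edge at the given cut positions."""
    bounds = [0] + list(cuts) + [len(objects)]
    return [objects[a:b] for a, b in zip(bounds, bounds[1:])]


def false_hit_cost(parts, queries):
    """Assumed cost: objects loaded for nothing, summed over the workload."""
    return sum(len(p) for q in queries for p in parts if is_false_hit(p, q))


def greedy_cuts(objects, queries, c):
    """At each iteration add the cut that minimizes the cost of the refined partition."""
    cuts = []
    for _ in range(c):
        candidates = [i for i in range(1, len(objects)) if i not in cuts]
        if not candidates:
            break
        cuts.append(min(
            candidates,
            key=lambda i: false_hit_cost(split(objects, sorted(cuts + [i])), queries),
        ))
    return sorted(cuts)


# Hypothetical objects in their order along edge e.
edge = [{"t1", "t2"}, {"t2", "t3"}, {"t4"}, {"t3", "t4"}]
query = {"t2", "t4"}

cuts = greedy_cuts(edge, [query], 1)
parts = split(edge, cuts)
print(cuts, [passes(p, query) for p in parts])
```

Here the whole edge passes the test although no object holds both t2 and t4; one cut between the second and third object makes both sub-edges fail the test, so nothing is loaded.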

Diversified SK Search On Road Network

Incremental Diversified SK Search
• Drawback
  - The diversification algorithm is invoked only after all objects satisfying the spatial keyword constraint have been retrieved
  - Pair-wise diversification distances are expensive to compute without pre-computation or specific restrictions
• Aim
  - Prune non-promising objects based on the diversification distance during the search

Incremental Diversified SK Search (continued)

Example (K = 2, δmax = 20, λ = 0.6)
• The slide animates the core pair and the visited objects
• f(S(O1, O2)) = 0.99, f(S(O1, O3)) = 0.96, f(S(O2, O3)) = 0.97
• f(S(O1, O4)) = 1.09, f(S(O2, O4)) = 1.08, f(S(O3, O4)) = 1.07
• As λ increases, performance increases
• Baseline: 19! Incremental: 6!

Experimental Setting
• Implemented in Java
• Debian Linux
  - Intel Xeon 2.40 GHz dual CPU
  - 4 GB memory
• Datasets
  - NA: US Board on Geographic Names + North America Road Network (default)
  - SF: spatial locations from Rtree-Portal + textual content randomly generated from 20 Newsgroups + San Francisco Road Network
  - TW: 11.5 million tweets with geo-locations from May 2012 to August 2012 + San Francisco Bay Area Road Network
  - SYN: synthetic data + San Francisco Road Network

Algorithms Evaluated
• IR: a natural extension of the spatial object indexing method in VLDB 2003
• IF: inverted indexing technique
• SIF: signature-based inverted indexing technique
• SIFP: SIF enhanced by the partition technique
• SEQ: a straightforward implementation of the diversified spatial keyword search algorithm
• COM: the incremental diversified spatial keyword search algorithm
• Queries (500): location, l query keywords
• Evaluate response time and number of I/Os

SK Search on Different Datasets

(a) Varying l

Diversified SK Search on Different Datasets

Conclusion
• Formally define the problem of diversified spatial keyword search on road networks
• Propose a signature-based inverted indexing technique on road networks
• Develop effective spatial keyword pruning and diversity pruning techniques to eliminate non-promising objects
• Extensive experiments on both real and synthetic data

Future work
• Extend to diversified ranked spatial keyword queries on road networks

Thank you!

Evaluation on different parameters