31 Shapebased Similarity Query for Trajectory of Mobile
/ 31 Shape-based Similarity Query for Trajectory of Mobile Object NTT Communication Science Laboratories, NTT Corporation, JAPAN. Yutaka Yanagisawa Jun-ichi Akahani Tetsuji Satoh “Rickshaw” (NARA/KYOTO)
2 / 31 Background • The Recent technologies allow us to track moving objects using highly accurate positioning devices. • There are many applications using such location information have been developed. – Navigation Systems, Location-based Information Systems, etc. A Navigation System Digital City Kyoto: A Location-based Information System
3 / 31 Motion Pattern Analysis • Motion pattern analysis is one of the most interesting technologies of these applications. – By analyzing their motion patterns, it is possible to extract the behavioral characteristics of moving objects. – The applications can predict the future behavior of the moving objects using extracted characteristics. • Single Motion Analysis – focuses on the statistical characteristics of a moving object. • Relative Motion Analysis – focuses on the similarity between motion patterns. We discuss the approach based on the similarity of trajectory shapes because it is a simple and intuitive approach.
4 / 31 Similarity of Trajectory Shapes • This approach is called shape-based approach. An example of an information providing system. B Exhibition hall The trajectories of visitors are stored in a database. C It retrieves trajectories that are similar to A … A Entrance D Show information about B, C, and D. Exit It is possible to predict the future route of the new visitor.
5 / 31 Problem • However, there are few database systems which can search trajectories based on shapes. – Many database systems retrieve moving objects based on only “distance”. Y L The minimum distance between L and L 2 is less than the distance between L and L 1 L 2 D 2 However, L 2 is more similar in shape to L than L 1 intuitively. D 1 Not appropriate for shape-based approach X
6 / 31 Our Approaches • We propose a shape-based similarity query for searching trajectories from moving object databases. • Moreover, we present an efficient indexing method for retrieving moving objects based on our proposed query.
/ 31 Shape-based Similarity Query
8 / 31 Data Model for Trajectory • In real world, a trajectory of a moving object can be modeled as a continuous line in space. – However, positioning devices can not track a moving object continuously. • In our work, a trajectory is stored as a sequence of points (discrete line) in databases. – This model is used as a popular data model. In Real World In Databases
9 / 31 A Similarity of Time Series Data • The key idea have been proposed in the technique for time series database. • The similarity between two time series data is defined as the Euclidean distance between the points in n dimensional space. W = <w 1, w 2, …, w 9> W’ = <w’ 1, w’ 2, …, w’ 9> x (n=9) w 9 x D(W, W’) w 1 w 2 w’ 9 w’ 1 t 8 t t
10 / 31 A Similarity of Time Series Data • The key idea have been proposed in the techniques for time series database. • The similarity between two time series data is defined as the Euclidean distance between the points in n dimensional space. Distance between W and W’ is 5 W = <2, 3, 4, 3> W’ = <1, 1, 2, 3> 5 4 4 3 3 2 2 1 1 =3 0 2 1 2
11 / 31 A Similarity of Time Series Data • The key idea have been proposed in the techniques for time series database. • The similarity between two time series data is defined as the Euclidean distance between the points in n dimensional space. 5 4 W = <2, 2, 2, 3> W’ = <2, 2, 2, 3> In this case, the distance is zero. 3 2 1 Smaller distance means higher similarity
12 / 31 A Similarity of Time Series Data • The distance fits to intuitive similarity of line shapes. • There is an effective search algorithm to calculate this distance. • We will extend the similarity for trajectory in 2 or more dimensional space.
Our Proposed Similarity of Trajectories • The similarity of trajectories can be defined as an extension of the distance of time series data. – The distance can be given as the following expression. Y p 7 p’ 7 l l’ p 1 p’ 1 X 13 / 31
Shape-based Similarity Query for Trajectories 14 / 31 • We define a shape-based similarity query for trajectories as a subsequence matching query. – Because the length of trajectories are often difference. SSQ(L, l, q) L: A set of stored trajectories in database. l: A trajectory to be compared. q: The distance from l. Answer La: A set of sub-trajectories + Stored trajectories A given trajectory Answer trajectories
Shape-based Similarity Query for Trajectories 15 / 31 An Answer Sub-Trajectory A Given Trajectory - The database calculates the distance between the given trajectory l and each sub-trajectory. - If the distance is less than the given distance q, the database adds the sub-trajectory to the answer set of trajectories La.
/ 31 Indexing
17 / 31 Approach • The existing spatial structures are appropriate for retrieving an object based on the distance. – However, these structures have no method for searching the data based on the similarity between trajectories. We extend the spatial data structure for our proposed query.
An Efficient Calculation Process for the Shape-based Similarity: 1 18 / 31 • The essential idea was presented as a PAA: Piecewise Aggregate Approximation [Keogh 01] . – PAA is an efficient method of approximating the time series data for a similarity search. Using the `average sequences’ of a sub-sequences. x x (N=3) W W‘ t t
An Efficient Calculation Process for the Shape-based Similarity: 2 19 / 31 • The distance between the average sequences is the lower bound of the distance between the original two sequences. x D( W, W’) W W’ By comparing average sequences, we can know the lower bound of the distance between original sequences. t t
An Efficient Calculation Process for the Shape-based Similarity: 3 20 / 31 • In the case of trajectories, the distance between the center points of trajectories is the lower bound of the distance between the original trajectories. v 1 Y L Y The distance between these center points is the lower bound of the original distance. v’ 1 L’ v 7 v’ 7 Calculating center points X X
21 / 31 Combination of PAA and Spatial Data Structure: 1 • For making indexes, the database calculates the center points of sub-trajectories. – The length of each sub-trajectory must be fixed to the system parameter N. – In this example, N is four. l Y Y p 8 l p 7 p 2 p 3 p 6 p 5 The center point of from p 5 to p 8 p 4 The center point of from p 1 to p 4 p 1 X X
22 / 31 Combination of PAA and Spatial Data Structure: 2 • Next, the database makes indexes to the points using a traditional spatial data structure. – Our implemented system makes an index to every center point using R+-Tree. The database can search objects based on the similarity of trajectories using the spatial data structure. Y Normal R+-Tree X Y Y X X Our Proposed Index Structure
23 / 31 Query Processing: 1 • When a SSQ(L, l. Q, q) is given, the database calculates the center point of l. Q at first. – Suppose that the length of stored center points is fixed to 4 (N=4) in the following example. If a query SSQ(L, l. Q, q) is given. . Y Y l. Q p. Q is the center point of a given trajectory. p. Q X X
24 / 31 Query Processing: 2 • Next, the database searches stored points within the distance q from the calculated point p. Q using the spatial data structure. Y An index tree (R+-Tree) A A Candidate points C p. Q B B C X The region within the distance q from p. Q
25 / 31 Query Processing: 3 • Finally, the database checks the distance between a given trajectory l. Q and each candidate trajectory. • If the distance is less than a given threshold q, the candidate trajectory is added to the answer set La. Y If , l 1 is added to La p 2 Y l. Q p. Q l 1 p 1 X l 1 is the original trajectory of p 1.
/ 31 Performance Study
27 / 31 Performance Study • We conducted an experiment for evaluating our proposed query and indexing method. – Measuring the processing time for retrieving trajectories required by a shape-based similarity query. • For this evaluation, two types of trajectories are stored in a database. – tracked by GPS and generated by a simulator. We compared the processing time using both methods: - Our indexing method, - A spatial data structure (R+-Tree).
28 / 31 Trajectory Data: 1 • This is an example of trajectory data captured by GPS receivers on rickshaws (in Nara city). – Rickshaw is tour guide, they work in Nara / Kyoto. A trajectory of a rickshaw in all day A GPS Receiver (e. Trex/GARMIN) Rickshaw
29 / 31 Trajectory Data: 2 • This figure displays trajectories generated by our implemented simulator. The simulator can generate trajectories such that people walk on a plane freely. Velocity and direction of each object are given as random values. But the changes of these values are slow and continuous.
30 / 31 The Result of the Experiment The processing time to calculate 10 random queries is displayed: Using our index structure Using R+-Tree Amount of Stored Points Length of l. Q (=N) For retrieving longer trajectories from stored data, our proposed method has high advantages to existing methods.
31 / 31 Conclusions • We have proposed a shape-based similarity query to find moving objects. – Database users can find moving objects for analyzing their motion patterns. • Moreover, we have presented an effective indexing method to search for the trajectories required by our proposed queries. – We demonstrated the advantage of our proposed method to existing spatial data structures.
32 / 31 Future Work • We will evaluate our proposed method using these data. [Human Tracking by using Laser Scanners] - University of Tokyo (Dr. Zhao and Prof. Shibasaki) - Captured at Geoinformation Forum Japan 2002 (32. 096 people visited) [Motion Capture Data] - Tokyo University of Technology (Creative Labo) - 76 moving points on bodies (120 fps) - Playing football and judo
- Slides: 32