ICDE 2008 Trajectory Outlier Detection: A Partition-and-Detect Framework
- Slides: 36
ICDE 2008
Trajectory Outlier Detection: A Partition-and-Detect Framework
April 8, 2008
Jae-Gil Lee, Jiawei Han, and Xiaolei Li
Department of Computer Science, University of Illinois at Urbana-Champaign
04/08/08 Trajectory Outlier Detection: A Partition-and-Detect Framework
Table of Contents
= Motivation
= Partition-and-Detect Framework
= Outlier Detection Algorithm: TRAOD
  • Partitioning Phase (Simple)
  • Detection Phase
  • Partitioning Phase (Enhanced)
= Performance Evaluation
= Related Work
= Conclusions
Outlier Detection
= Definition: the process of detecting a data object that is grossly different from or inconsistent with the remaining set of data
= Applications: the detection of credit card fraud, the monitoring of criminal activities in electronic commerce, etc.
= Algorithms: distribution-based, distance-based, density-based, and deviation-based
= Target data: previous research has mainly dealt with outlier detection of point data
Analysis on Trajectory Data
= Tremendous amounts of trajectory data of moving objects are being collected
  • Examples: vehicle positioning data, hurricane tracking data, animal movement data, etc.
= Trajectory outlier detection has many important, real-world applications
  • Detection of suspicious persons in video surveillance
  • Analysis of unusual air-mass trajectories in meteorology
  • …
A powerful outlier detection algorithm for trajectories is urgently needed
Limitations of Existing Algorithms
= Knorr et al. [5] have presented one of the very few attempts
  • Define the distance between two whole trajectories using summary information (e.g., the coordinates of the starting and ending points)
  • Apply a distance-based approach to the detection of trajectory outliers
= Existing algorithms might not be able to detect outlying portions of trajectories
  • Example: TR3 is not detected as an outlier since its overall behavior is similar to that of the neighboring trajectories
[Figure: trajectories TR1 to TR5; TR3 contains an outlying sub-trajectory]
Discovery of Outlying Sub-Trajectories
= Discovery of outlying sub-trajectories is very useful in the real world
  • Example: sudden changes in a hurricane's path [10]
We propose the partition-and-detect framework
The Partition-and-Detect Framework
= Consists of two phases: partitioning and detection
  (1) Partition: a set of trajectories → a set of trajectory partitions
  (2) Detect: a set of trajectory partitions → outlying trajectory partitions
[Figure: trajectories TR1 to TR5 are partitioned; TR3 is reported as an outlier through its outlying trajectory partitions]
Note: A set of outlying trajectory partitions indicates an outlying sub-trajectory
The Problem Statement
= Given a set of trajectories I = {TR1, …, TRn}, our algorithm generates a set of outliers O = {O1, …, Om} with outlying trajectory partitions for each Oi
= Necessary definitions:
  • A trajectory is a sequence of multi-dimensional points, denoted as TRi = p1 p2 p3 … pj … p_leni, where leni is the number of points in TRi; a trajectory partition (t-partition for short) is a line segment pipj (i < j), where pi and pj are points chosen from the same trajectory
  • A t-partition is outlying if it does not have a sufficient number of similar neighbors
  • A trajectory is an outlier if it contains a non-negligible amount of outlying t-partitions
The Outlier Detection Algorithm: TRAOD
= Based on the partition-and-detect framework
Algorithm TRAOD (TRAjectory Outlier Detection)
Input: A set of trajectories I = {TR1, …, TRn}
Output: A set of outliers O = {O1, …, Om} with outlying t-partitions for each Oi
Algorithm:
  /* Partitioning Phase */
  for each TR ∈ I do
    Partition TR into a set L of line segments;
    Accumulate L into a set D;
  /* Detection Phase */
  for each P ∈ D do
    Mark P if it is an outlying t-partition;
  for each TR ∈ I do
    Output TR if it is an outlier;
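The control flow of TRAOD can be sketched in Python as follows. This is only a skeleton under stated assumptions: `partition`, `is_outlying`, and `is_outlier` are hypothetical callables supplied by the caller, standing in for the partitioning strategy and the two detection tests that the slides define separately. Trajectory identifiers as dictionary keys are likewise an illustrative choice.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
Segment = Tuple[Point, Point]  # one t-partition, i.e. a line segment


def traod(
    trajectories: Dict[str, Sequence[Point]],
    partition: Callable[[Sequence[Point]], List[Segment]],
    is_outlying: Callable[[str, Segment, List[Tuple[str, Segment]]], bool],
    is_outlier: Callable[[List[Segment], List[bool]], bool],
) -> Dict[str, List[Segment]]:
    """Return each outlier's id mapped to its outlying t-partitions."""
    # Partitioning phase: split every trajectory and accumulate the pieces.
    per_trajectory = {tid: partition(tr) for tid, tr in trajectories.items()}
    all_parts = [(tid, seg) for tid, segs in per_trajectory.items() for seg in segs]

    # Detection phase, step 1: mark the outlying t-partitions.
    marks = {
        tid: [is_outlying(tid, seg, all_parts) for seg in segs]
        for tid, segs in per_trajectory.items()
    }

    # Detection phase, step 2: output the trajectories that are outliers.
    outliers: Dict[str, List[Segment]] = {}
    for tid, segs in per_trajectory.items():
        if is_outlier(segs, marks[tid]):
            outliers[tid] = [seg for seg, m in zip(segs, marks[tid]) if m]
    return outliers
```

Passing the tests in as callables keeps the skeleton independent of how "similar", "sufficient", and "non-negligible" are made precise.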
Where We Are Now
  /* Partitioning Phase */
  for each TR ∈ I do
    Partition TR into a set L of line segments by a simple strategy or by a two-level partitioning strategy;
    Accumulate L into a set D;
  /* Detection Phase */
  for each P ∈ D do
    Mark P if it is an outlying t-partition;
  for each TR ∈ I do
    Output TR if it is an outlier;
A Simple Partitioning Strategy (1/2)
= Careless partitioning (especially into long t-partitions) could miss possible outliers
  • Example: Even though TRout behaves differently from its neighboring trajectories, these differences are averaged out due to careless partitioning
[Figure: a trajectory TRout, one long t-partition of it, and the neighboring trajectories]
A Simple Partitioning Strategy (2/2)
= A trajectory is partitioned at a base unit: the smallest meaningful unit of a trajectory in a given application
  • Example: The base unit can be every single point
[Figure: a trajectory TRout partitioned at every point, with an outlying t-partition, and the neighboring trajectories]
Pros: high detection quality in general
Cons: poor performance due to a large number of t-partitions (remedied by a two-level partitioning strategy)
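As a minimal sketch of the simple strategy, assuming the base unit is every single point (the example given on the slide), each pair of consecutive points becomes one t-partition. The function name `partition_at_base_unit` is illustrative, not taken from the slides.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Segment = Tuple[Point, Point]


def partition_at_base_unit(points: Sequence[Point]) -> List[Segment]:
    """Split a trajectory into one t-partition per pair of consecutive points."""
    return [(points[i], points[i + 1]) for i in range(len(points) - 1)]
```

A trajectory with n points yields n - 1 t-partitions, which is why the number of t-partitions, and with it the cost of pairwise comparison, grows quickly under this strategy.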
Distance between T-Partitions
= The weighted sum of three components: the perpendicular distance, the parallel distance, and the angle distance
  • Adapted from similarity measures used in the domain of pattern recognition [13]
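The slide names the three components but its formulas did not survive, so the following is a sketch under explicit assumptions rather than the authors' definition. It assumes 2-D segments and component formulas of the kind used in line-segment Hausdorff-style distances: the shorter segment's endpoints are projected onto the longer one, the perpendicular distance is a Lehmer mean of the two projection distances, the parallel distance is the smallest gap between a projection point and an endpoint of the longer segment, and the angle distance is the shorter length times sin θ (or the full length for θ ≥ 90°). All function names and the default weights of 1.0 are illustrative.

```python
import math


def _length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def _project(p, s, e):
    """Project point p onto the infinite line through s and e."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    t = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / (dx * dx + dy * dy)
    return (s[0] + t * dx, s[1] + t * dy)


def distance_components(seg_a, seg_b):
    """Return (perpendicular, parallel, angle) distances (assumed formulas)."""
    # Treat the longer segment as Li and the shorter one as Lj.
    li, lj = (seg_a, seg_b) if _length(seg_a) >= _length(seg_b) else (seg_b, seg_a)
    (si, ei), (sj, ej) = li, lj
    len_i, len_j = _length(li), _length(lj)
    if len_i == 0:
        raise ValueError("the longer segment must have nonzero length")

    ps, pe = _project(sj, si, ei), _project(ej, si, ei)
    l1, l2 = math.dist(sj, ps), math.dist(ej, pe)
    d_perp = 0.0 if l1 + l2 == 0 else (l1 * l1 + l2 * l2) / (l1 + l2)
    d_par = min(math.dist(ps, si), math.dist(ps, ei),
                math.dist(pe, si), math.dist(pe, ei))

    if len_j == 0:
        d_theta = 0.0
    else:
        dot = (ei[0] - si[0]) * (ej[0] - sj[0]) + (ei[1] - si[1]) * (ej[1] - sj[1])
        cos_t = dot / (len_i * len_j)
        sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
        d_theta = len_j * sin_t if cos_t > 0 else len_j
    return d_perp, d_par, d_theta


def t_partition_distance(seg_a, seg_b, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of the three components."""
    d_perp, d_par, d_theta = distance_components(seg_a, seg_b)
    return w_perp * d_perp + w_par * d_par + w_theta * d_theta
```

Only the weighted-sum structure is taken from the slide; any other component formulas can be swapped into `distance_components` without changing `t_partition_distance`.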
Trajectory Outliers Based on Distance (1/2)
= Def. (a close trajectory):
[Figure: one trajectory TRj that is close to the t-partition Li and one that is not close to Li]
= Def. (an outlying t-partition):
  • Proportion of close trajectories ≤ 1 − p: Li is an outlying t-partition
  • Proportion of close trajectories > 1 − p: Li is not an outlying t-partition
[Figure: close and not-close trajectories around Li]
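Read against the problem statement, the threshold 1 − p on this slide can be taken to apply to the proportion of trajectories that are close to Li. Assuming that reading, the test for an outlying t-partition is a one-line comparison. The function name and the choice to pass the close-trajectory count in as a number (rather than computing closeness here) are illustrative assumptions.

```python
def is_outlying_partition(num_close: float, num_trajectories: int, p: float) -> bool:
    """True if the share of close trajectories is at most 1 - p (assumed reading)."""
    if num_trajectories <= 0:
        raise ValueError("num_trajectories must be positive")
    return num_close / num_trajectories <= 1.0 - p
```

`num_close` is typed as a float so that a density-adjusted count can be passed in as well as a raw count.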
Trajectory Outliers Based on Distance (2/2)
= Def. (an outlier):
  • A trajectory TRi is an outlier if
    (the sum of the lengths of outlying t-partitions in TRi) / (the sum of the lengths of all t-partitions in TRi) ≥ F
[Figure: TRi is an outlier; TRj is not an outlier]
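The length-ratio test in this definition can be sketched as below. The function name, the parallel-list representation of lengths and marks, and the handling of a zero total length (treated as "not an outlier") are assumptions made for illustration.

```python
from typing import Sequence


def is_outlier(lengths: Sequence[float], outlying: Sequence[bool], F: float) -> bool:
    """True if outlying t-partitions make up at least a fraction F of the total length."""
    total = sum(lengths)
    if total == 0:
        return False  # assumption: a degenerate trajectory is never reported
    outlying_length = sum(l for l, flag in zip(lengths, outlying) if flag)
    return outlying_length / total >= F
```

For example, with t-partition lengths 1, 1, and 2, where only the last is outlying, the ratio is 2 / 4 = 0.5, so the trajectory is an outlier for F = 0.2.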
Incorporation of Density (1/2)
= The previous definition, as it is, has the local density problem
  • A t-partition in a dense region tends to have a relatively larger number of close trajectories than one in a sparse region
T-partitions in dense regions are favored!
Incorporation of Density (2/2)
= Def. (the density of a t-partition):
  • The density of a t-partition Li is the number of t-partitions within the distance σ from Li, where σ is the standard deviation of pairwise distances between t-partitions
= Def. (the adjusting coefficient of a t-partition):
    adj(Li) = (the average density of all t-partitions) / (the density of the t-partition Li)
= Adjustment by the density
  • The number of close trajectories is multiplied by the adjusting coefficient
  • adj(Li) < 1.0 in a dense region; adj(Li) > 1.0 in a sparse region
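A small sketch of the density adjustment, starting from a precomputed symmetric matrix of pairwise distances between t-partitions. Two details are assumptions because the slide does not fix them: σ is computed as the population standard deviation, and each t-partition counts itself in its own density (its self-distance is 0), which also keeps the denominator nonzero. The function name is illustrative.

```python
import statistics
from typing import List, Sequence


def adjusting_coefficients(dist: Sequence[Sequence[float]]) -> List[float]:
    """Return adj(Li) for every t-partition, given pairwise distances dist[i][j]."""
    n = len(dist)
    # sigma: standard deviation over all unordered pairs (population form assumed).
    pairwise = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
    sigma = statistics.pstdev(pairwise)
    # Density: number of t-partitions within sigma, counting Li itself.
    density = [sum(1 for j in range(n) if dist[i][j] <= sigma) for i in range(n)]
    average = sum(density) / n
    return [average / d for d in density]
```

The number of close trajectories of Li would then be multiplied by `adjusting_coefficients(dist)[i]` before it is compared against the threshold, shrinking counts in dense regions and enlarging them in sparse ones.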
Guidelines for Parameter Values
= Three parameters:
  • D corresponds to "similar", p to "sufficient", and F to "non-negligible"
= Remark: There is no universally correct parameter value, even for the same data set and application
= Our guideline: resort to user feedback
[Figure: guideline chart. D ranges from smaller to larger, p from 0.90 to 0.99, and F from 0.10 to 0.20; guiding questions: Have many trajectories? Are trajectories short? Want many outliers?]
Two-Level Trajectory Partitioning
= Objective
  • Achieves much higher performance than the simple strategy
  • Obtains the same result as that of the simple strategy; i.e., does not lose the quality of the result
= Basic idea
  1. Partition a trajectory in coarse granularity first
  2. Partition a coarse t-partition in fine granularity only when necessary
= Main benefit
  • Narrows the search space that needs to be inspected in fine granularity
Many portions of trajectories can be pruned early on
Intuition behind Two-Level Trajectory Partitioning
= If the distance between two coarse t-partitions is very large (or small), the distances between their fine t-partitions are also very large (or small)
[Figure: trajectories TRi and TRj under coarse-granularity and fine-granularity partitioning]
Given two coarse t-partitions, can we know whether the distance between any two of their fine t-partitions is greater than (or less than) D?
Coarse-Granularity Partitioning*
= Try to optimize two competing measures
  • Preciseness: the difference between a trajectory and a set of its coarse t-partitions should be as small as possible
    − Required for making the bounds tight
  • Conciseness: the number of coarse t-partitions should be as small as possible
    − Required for reducing the number of comparisons
= Formulate this problem using the minimum description length (MDL) principle
  • A good tradeoff between the two measures is found based on information theory
* Coarse-granularity partitioning is identical to that in our earlier work on trajectory clustering [15]
Fine-Granularity Partitioning
= Identify outlying coarse t-partitions by deriving the distance bounds between two coarse t-partitions Li and Lj
  • Suppose li is a fine t-partition in Li and lj is a fine t-partition in Lj
    lb(Li, Lj, f): the lower bound of f(li, lj)
    ub(Li, Lj, f): the upper bound of f(li, lj)
  • Derive the above bounds separately for each of the three distance components (Lemmas 1~3) and combine them (Lemma 4)
[Figure: a coarse t-partition Li of TRi and a coarse t-partition Lj of TRj]
Derivation of the Distance Bounds
Lemmas 1~3. Bounds for the individual distance components (formulas not recoverable from the slide text)
Lemma 4. Combined bounds for dist(Li, Lj)
Pruning Rules for Fine-Granularity Partitioning
= Rule 1: If lb(Li, Lj, dist) > D, fine-granularity partitioning is not required when comparing Li and Lj
[Figure: Li and Lj with lb(Li, Lj, dist) > D]
= Rule 2: If ub(Li, Lj, dist) ≤ D, fine-granularity partitioning is required, but the distance between the fine t-partitions in Li and Lj need not be computed
[Figure: Li and Lj with ub(Li, Lj, dist) ≤ D]
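The two rules amount to a three-way decision on a pair of coarse t-partitions once its lower and upper distance bounds are known. The sketch below assumes the bounds have already been computed elsewhere. The function name, the string labels, and the fallback case (compute the fine-level distances when neither rule applies) are illustrative assumptions rather than wording from the slides.

```python
def pruning_decision(lb: float, ub: float, D: float) -> str:
    """Classify a pair of coarse t-partitions by its distance bounds."""
    if lb > D:
        # Rule 1: every pair of fine t-partitions is farther apart than D,
        # so this pair needs no fine-granularity partitioning.
        return "far"
    if ub <= D:
        # Rule 2: every pair of fine t-partitions is within D, so the fine
        # t-partitions are needed but their distances are not computed.
        return "close"
    # Neither rule applies: fine-level distances must be computed (assumed fallback).
    return "refine"
```

Only the "refine" outcome pays for distance computations at the fine level, which is where the narrowing of the search space comes from.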
Performance Evaluation
= Use two real trajectory data sets
  • Hurricane track data set
    − Records the Atlantic hurricanes for the years 1950 through 2006
    − The entire set: 608 trajectories and 18,951 points; a small set (1990~2006): 221 trajectories and 7,270 points
  • Animal movement data set
    − Records the locations of elk, deer, and cattle for the years 1993 through 1996 (the Starkey Project)
    − Elk 1993: 33 trajectories and 15,422 points; Deer 1995: 32 trajectories and 20,065 points; Cattle 1993: 41 trajectories and 19,556 points
= Validate the quality of outlier detection
= Evaluate the effectiveness of the two-level partitioning strategy
Trajectory Outliers for Hurricane Data (Small)
D = 85, p = 0.95, F = 0.2 → # of outliers = 13
Trajectory Outliers for Elk 1993
D = 55, p = 0.95, F = 0.1 → # of outliers = 3
Trajectory Outliers for Deer 1995
D = 80, p = 0.95, F = 0.1 → # of outliers = 3
Effects of Parameter Values
(a) D = 83, p = 0.95, F = 0.2: 19 outliers
(b) D = 87, p = 0.95, F = 0.2: 10 outliers
Pruning Power of Two-Level Partitioning
2L-Total: the ratio of the number of pairs pruned by Rule 1 to the total number of pairs of coarse t-partitions
2L-False: the proportion of pairs pruned incorrectly
Optimal: the maximum ratio of pairs that can be pruned
Achieves high pruning power (64~88%)
Speedup Ratio of Two-Level Partitioning
Speedup Ratio = (the elapsed time of the algorithm using the simple partitioning strategy) / (the elapsed time of the algorithm using the two-level partitioning strategy)
Shows significant performance improvement
Related Work
= Outlier detection algorithms for points
  • Distribution-based [2], distance-based [3, 4, 5, 6], density-based [7, 8], deviation-based [9]
= Trajectory outlier detection technique using a distance-based approach [5]
  • Not clear whether this technique can detect outlying sub-trajectories from very complicated trajectories
= Trajectory outlier detection algorithms based on classification [12]
  • Require a good training set and depend on training
Conclusions
= Proposed a novel framework, the partition-and-detect framework, for detecting trajectory outliers
= For the 1st phase, proposed a two-level trajectory partitioning strategy
  • Ensures both high quality and high efficiency
= For the 2nd phase, proposed a hybrid of the distance-based and density-based approaches
  • Very intuitive, but does not have the local density problem
= Demonstrated the effectiveness of TRAOD using various real trajectory data
Thank You!