Cluster-Centric Anomaly Detection and Characterization in Spatial Time Series
Dr. Hesam Izakian
October 2014

Outline
  • Spatial time series
  • Problem formulation
  • Anomaly detection in spatial time series: questions
  • Overall scheme of the proposed method
      o Time series segmentation
      o Spatial time series clustering
      o Assigning anomaly scores to clusters
      o Visualizing the propagation of anomalies
  • An outbreak detection scenario
  • Application

Spatial time series
  • Structure of data
      o A set of spatial coordinates
      o One or more time series for each point
  • Examples
      o Daily average temperature at different climate stations
      o Stock market indexes in different countries
      o Number of absent students in different schools
      o Number of emergency department visits in different hospitals
      o Measured signals in different parts of the brain
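
The data structure described above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `SpatialTimeSeries`, its field names, and the sample values are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SpatialTimeSeries:
    """One data point: a spatial location plus its time series."""
    coords: Tuple[float, float]   # spatial part, e.g. (latitude, longitude)
    series: List[List[float]]     # one or more time series for this point

    def length(self) -> int:
        """Number of time steps (all series are assumed equally long)."""
        return len(self.series[0])


# Hypothetical example: two schools with daily counts of absent students.
schools = [
    SpatialTimeSeries(coords=(53.5, -113.5), series=[[3, 4, 2, 9, 12]]),
    SpatialTimeSeries(coords=(51.0, -114.1), series=[[1, 0, 2, 1, 1]]),
]
```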

Problem formulation
  • There are N spatial time series
  • Objective: find
      o A spatial neighborhood of data
      o In a time interval
      o Containing a high level of unexpected changes

Anomaly detection in spatial time series: questions
  • Spatial neighborhood of data
      o Size of neighborhood
      o Overlapping neighborhoods
  • Unexpected changes (anomalies)
      o What kinds of changes are expected or not expected?
      o How to evaluate the level of unexpected changes?
  • Anomaly visualization
  • Anomaly characterization
      o What was the source of the anomaly?
      o How does the anomaly propagate over time?

Overall scheme of the proposed method
  • Revealing the structure of data in various time intervals
  • Comparing the revealed structures
  [Diagram blocks: Spatial time series data, Sliding window, Spatial time series clustering, Anomaly scores, Fuzzy relations]

Time series part segmentation
  • Sliding window
      o Spatio-temporal subsequences
      o Local view of the time series part
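
A sliding window of this kind can be sketched as follows. The function names and the convention of keeping only full windows are assumptions made for illustration; the presentation does not specify how partial windows at the end of a series are handled.

```python
from typing import Dict, List, Sequence, Tuple


def sliding_windows(series: Sequence[float], length: int, step: int) -> List[Tuple[int, List[float]]]:
    """Return (start_index, subsequence) pairs for every full window."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        (start, list(series[start:start + length]))
        for start in range(0, len(series) - length + 1, step)
    ]


def spatio_temporal_subsequences(coords, series_list, length, step) -> Dict[int, list]:
    """Group subsequences by window start: {start: [(coords, subsequence), ...]}."""
    windows: Dict[int, list] = {}
    for xy, series in zip(coords, series_list):
        for start, sub in sliding_windows(series, length, step):
            windows.setdefault(start, []).append((xy, sub))
    return windows


# Hypothetical data: two locations, ten time steps each.
coords = [(0.0, 0.0), (1.0, 2.0)]
series_list = [list(range(10)), list(range(10, 20))]
by_window = spatio_temporal_subsequences(coords, series_list, length=4, step=2)
```

Each key of `by_window` is one time window, and its value holds one spatio-temporal subsequence per location, which is the local view that the clustering step would then operate on.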

Fuzzy C-Means clustering: visual illustration [figures]

Fuzzy C-Means clustering (continued)
  • Partitions N data into clusters
  • Result: [expression not preserved]
  • Objective function: [equation not preserved]
  • Minimization: [equation not preserved]
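
The expressions for the result, objective function, and minimization are not reproduced above. As a hedged reference point, the textbook Fuzzy C-Means algorithm minimizes the sum over clusters i and data k of u_ik^m · ||x_k − v_i||², alternating between a prototype update and a membership update. The sketch below implements that standard form in plain Python; the function names, the fuzzification coefficient default `m=2.0`, the fixed iteration count, and the sample data are assumptions, not details from the talk.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fcm(data, c, m=2.0, iters=100, seed=0):
    """Standard FCM. Returns (U, V): U[i][k] is the membership of datum k in cluster i."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial partition matrix whose columns sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= col
    V = []
    for _ in range(iters):
        # Prototype update: mean of the data weighted by u_ik^m.
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V.append([sum(w[k] * data[k][d] for k in range(n)) / total for d in range(dim)])
        # Membership update: u_ik = 1 / sum_j (d2_ik / d2_jk)^(1/(m-1)), with squared distances d2.
        for k in range(n):
            d2 = [max(sq_dist(data[k], V[i]), 1e-12) for i in range(c)]
            for i in range(c):
                U[i][k] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0)) for j in range(c))
    return U, V


# Hypothetical data: two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
U, V = fcm(data, c=2)
```

The partition matrix `U` is the "result" in the sense used on the following slides: each column holds the membership grades of one datum across all clusters and sums to 1.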

Spatial time series clustering
  • Reveals available structure within data
      o In the form of partition matrices
  • Challenges
      o Different sources: spatial part vs. temporal part
      o Different dimensionality in each part
      o Different structure within each part

Spatial time series clustering (continued)
  • In spatial time series, we define: [definition not preserved]
  • Adopted FCM objective function: [equation not preserved]
  • Characteristics
      o When λ = 0: only the spatial part of the data is used in clustering
      o A higher value of λ: a higher impact of the time series part in clustering
      o Optimal value of λ: optimal impact of each part in clustering
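
The listed characteristics are consistent with a distance that adds λ times the temporal distance to the spatial distance. The exact definition is not given in this text, so the form below is an assumption chosen to match those characteristics; the function names and sample values are hypothetical.

```python
def sq_euclid(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def augmented_sq_distance(point, prototype, lam):
    """Assumed form: ||x_s - v_s||^2 + lam * ||x_t - v_t||^2.

    `point` and `prototype` are (spatial_part, temporal_part) pairs.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    (xs, xt), (vs, vt) = point, prototype
    return sq_euclid(xs, vs) + lam * sq_euclid(xt, vt)


# Hypothetical datum and prototype: 2-D coordinates plus a 3-step subsequence.
x = ((0.0, 0.0), [1.0, 2.0, 3.0])
v = ((3.0, 4.0), [1.0, 2.0, 5.0])
d_spatial_only = augmented_sq_distance(x, v, lam=0.0)   # temporal part ignored
d_balanced = augmented_sq_distance(x, v, lam=2.0)       # temporal part weighted by 2
```

Substituting this distance for the plain Euclidean distance in an FCM loop gives one way to let a single parameter trade off the spatial and temporal parts.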

Spatial time series clustering: optimal value of λ [figure]

Assigning anomaly scores to clusters in different time windows
  • Assign an anomaly score to each single subsequence based on historical data
  • Aggregate the anomaly scores inside the revealed clusters
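
The two steps can be illustrated with a small sketch. Both rules used here are assumptions, since the presentation text does not state them: the per-subsequence score is an absolute z-score of the subsequence mean against historical values, and the cluster score is a membership-weighted average of those scores. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def subsequence_score(subseq, history):
    """Assumed scoring rule: |mean(subseq) - mean(history)| / std(history)."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return 0.0 if mean(subseq) == mu else float("inf")
    return abs(mean(subseq) - mu) / sigma


def cluster_scores(U, scores):
    """Assumed aggregation: membership-weighted average of scores per cluster.

    U[i][k] is the membership of subsequence k in cluster i.
    """
    return [sum(u * s for u, s in zip(row, scores)) / sum(row) for row in U]


# Hypothetical historical values for one location and two current subsequences.
history = [2, 4, 4, 4, 5, 5, 7, 9]
scores = [subsequence_score([9, 9], history), subsequence_score([5, 5], history)]

# Hypothetical partition matrix: 2 clusters x 2 subsequences.
U = [[0.9, 0.1],
     [0.1, 0.9]]
per_cluster = cluster_scores(U, scores)
```

With this weighting, a cluster's score is dominated by the subsequences that belong to it most strongly, so a few strongly anomalous members raise the score of their own cluster more than that of its neighbors.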

Visualizing the propagation of anomalies: fuzzy relations
  • Objective: quantifying relations between clusters
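
To make the idea of a relation between clusters concrete, the sketch below uses a simple membership-overlap measure between the partition matrices of two successive time windows. This is a swapped-in illustration, not the optimization-based fuzzy relation construction of the talk; the function name and the sample matrices are hypothetical, and both matrices are assumed to describe the same spatial points in the same order.

```python
def overlap_relation(A, B):
    """R[i][j] in [0, 1]: share of cluster i's membership mass (window t)
    that is also held by cluster j (window t+1), using min as the overlap."""
    R = []
    for a in A:
        mass = sum(a)
        R.append([
            sum(min(x, y) for x, y in zip(a, b)) / mass if mass > 0 else 0.0
            for b in B
        ])
    return R


# Hypothetical crisp partitions of four points in two successive windows.
A = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0]]
B = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 1.0]]
R = overlap_relation(A, B)
```

A row of `R` with one dominant entry suggests that a cluster persisted into the next window, while a row spread over several entries suggests that it split. Pairing such relations with cluster anomaly scores is one way to trace how an anomaly moves between clusters over time.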

Visualizing the propagation of anomalies (continued)
  • Objective function to construct the relation: [equation not preserved]
  • Optimization: [equation not preserved]

Example
  • An outbreak
      o In the southern part of Alberta
      o Using NAADSM for 100 days

Example (continued)
  • A sliding window is used
      o Length: 20
      o Movement: 10
  • Generated spatio-temporal subsequences: [figure]
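
As a quick check of these settings, the window positions can be enumerated directly. The sketch assumes one observation per day over the 100 simulated days and keeps only full windows; neither assumption is stated in the presentation.

```python
n_days, length, movement = 100, 20, 10

# 0-based start index of every full window.
starts = list(range(0, n_days - length + 1, movement))

# 1-based inclusive day ranges covered by each window.
windows = [(s + 1, s + length) for s in starts]

for first_day, last_day in windows:
    print(f"days {first_day}-{last_day}")
```

Under these assumptions each location contributes nine overlapping subsequences, with consecutive windows sharing half of their days.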

Application
  • Implemented for Agriculture and Rural Development (Government of Alberta)
  • Using KNIME (Konstanz Information Miner)
  • Animal health surveillance in Alberta
  • Anomaly detection
  • Data visualization

Conclusions
  • A framework for anomaly detection and characterization in spatial time series is developed
  • A sliding window is used to generate a set of spatio-temporal subsequences
  • Clustering is used to discover the available structure within the spatio-temporal subsequences
  • An anomaly score is assigned to each revealed spatio-temporal cluster
  • A fuzzy relation technique is proposed to quantify the relations between clusters in successive time steps

Thank you