Matrix Profile Examples

• While you can find motifs and discords (anomalies) without computing the Matrix Profile, examining the full Matrix Profile can often give you extra insights and context.
• In this document we present some more or less random examples of the Matrix Profile for diverse datasets, in order to sharpen your intuitions.
• If you look in the notes box, you can find enough details to reproduce the demonstrations.
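The slides do not spell out how a Matrix Profile is computed. As a rough sketch only, assuming the common definition (for each subsequence, the z-normalized Euclidean distance to its nearest non-trivial neighbor), a brute-force version might look like the code below. The function names, the exclusion zone of half the subsequence length, and the synthetic sine data are all assumptions made for illustration. None of them come from the slides, and real implementations use much faster algorithms.

```python
import math

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(v - mean) / std for v in x]

def matrix_profile(ts, m):
    """Brute-force matrix profile, O(n^2 * m); for illustration only.

    profile[i] is the distance from subsequence i to its nearest neighbor,
    index[i] is where that neighbor starts.
    """
    n_sub = len(ts) - m + 1
    subs = [znorm(ts[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 2)  # assumed exclusion zone, skips trivial self-matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) < excl:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index

# Synthetic data (not from the slides): a sine wave with half a cycle flattened.
period = 20
ts = [math.sin(2 * math.pi * t / period) for t in range(200)]
for t in range(100, 110):
    ts[t] = 0.0

profile, index = matrix_profile(ts, m=period)

# Highest value = top discord; lowest value = one member of the best motif pair.
discord = max(range(len(profile)), key=profile.__getitem__)
motif = min(range(len(profile)), key=profile.__getitem__)
print("top discord starts at", discord)
print("best motif pair starts at", motif, "and", index[motif])
```

Reading the result mirrors how the plots in the following slides are read: peaks in the profile mark subsequences with no close match anywhere else (discords), and valleys mark subsequences that repeat almost exactly (motifs).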

Discord Example (macecgdb/test 21_90 j)

Let us start with a simple example from a two-lead ECG trace. In lead 1 (ECG 1), the Matrix Profile seems to indicate a single anomaly. If we examine the second lead (ECG 2) instead, the Matrix Profile is suggestive of two discords.

[Figure: panels ECG 1 (the first discord marked) and ECG 2 (the first and second discords marked). Subsequence length = 300.]

Discord Example (MIT-BIH Long-Term ECG Database)

Here we consider a much longer example than in the last slide. In this case there are two anomalies annotated by MIT cardiologists, and the Matrix Profile clearly indicates them. Here the subsequence length was set to 150, but we still find these anomalies if we halve or triple that length.

[Figure: the ECG trace and its Matrix Profile. The first discord is a premature ventricular contraction; the second discord is an ectopic beat.]
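The remark about halving or tripling the subsequence length can be checked mechanically. The sketch below is a hedged illustration on synthetic data, not the ECG recording. It assumes a brute-force nearest-neighbor search with z-normalized Euclidean distance and an exclusion zone of half the subsequence length. The helper name `top_discord` and the base length of 20 are hypothetical choices for this example.

```python
import math

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(v - mean) / std for v in x]

def top_discord(ts, m):
    """Start index of the subsequence farthest from its nearest neighbor."""
    n_sub = len(ts) - m + 1
    subs = [znorm(ts[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 2)  # assumed exclusion zone
    best_i, best_d = -1, -1.0
    for i in range(n_sub):
        nn = min(math.dist(subs[i], subs[j])
                 for j in range(n_sub) if abs(i - j) >= excl)
        if nn > best_d:
            best_i, best_d = i, nn
    return best_i

# Synthetic data (an assumption): a sine wave with one flattened half cycle.
period = 20
ts = [math.sin(2 * math.pi * t / period) for t in range(400)]
anomaly = range(200, 210)
for t in anomaly:
    ts[t] = 0.0

base_m = 20
results = {}
for m in (base_m // 2, base_m, 3 * base_m):  # halve, keep, triple
    start = top_discord(ts, m)
    # The discord window [start, start + m - 1] should overlap the anomaly.
    overlaps = start <= anomaly[-1] and start + m - 1 >= anomaly[0]
    results[m] = (start, overlaps)
    print(f"m={m:3d}  discord starts at {start:3d}  overlaps anomaly: {overlaps}")
```

On this toy series the top discord overlaps the injected anomaly at all three lengths. Real data may be less forgiving, so a sweep like this is a sanity check rather than a guarantee.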

Discord Example (GaitPhase Database)

Discords can detect very subtle anomalies. First consider the best motif in this dataset, to get an idea of the variability of a normal gait cycle. Now, on the same plot, we place the top discord (transparent blue). As we can see, there is an unexpected “bump” that happens just after the foot falls.

[Figure: the time series and its Matrix Profile, with the best motif and the top discord overlaid. An annotation reads “This region is the source of the anomaly”.]

Motif Example (mimic2wdb/35/3502218), Respiration

Sometimes you find motifs at unexpected scales. In this dataset, a subsequence length of 250 seems about right based on visual inspection and our limited knowledge of the domain. Moreover, motif 1 and motif 2 do make a lot of sense:

• motif 1: A deep breath forces the sensor up to the hard limit of the machine's maximum value; then, at the halfway point, the breath is relaxed.
• motif 2: rhythmic breathing.

However, at ten times that length, there is an unexpectedly well-conserved long motif.

[Figure: short motifs (length 250) and an (unexpectedly) long motif (length 2500).]

Discord Example (ann_gun_CentroidA)

The time series was extracted from a video of an actor performing various actions with and without a replica gun. The film strip below illustrates a typical sequence. The time series measures the Y coordinate of the actor's right hand. The actor draws a replica gun from a hip-mounted holster, aims it at a target, and returns it to the holster.

The discord happens when the actor misses the holster when returning the gun. An off-camera (inaudible) remark is made, the actor looks toward the video technician, and convulses with laughter. At one point (frame 450), she is literally bent double with laughter.

Motif Example: Note that the best motif pair occurs at the end of this cycle, as the actor settles into a practiced rhythm.

Discord Example (Italy Power Demand 1995 to 1998)

Motif Example: The top motif is a typical work week, starting from Tuesday.

Note that the matrix profile is very low on average: most weeks are similar to the previous week (persistence) or the same week in a different year (history). All the high values can be explained by Italian holidays, most of which fall on different days in consecutive years.

[Figure: the top motif (with the weekend marked) and the matrix profile, whose peaks are labeled with holidays including New Year, Xmas to New Year, Good Friday, Easter, Labor Day (1 May), Ferragosto (Assumption of Mary), and All Saints' Day.]

Motif Example (“The Raven” in MFCC representation)

If you are familiar with this classic poem, then just by looking at the low values in the Matrix Profile you can immediately guess the repeated or rhyming text that produced them. To the right are two examples that jumped out at us:

• “…his shadow on the floor” and “…lies floating on the floor”
• “…bird above his chamber door”, “…bust above his chamber door” and “…rapping at my chamber door”

Motif Example (Zebra Finch Vocalizations in MFCC, 100-day-old male)

Motif discovery can often surprise you. While it is clear that this time series is not random, we did not expect the motifs to be so well conserved or repeated so many times.

[Figure: the time series with motif 1, motif 2, and motif 3 highlighted; scale bar of 2 seconds.]

Taxi Example

Given a long time series, where should you examine carefully? This problem is called “Attention Prioritization” [a], and the matrix profile can be used for it. In the plot, the arrow points to the week of Thanksgiving (11/27), when the number of passengers dips. The (two-day) matrix profile peaks there.

[a] http://futuredata.stanford.edu/ASAP/extended.pdf
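One simple way to turn a matrix profile into a short list of places to examine is to rank its peaks while skipping candidates that overlap an earlier pick. The sketch below illustrates that idea only. It is not the method of reference [a], and the function name `top_k_discords`, the suppression rule, and the toy profile values are all assumptions rather than anything taken from the taxi data.

```python
def top_k_discords(profile, m, k):
    """Return up to k start indices with the highest profile values.

    A candidate is skipped if it lies within m positions of an earlier pick,
    so the same event is not reported several times.
    """
    order = sorted(range(len(profile)), key=lambda i: profile[i], reverse=True)
    picks = []
    for i in order:
        if all(abs(i - p) >= m for p in picks):
            picks.append(i)
            if len(picks) == k:
                break
    return picks

# Toy profile values (made up for illustration): one large peak near index 3
# and a smaller one near index 8.
profile = [0.2, 0.3, 2.9, 3.1, 2.8, 0.4, 0.3, 1.5, 1.7, 0.2, 0.1, 0.3]
ranked = top_k_discords(profile, m=3, k=2)
print(ranked)
```

Here the neighbors of the largest peak (indices 2 and 4) are suppressed, so the second entry in the ranking is the separate, smaller peak instead of a near-duplicate of the first.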