Processing Sequential Sensor Data
John Krumm, Microsoft Research, Redmond, Washington USA
jckrumm@microsoft.com

Interpret a Sequential Signal
[Figure: a 1-D signal plotted against time (seconds)]
The signal is
• Often a function of time (as above)
• Often from a sensor

Pervasive/Ubicomp Examples
Signal sources: accelerometer, light sensor, gyro sensor, indoor location, GPS, microphone, …
Interpretations: speed, mode of transportation, location, moving vs. not moving, proximity to other people, emotion, …

Goals of this Tutorial
• Confidence to add sequential signal processing to your research
• Ability to assess research with simple sequential signal processing
  • Know the terminology
  • Know the basic techniques
    • How to implement
    • Where they're appropriate
  • Assess numerical results in an accepted way
• At least give the appearance that you know what you're talking about

Not Covering
• Regression – fit a function to data
• Classification – classify things based on measured features
• Statistical tests – determine if data support a hypothesis

Outline
• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Signal Dimensionality
1-D: z(t)
2-D: z(t) = [z1(t), z2(t)] (bold means vector)
[Figure: a 1-D signal vs. time (seconds) and a 2-D signal plotted in the (z1, z2) plane (meters)]

Sampled Signal
Cannot measure nor store a continuous signal, so take samples instead:
[ z(0), z(Δ), z(2Δ), … , z((n-1)Δ) ] = [ z1, z2, z3, … , zn ]
Δ = sampling interval, e.g. 1 second, 5 minutes, …
[Figure: a 1-D signal over 5 seconds sampled at Δ = 0.1 seconds]

Signal + Noise
zi = xi + vi
• zi: measurement from the noisy sensor
• xi: actual value, but unknown
• vi: random number representing sensor noise
The noise is
• Often assumed to be Gaussian
• Often assumed to be zero mean
• Often assumed to be i.i.d. (independent, identically distributed)
• vi ~ N(0, σ) for zero-mean, Gaussian, i.i.d. noise; σ is the standard deviation
[Figure: a noisy 1-D signal vs. time (seconds)]

Running Example
Track a moving person in (x, y)
• 1000 (x, y) measurements, Δ = 1 second
• measurement = actual location + noise; noise is zero mean with standard deviation = 3 meters
• Also 10 randomly inserted outliers with N(0, 15)
[Figure: actual path and measured locations in meters, with the start point and an outlier marked]
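As a rough illustration, here is a minimal Python sketch of how this kind of synthetic data could be generated. The path below is a made-up placeholder, not the actual path from the slides, and the names (actual, measured) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                                   # 1000 (x, y) measurements, Δ = 1 second
t = np.arange(n)

# Placeholder "actual" path (assumption; not the path from the slides).
actual = np.stack([0.1 * t, 50 + 30 * np.sin(2 * np.pi * t / n)], axis=1)

# Measurement = actual location + zero-mean Gaussian noise, σ = 3 meters.
measured = actual + rng.normal(0.0, 3.0, size=actual.shape)

# 10 randomly inserted outliers drawn with σ = 15 meters.
outliers = rng.choice(n, size=10, replace=False)
measured[outliers] = actual[outliers] + rng.normal(0.0, 15.0, size=(10, 2))
```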

Outline
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Mean Filter
• Also called "moving average" and "box car filter"
• Apply to the x and y measurements separately
[Figure: the filtered version of a point is the mean of the points in a sliding window ending at that point]
• "Causal" filter because it doesn't look into the future
• Causes lag when values change sharply
• The lag can be reduced with decaying weights, e.g. weights that shrink for older samples
• Sensitive to outliers, i.e. one really bad point can cause the mean to take on any value
• Simple and effective (I will not vote to reject your paper if you use this technique)
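A minimal sketch of a causal mean filter is below; the window size k = 10 matches the "10 points in each mean" used later, but the toy data and names are assumptions for illustration.

```python
import numpy as np

def mean_filter(z, k=10):
    """Causal moving average: output i is the mean of the last k samples
    (fewer near the start, before k samples exist)."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    for i in range(len(z)):
        out[i] = z[max(0, i - k + 1) : i + 1].mean(axis=0)
    return out

# Toy usage on noisy (x, y) measurements; mean(axis=0) treats each
# coordinate separately, as the slide suggests.
rng = np.random.default_rng(0)
measured = np.cumsum(rng.normal(1.0, 0.1, size=(100, 2)), axis=0) + rng.normal(0, 3, (100, 2))
smoothed = mean_filter(measured, k=10)
```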

Mean Filter
[Figure: actual path and measured locations vs. the mean-filtered path (meters), 10 points in each mean]
• The outlier has a noticeable impact
• If only there were some convenient way to fix this …

Median Filter
[Figure: the filtered version of a point is the median of the points in a sliding window, which is insensitive to the value of an outlier in the window]
The median is much less sensitive to outliers than the mean:
• median(1, 3, 4, 7, 1×10^10) = 4
• mean(1, 3, 4, 7, 1×10^10) ≈ 2×10^9
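A matching sketch of a causal median filter, with the slide's outlier example; the window size and names are again illustrative.

```python
import numpy as np

def median_filter(z, k=10):
    """Causal sliding-window median; far less sensitive to outliers than the mean."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    for i in range(len(z)):
        out[i] = np.median(z[max(0, i - k + 1) : i + 1], axis=0)
    return out

# The slide's example: one absurd outlier barely moves the median.
window = [1, 3, 4, 7, 1e10]
print(np.median(window))   # 4.0
print(np.mean(window))     # ~2e9
```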

Median Filter
[Figure: actual path and measured locations vs. the median-filtered path (meters), 10 points in each median]
• The outlier has noticeably less impact

Mean and Median Filter
[Figure: mean-filtered and median-filtered paths compared (meters)]
Editorial: the median is almost always better to use than the mean.

Outline
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Kalman Filter
• Mean and median filters assume smoothness
• The Kalman filter adds an assumption about the trajectory (e.g. an assumed parabolic trajectory)
• Weighs the data against assumptions about the system's dynamics
[Figure: a scale balancing "data" against "dynamics"; also the cover of my favorite book on Kalman filtering]
Big difference #1: the Kalman filter includes (helpful) assumptions about the behavior of the measured process

Kalman Filter
The Kalman filter separates measured variables from state variables.
• Measure: the running example measures the (x, y) coordinates (noisy)
• Infer state: the running example estimates location and velocity (!)
[Figure: measured path in meters]
Big difference #2: the Kalman filter can include state variables that are not measured directly

Kalman Filter Measurements
(Sleepy eyes threat level: orange)
The measurement vector is related to the state vector by a matrix multiplication plus noise: zi = H xi + vi.
• In this case, the measurements are just noisy copies of the actual location
• Makes the sensor noise explicit, e.g. GPS has a σ of around 5 meters

Kalman Filter Dynamics
Insert a bias for how we think the system will change through time: xi = φ xi-1 + wi.
• Location follows standard straight-line motion
• Velocity changes randomly (because we don't have any idea what it actually does)

Kalman Filter Ingredients
• H matrix: gives measurements for a given state
• Measurement noise: sensor noise
• φ matrix: gives the time dynamics of the state
• Process noise: uncertainty in the dynamics model

Kalman Filter Recipe
(Sleepy eyes threat level: red)
• Just plug in the measurements and go
• Recursive filter – the current time step uses the state and error estimates from the previous time step
Big difference #3: the Kalman filter gives an uncertainty estimate in the form of a Gaussian covariance matrix
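For concreteness, here is a minimal constant-velocity Kalman filter sketch for the running example. The matrices follow the ingredients above (H, φ, measurement noise, process noise), but the initial covariance and noise values are illustrative assumptions, not numbers from the slides.

```python
import numpy as np

def kalman_track(measurements, dt=1.0, sigma_z=3.0, sigma_s=1.0):
    """Minimal constant-velocity Kalman filter for (x, y) tracking.
    State = [x, y, vx, vy]; sigma_z is the measurement noise, sigma_s is the
    (tunable) process-noise standard deviation on the velocity."""
    phi = np.array([[1, 0, dt, 0],       # straight-line motion for position
                    [0, 1, 0, dt],
                    [0, 0, 1,  0],       # velocity assumed roughly constant
                    [0, 0, 0,  1]], dtype=float)
    H = np.array([[1, 0, 0, 0],          # we measure position only
                  [0, 1, 0, 0]], dtype=float)
    R = (sigma_z ** 2) * np.eye(2)                    # measurement noise covariance
    Q = np.diag([0, 0, sigma_s ** 2, sigma_s ** 2])   # process noise on velocity

    x = np.array([*measurements[0], 0.0, 0.0])        # start at the first measurement
    P = np.eye(4) * 100.0                             # large initial uncertainty (assumption)
    estimates = []
    for z in measurements:
        # Predict: push state and covariance through the dynamics.
        x = phi @ x
        P = phi @ P @ phi.T + Q
        # Update: blend the prediction with the new measurement.
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        estimates.append(x[:2].copy())
    return np.array(estimates)

# Usage (hypothetical data): estimates = kalman_track(measured, sigma_s=0.5)
```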

Kalman Filter
Velocity model: straight-line motion with random velocity changes.
[Figure: Kalman-filtered path vs. actual path (meters)]
• Smooth
• Tends to overshoot corners
• Too much dependence on the straight-line velocity assumption
• Too little dependence on the data

Kalman Filter
[Figure: untuned vs. tuned Kalman-filtered paths (meters)]
• It is hard to pick the process noise σs
• The process noise models our uncertainty in the system dynamics
• Here it accounts for the fact that the motion is not a straight line
• "Tuning" σs (by trying a bunch of values) gives a better result
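A crude way to "tune" σs is exactly what the slide says: try a bunch of values and keep the one with the lowest error against ground truth. The sketch below assumes a filter function such as the kalman_track() sketch above and a ground-truth actual array; the candidate values and names are illustrative.

```python
import numpy as np

def tune_process_noise(filter_fn, measured, actual,
                       candidates=(0.01, 0.1, 0.5, 1.0, 2.0, 5.0)):
    """Pick the process-noise sigma with the lowest mean Euclidean error.
    filter_fn(measured, sigma) must return an (n, 2) array of estimates,
    e.g. lambda m, s: kalman_track(m, sigma_s=s) from the sketch above."""
    def mean_err(sigma):
        est = filter_fn(measured, sigma)
        return np.mean(np.linalg.norm(est - np.asarray(actual), axis=1))
    return min(candidates, key=mean_err)
```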

Kalman Filter
Editorial: the Kalman filter was fine back in the old days, but I really prefer more modern methods that are not saddled with Kalman's restriction to continuous state variables and linearity assumptions.

Outline
• Introduction (already done!)
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Particle Filter
WiFi tracking in a multi-floor building (Dieter Fox et al.)
• Multiple "particles" as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data

Particle Filter (Dieter Fox et al.)
• Allows multi-modal uncertainty (Kalman is a unimodal Gaussian)
• Allows continuous and discrete state variables (e.g. 3rd floor)
• Allows a rich dynamic model (e.g. must follow the floor plan)
• Can be slow, especially if the state vector dimension gets too large, e.g. (x, y, identity, activity, next activity, emotional state, …)

Particle Filter Ingredients: measurement model p(zi | xi)
• z = measurement, x = state; not necessarily the same things
• Probability distribution of a measurement given the actual state
• Can be anything, not just Gaussian like the Kalman filter, e.g. the measured speed (in z) will be slower if the emotional state (in x) is "tired"
• But we use a Gaussian for the running example, just like the Kalman filter: the measurement is a noisy version of the actual value

Particle Filter Ingredients: dynamics p(xi | xi-1)
• Probabilistic dynamics – how the state changes through time
• Can be anything, e.g. tend to go slower up hills, avoid left turns, attracted to Scandinavian people
• A closed form is not necessary – just need a dynamic simulation with a noise component
• But we use a Gaussian for the running example, just like the Kalman filter

Home Example
Rich measurement and state dynamics models.
Measurements: z = ((x, y) location in the house from WiFi)
State (what we want to estimate): x = (room, activity)
• p((x, y) in kitchen | in bathroom) = 0
• p(sleeping now | sleeping previously) = 0.9
• p(cooking now | working previously) = 0.02
• p(watching TV & sleeping | *) = 0
• p(bedroom 4 | master bedroom) = 0

Particle Filter Algorithm
(Sleepy eyes threat level: orange)
Start with N instances of the state vector xi(j), i = 0, j = 1 … N
1. i = i + 1
2. Take a new measurement zi
3. Propagate the particles forward in time with p(xi | xi-1), i.e. generate new, random hypotheses
4. Compute importance weights wi(j) = p(zi | xi(j)), i.e. how well does the measurement support each hypothesis?
5. Normalize the importance weights so they sum to 1.0
6. Randomly pick new particles based on the importance weights
7. Go to 1
Compute the state estimate from the particles, e.g. the weighted mean (assumes unimodal) or the median. A sketch of the whole loop follows below.
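Here is a minimal bootstrap particle filter following the recipe above, specialized to the running example (Gaussian measurement and motion models, state = location and velocity). The particle count, noise values, and names are illustrative assumptions.

```python
import numpy as np

def particle_filter(measurements, n_particles=1000, sigma_z=3.0, sigma_v=1.0, dt=1.0):
    """Minimal bootstrap particle filter: propagate with p(x_i | x_{i-1}),
    weight with p(z_i | x_i), then resample. State per particle = [x, y, vx, vy]."""
    rng = np.random.default_rng(0)
    measurements = np.asarray(measurements, dtype=float)
    n = n_particles
    # Initialize particles around the first measurement with zero velocity.
    particles = np.zeros((n, 4))
    particles[:, :2] = measurements[0] + rng.normal(0, sigma_z, size=(n, 2))

    estimates = []
    for z in measurements:
        # Step 3. Propagate: straight-line motion plus a random velocity change.
        particles[:, :2] += particles[:, 2:] * dt
        particles[:, 2:] += rng.normal(0, sigma_v, size=(n, 2))
        # Step 4. Importance weights: how well does the measurement support each hypothesis?
        d2 = np.sum((particles[:, :2] - z) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / sigma_z ** 2)
        # Step 5. Normalize (guard against all-zero weights from a far outlier).
        w_sum = w.sum()
        w = w / w_sum if w_sum > 0 else np.full(n, 1.0 / n)
        # State estimate: weighted mean (assumes a unimodal distribution).
        estimates.append(np.average(particles[:, :2], axis=0, weights=w))
        # Step 6. Resample: pick new particles in proportion to their weights.
        idx = rng.choice(n, size=n, p=w)
        particles = particles[idx]
    return np.array(estimates)
```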

Particle Filter
WiFi tracking in a multi-floor building (Dieter Fox et al.)
• Multiple "particles" as hypotheses
• Particles move based on a probabilistic motion model
• Particles live or die based on how well they match the sensor data

Particle Filter Running Example
• Measurement model reflects the true, simulated measurement noise – same as the Kalman filter in this case
• Motion model: straight-line motion for location with random velocity changes (because we don't have any idea what the velocity actually does) – same as the Kalman filter in this case
[Figure: actual path vs. particle filter estimate (legend: "Particle 1000000") in meters]
• Sometimes increasing the number of particles helps

Particle Filter Resources
• UbiComp 2004
• Especially Chapter 1

Particle Filter
Editorial: the particle filter is wonderfully rich and expressive if you can afford the computations. Be careful not to let your state vector get too large.

Outline
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Hidden Markov Model (HMM)
Big difference from the previous methods: the states are discrete, e.g.
• Spoken phonemes
• {walking, driving, biking, riding bus}
• {moving, still}
• {cooking, sleeping, watching TV, playing game, …}
[Photos: Markov, 1856–1922; and a "hidden" Markov]

(Unhidden) Markov Model
[Figure: state diagram over {walk, bus, drive} with transition probabilities (0.9, 0.7, 0.2, 0.1, 0.0, …) on the arcs]
• Move to a new state (or not)
  • at every time click
  • when finished with the current state
• Transition probabilities control the state transitions
Example inspired by: UbiComp 2003

Hidden Markov Model
[Figure: the same {walk, bus, drive} state diagram, but now hidden behind a noisy sensor (accelerometer)]
• Can "see" the states only via a noisy sensor

HMM: Two Parts
Two parts to every HMM:
1) Observation probabilities P(Xi(j) | zi) – probability of state j given the measurement at time i
2) Transition probabilities ajk – probability of a transition from state j to state k
• Find the path that maximizes the product of the probabilities (observation & transition)
• Use the Viterbi algorithm to find this path efficiently (a sketch follows below)
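A minimal Viterbi sketch, working in log space so long sequences don't underflow; the argument names and the two-state usage comment are illustrative, not from the slides.

```python
import numpy as np

def viterbi(obs_prob, trans_prob, init_prob):
    """Most likely state sequence for an HMM.
    obs_prob[i, j]   = probability tied to state j at time i (from the sensor)
    trans_prob[j, k] = probability of moving from state j to state k"""
    log_obs = np.log(obs_prob + 1e-300)
    log_trans = np.log(trans_prob + 1e-300)
    n_steps, n_states = obs_prob.shape

    score = np.log(init_prob + 1e-300) + log_obs[0]
    back = np.zeros((n_steps, n_states), dtype=int)
    for i in range(1, n_steps):
        cand = score[:, None] + log_trans            # score of reaching state k from each j
        back[i] = np.argmax(cand, axis=0)            # best predecessor for each state k
        score = cand[back[i], np.arange(n_states)] + log_obs[i]

    # Trace the best path backwards.
    path = np.zeros(n_steps, dtype=int)
    path[-1] = int(np.argmax(score))
    for i in range(n_steps - 1, 0, -1):
        path[i - 1] = back[i, path[i]]
    return path

# Two-state still/moving usage with made-up transition numbers:
# trans = np.array([[0.99989, 0.00011], [0.00011, 0.99989]])
# path = viterbi(obs, trans, init_prob=np.array([0.5, 0.5]))
```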

Smooth Results with HMM
[Figure: signal strength vs. time (seconds), labeled still / moving / still; the goal is to classify moving vs. still]
• Signal strength has higher variance when moving → observation probabilities
• Transitions between states are relatively rare, e.g. self-transition probabilities of 0.99989 and cross-transition probabilities of 0.00011 (made-up numbers) → transition probabilities

Smooth Results with HMM
[Figure: trellis of still/moving states over time, with observation probabilities at the nodes and transition probabilities (0.99989, 0.00011) on the arcs]
• The Viterbi algorithm finds the path with the maximum product of observation and transition probabilities
• Results in fewer false transitions between states, i.e. smoother and slightly more accurate output

Running Example
• Discrete states are 10,000 1 m × 1 m squares
• Observation probabilities spread as a Gaussian over nearby squares, as per the measurement noise model
• Transition probabilities go to the 8-connected neighbors

HMM Reference
• Good description of the Viterbi algorithm
• Also covers how to learn the model from data

Hidden Markov Model
Editorial: the HMM is great for certain applications when your states are discrete.
Tracking in (x, y, z) with an HMM?
• Huge state space (→ slow)
• Long dwells
• Interactions with other airplanes

Outline
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results

Presenting Continuous Performance Results
Error = Euclidean distance between the estimated value and the actual value.
• Plot the mean or median of the Euclidean distance error
• The median is less sensitive to error outliers
[Figure: tracking error (meters) vs. filter – measured, mean, median, Kalman (untuned), Kalman (tuned), particle, HMM – showing both mean error and median error]
Note: don't judge these filtering methods based on these plots. I didn't spend much time tuning the methods to improve their performance.

Presenting Continuous Performance Results
Cumulative error distribution
• Shows how the errors are distributed
• More detailed than just a mean or median error
[Figure: cumulative error distribution – fraction vs. error (meters) – for the mean, median, Kalman (untuned), Kalman (tuned), particle, and HMM filters]
• 50% of the time, the particle filter gives an error of 2 meters or less (median error)
• 95% of the time, the particle filter gives an error of 6 meters or less (95th percentile error)
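A small sketch of how these summary numbers and the cumulative error distribution can be computed from per-sample Euclidean errors; the function and key names are illustrative.

```python
import numpy as np

def error_summary(estimated, actual):
    """Per-sample Euclidean error plus the numbers usually reported:
    mean, median, and 95th-percentile error, and the sorted errors
    needed to draw a cumulative error distribution."""
    err = np.linalg.norm(np.asarray(estimated) - np.asarray(actual), axis=1)
    sorted_err = np.sort(err)
    cdf = np.arange(1, len(err) + 1) / len(err)   # plot sorted_err (x) vs. cdf (y)
    return {
        "mean": float(np.mean(err)),
        "median": float(np.median(err)),
        "p95": float(np.percentile(err, 95)),
        "cdf": (sorted_err, cdf),
    }
```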

Presenting Discrete Performance Results
Techniques like the particle filter and HMM can classify sequential data into discrete classes.
• Report a confusion matrix
[Figure: example confusion matrix from a Pervasive 2006 paper]
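A minimal confusion-matrix sketch; the activity labels in the toy usage are made up for illustration.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows = true class, columns = predicted class; entry [j, k] counts how
    often class j was labeled as class k."""
    index = {c: i for i, c in enumerate(classes)}
    m = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        m[index[t], index[p]] += 1
    return m

# Toy usage with made-up activity labels:
print(confusion_matrix(
    ["walk", "walk", "bus", "drive"],
    ["walk", "bus",  "bus", "drive"],
    classes=["walk", "bus", "drive"]))
```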

End
• Introduction
• Signal terminology and assumptions
• Running example
• Filtering
• Mean and median filters
• Kalman filter
• Particle filter
• Hidden Markov model
• Presenting performance results
[Figures: the running example's actual path and measured locations (meters), and the tracking error vs. filter bar chart (mean and median error for mean, median, Kalman (untuned), Kalman (tuned), particle, and HMM)]


Ubiquitous Computing Fundamentals, CRC Press, © 2010