Smart Phone-Based Sensor Mining
Gary M. Weiss
Fordham University
gweiss@cis.fordham.edu
Background and Motivation
• Smart phones are ubiquitous
  - As of the 4th quarter of 2010, smart phone sales outpaced PC sales
  - We carry them everywhere at almost all times
• Smart phones are powerful
  - Increasing processing power and storage space
  - Filled with sensors
• Smart phones include the following sensors:
  ▪ Tri-axial accelerometer
  ▪ Location sensor (GPS, cell tower, WiFi)
  ▪ Audio sensor (microphone), image sensor (camera)
  ▪ Proximity, light, temperature, magnetic compass
5/17/2012 Gary M. Weiss Einstein
Data Mining and Sensor Mining
• Data mining: application of computational methods to extract knowledge from data
  - Most data mining involves inferring predictive models, often for classification
• Sensor mining: application of computational methods to extract knowledge from sensor data
• Supervised machine learning
  - Obtain labeled time-series training data
  - Create examples described by generated features
  - Build a model to predict an example's label
The WISDM Project
• Three years ago we started what is now WISDM
  - Began with a focus on activity recognition
    ▪ Determine what a user is doing based on accelerometer data
  - Moved to an Android-based smartphone platform
  - Expanded to other applications
    ▪ Biometric identification
    ▪ Identifying user characteristics (soft biometrics)
    ▪ Mining GPS data (project starting with the Bronx Zoo)
  - Current focus is on Actitracker
    ▪ Track user activities and present the information to the user via the web as a health app (NSF “Health and Well-Being” grant)
The WISDM Platform
• Based on Android smartphones, but could be extended to other mobile devices
• Client/server architecture
  - Smartphones are the clients (they run our app)
  - We have a dedicated server
  - Right now raw data is sent to the server and processing occurs there
  - Data can be streamed or sent on demand
  - In the future, more responsibility will move to the phone
WISDM Platform Continued
• Web interface
  - Users can access their data via a web interface
    ▪ Accessible from a smartphone or a full-screen computer
• Security
  - Secure logins and encrypted data
• Resource issues
  - Power is an issue if we collect GPS data, and possibly if we collect data 24x7, but not for periodic data collection
Smart Phone Accelerometer
• Measures acceleration along 3 spatial axes
• Detects/measures gravity (orientation matters)
• Measurement range typically -2 g to +2 g
  - Okay for most activities, but falling yields higher values
  - Range and sensitivity may be adjustable
• Sampling rates of roughly 20-50 Hz
  - A study found that 20 Hz is required for activity recognition
  - The WISDM project found that it could not reliably sample beyond 20 Hz (one sample every 50 ms), and this may impact activity recognition
Existing WISDM Applications
• Activity recognition
  - Identify the activity a user is performing (walking, jogging, sitting, etc.)
• Biometric identification
  - Identify a user based on prior accelerometer data collected from that user
• Trait identification
  - Identify characteristics of a user (height, weight, age)
Why is Activity Recognition Useful?
• Context-sensitive applications
  - Handle phone calls differently depending on context
  - Play music to suit your activity
  - New and innovative apps to make phones smarter
• Tracking and health applications
  - Track overall activity levels and generate fitness profiles
  - Care of the elderly
    ▪ Detect dangerous situations (like falling)
    ▪ Warn if someone with Alzheimer’s wanders outside of an area
Accelerometer Data for Six Activities
• Accelerometer data from an Android phone
[Figure panels: Walking, Jogging, Climbing Stairs, Lying Down, Sitting, Standing]
Accelerometer Data for “Walking” [plot]
Accelerometer Data for “Jogging” [plot]
Accelerometer Data for “Up Stairs” [plot]
Accelerometer Data for “Lying Down” [plot]
Accelerometer Data for “Sitting” [plot; label: Z axis]
Accelerometer Data for “Standing” [plot]
WISDM Activity Recognition
• Six activities: walking, jogging, stairs, sitting, standing, lying down
• Labeled data collected from over 50 users
• Data transformed via 10-second windows
  - Accelerometer data (x, y, z) sampled every 50 ms
  - Features (per axis):
    ▪ average, standard deviation, average absolute difference from the mean, average resultant acceleration, binned distribution, time between peaks
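The window transformation above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the WISDM implementation: the 10 bins per axis, the population form of the standard deviation, the peak-detection heuristic, and the function names `axis_features` and `window_features` are all hypothetical. The average resultant acceleration is computed once per window from all three axes. With those choices, each window yields 3 x 14 + 1 = 43 values.

```python
import math
from statistics import mean, pstdev

SAMPLE_PERIOD_MS = 50   # one (x, y, z) reading every 50 ms (from the slide)
WINDOW_SECONDS = 10     # window length (from the slide)
SAMPLES_PER_WINDOW = WINDOW_SECONDS * 1000 // SAMPLE_PERIOD_MS  # 200 readings
NUM_BINS = 10           # assumption: 10 equal-width bins per axis


def axis_features(values):
    """Features for one axis of one window (a list of floats)."""
    avg = mean(values)
    sd = pstdev(values)  # population SD; the slide does not say which form
    avg_abs_diff = mean([abs(v - avg) for v in values])

    # Binned distribution: fraction of readings in each equal-width bin.
    lo, hi = min(values), max(values)
    width = (hi - lo) / NUM_BINS or 1.0  # avoid zero width on a flat signal
    counts = [0] * NUM_BINS
    for v in values:
        counts[min(int((v - lo) / width), NUM_BINS - 1)] += 1
    bins = [c / len(values) for c in counts]

    # Time between peaks (hypothetical heuristic): local maxima at least
    # one SD above the mean; the average gap is converted to milliseconds.
    peaks = [i for i in range(1, len(values) - 1)
             if values[i - 1] < values[i] >= values[i + 1]
             and values[i] >= avg + sd]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    peak_gap_ms = mean(gaps) * SAMPLE_PERIOD_MS if gaps else 0.0

    return [avg, sd, avg_abs_diff, peak_gap_ms] + bins  # 14 values


def window_features(window):
    """Feature vector for one window of (x, y, z) tuples."""
    features = []
    for axis in zip(*window):  # regroup readings as x, y, z series
        features.extend(axis_features(list(axis)))
    # Average resultant acceleration: mean magnitude over the window.
    features.append(mean([math.sqrt(x * x + y * y + z * z)
                          for x, y, z in window]))
    return features


# Synthetic window, for illustration only (not WISDM data).
window = [(math.sin(i / 5.0), 9.8 + math.cos(i / 5.0), 0.5)
          for i in range(SAMPLES_PER_WINDOW)]
features = window_features(window)
print(len(features))  # 43
```

Each labeled window then becomes one training example for the classifier.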
WISDM Activity Recognition
• The 43 features are used to build a classifier
  - WEKA data mining suite used, with multiple techniques
  - Personal, universal, and hybrid models built
• Architecture (for now) uses a “dumb” client
• Basis of the soon-to-be-released Actitracker service
  - Provides a web-based view of activities over time
WISDM Results
• Results are shown for several model types
  - Personal, universal, and hybrid models
  - Most results are aggregated over all users, but a few are per user to show that performance varies by user
  - Results cover the 6 activities shown in the plots
WISDM Universal Model - IB3 Matrix
72.4% accuracy

                          Predicted Class
Actual Class   Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking        2209     46       789     2        4         0
Jogging        45       1656     148     1        0         0
Stairs         412      54       869     3        1         0
Sitting        10       0        47      553      30        241
Standing       8        0        57      6        448       3
Lying Down     5        1        7       301      13        131
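As a worked check, the accuracy figure follows from the matrix itself: correct predictions lie on the diagonal, and each row total is the number of records of that actual class. The Python sketch below copies the counts from the matrix above; the function names are illustrative only.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Rows are actual classes, columns are predicted classes (universal IB3).
UNIVERSAL_IB3 = [
    [2209, 46, 789, 2, 4, 0],
    [45, 1656, 148, 1, 0, 0],
    [412, 54, 869, 3, 1, 0],
    [10, 0, 47, 553, 30, 241],
    [8, 0, 57, 6, 448, 3],
    [5, 1, 7, 301, 13, 131],
]


def overall_accuracy(matrix):
    """Fraction of all records that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix):
    """Fraction of each actual class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


print(f"Overall: {overall_accuracy(UNIVERSAL_IB3):.1%}")  # Overall: 72.4%
for label, acc in zip(LABELS, per_class_accuracy(UNIVERSAL_IB3)):
    print(f"{label}: {acc:.1%}")
```

The per-class values (72.4, 89.5, 64.9, 62.8, 85.8, 28.6) show where the universal model struggles: stairs are often confused with walking, and lying down with sitting.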
WISDM Personal Model - IB3 Matrix
98.4% accuracy

                          Predicted Class
Actual Class   Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking        3033     1        24      0        0         0
Jogging        4        1788     4       0        0         0
Stairs         42       4        1292    1        0         0
Sitting        0        0        4       870      2         6
Standing       5        0        11      1        509       0
Lying Down     4        0        8       7        0         442
WISDM Accuracy Results
% of Records Correctly Classified

              Personal              Universal             Straw
              IB3    J48    NN      IB3    J48    NN      Man
Walking       99.2   97.5   99.1    72.4   77.3   60.6    37.7
Jogging       99.6   98.9   99.9    89.5   89.7   89.9    22.8
Stairs        96.5   91.7   98.0    64.9   56.7   67.6    16.5
Sitting       98.6   97.7   --      62.8   78.0   67.6    10.9
Standing      96.8   96.4   97.3    85.8   92.0   93.6    6.4
Lying Down    95.9   95.0   96.9    28.6   26.2   60.7    5.7
Overall       98.4   96.6   98.7    72.4   74.9   71.2    37.7

-- : one Personal value for Sitting is missing in the source; 97.7 may belong to either J48 or NN.
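The slides do not define the straw man. A common reading, assumed here, is a baseline that always predicts the most frequent class. The Python sketch below applies that assumption to the per-class record counts, taken as the row totals of the universal IB3 confusion matrix; `straw_man` and `class_share` are illustrative names.

```python
# Records per actual class: row totals of the universal IB3 matrix.
CLASS_COUNTS = {
    "Walking": 3050,
    "Jogging": 1850,
    "Stairs": 1339,
    "Sitting": 881,
    "Standing": 522,
    "Lying Down": 458,
}


def straw_man(class_counts):
    """Always predict the most frequent class; return it and its accuracy."""
    total = sum(class_counts.values())
    majority = max(class_counts, key=class_counts.get)
    return majority, class_counts[majority] / total


def class_share(class_counts):
    """Share of all records that belongs to each class."""
    total = sum(class_counts.values())
    return {label: count / total for label, count in class_counts.items()}


majority, accuracy = straw_man(CLASS_COUNTS)
print(majority, f"{accuracy:.1%}")  # Walking 37.7%
```

Under this assumption the overall straw-man accuracy is the share of walking records, 3050/8100, and the share of each class (for example 1850/8100, about 22.8%, for jogging) lines up with the per-activity Straw Man column.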
Biometric Identification
Biometrics
• Biometrics concerns unique identification based on physical or behavioral traits
  - Hard biometrics involves traits that are sufficient to uniquely identify a person
    ▪ Fingerprints, DNA, iris, etc.
  - Soft biometric traits are not sufficiently distinctive on their own, but may help
    ▪ Physical traits: sex, age, height, weight, etc.
    ▪ Behavioral traits: gait, clothes, travel patterns, etc.
Gait-Based Biometrics
• Numerous accelerometer-based systems use dedicated and/or multiple sensors
  - See the related work section of “Cell phone-based biometric identification” (Kwapisz et al. 2010) for details
• Possible uses:
  ▪ Phone security (e.g., to automatically unlock the phone)
  ▪ Automatic device customization
  ▪ Better tracking of people for shared devices
  ▪ Perhaps as a secondary level of physical security
WISDM Biometric System
• Same setup as WISDM activity recognition
  - Same data collection, feature extraction, WEKA, …
• Used for identification and authentication
  - Identification: predicting identity from a pool of users
  - Authentication: a binary class prediction problem
• Evaluate single and mixed activities
  - Evaluate using 10 seconds and several minutes of test data
    ▪ Longer samples are classified with “Most Frequent Prediction”
• Results based on 36 users
  - Results hold up in preliminary experiments with 200 users
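The “Most Frequent Prediction” strategy can be read as a majority vote over the per-window predictions. The Python sketch below shows that reading; the function name, the example labels, and the tie-breaking rule (the label seen first wins) are assumptions rather than details from the slides.

```python
from collections import Counter


def most_frequent_prediction(window_predictions):
    """Return the label predicted most often across a longer sample.

    window_predictions holds one predicted label per 10-second window.
    Ties go to the label that appears first (an assumption).
    """
    if not window_predictions:
        raise ValueError("need at least one window prediction")
    return Counter(window_predictions).most_common(1)[0][0]


# Hypothetical per-window identification output for one test sample.
predictions = ["user07", "user12", "user07", "user07", "user03"]
print(most_frequent_prediction(predictions))  # user07
```

The same vote works for authentication if each window is labeled as genuine or imposter, which is why a few wrong windows need not change the final decision.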
WISDM Biometric Prediction Results

Based on 10-second test samples:

              Aggregate  Walk   Jog    Up     Down   Aggregate (Oracle)
J48           72.2       84.0   83.0   65.8   61.0   76.1
Neural Net    69.5       90.9   92.2   63.3   54.5   78.6
Straw Man     4.3        4.2    5.0    6.5    4.7    4.3

Based on most frequent prediction for 5-10 minutes of data (same six columns; the source gives only five values per row, listed here in source order):

J48           36/36  31/32  31/31    28/31  36/36
Neural Net    36/36  32/32  28.5/31  25/31  36/36
WISDM Biometric Authentication Results
• Authentication results:
  - Positive authentication of a user
    ▪ 10-second sample: ~85%
    ▪ Most frequent class over 5-10 min: 100%
  - Negative authentication of a user (an imposter)
    ▪ 10-second sample: ~96%
    ▪ Most frequent class over 5-10 min: 100%
Biometric Identification Summary
• Can do remarkably well with short amounts of accelerometer data (10 s to 2 min)
• Since we can distinguish between the ways different people walk, we may be able to distinguish between different gaits
Trait Identification
WISDM Trait Identification
• Data collected from ~70 people (now over 200)
  - Accelerometer and survey data
  - Survey data includes anything we could think of that might somehow be predictable
    ▪ Sex, height, weight, age, race, handedness, disability
    ▪ Type of area grew up in {rural, suburban, urban}
    ▪ Shoe size, footwear type, size of heels, type of clothing
    ▪ # hours of academic work, # hours of exercise
  - Too few subjects to investigate all factors
    ▪ Many were not predictable (maybe with more data)
WISDM Trait Identification Results

Sex (accuracy 71.2%)
          Male   Female
Male      31     7
Female    12     16

Height (accuracy 83.3%)
          Short  Tall
Short     15     2
Tall      5      20

Weight (accuracy 78.9%)
          Light  Heavy
Light     13     2
Heavy     7      17

Results are for the IB3 classifier. For height and weight, the middle categories were removed.
Trait Identification Summary
• A wide-open area for data mining research
  - A marketer's dream
• Clear privacy issues
• Room for creativity and insight in finding traits
• Probably many interesting commercial and research applications
  - Imagine diagnosing back problems with your mobile phone via gait analysis …
Connections to Your Work
• Can collect accelerometer data from patients
  - On demand or in the background
  - Data transmitted wirelessly or stored on the phone for periodic download
• Can extend the study beyond gait
  - Can monitor overall activity levels
  - Can monitor daily routine
Connections to Your Work cont.
• Facilitate quantitative analysis of gait
  - “Fourth, although experienced clinicians assessed gait, quantitative analysis of gait might be more reliable” (Verghese et al. 2002)
  - Accelerometer data can provide a basis for gait classification
  - Can use data mining to learn a classifier for gait
    ▪ Just need carefully selected training data
    ▪ Yields a consistent measure
Connections to Your Work cont.
• Can look at other neurological diseases besides non-Alzheimer’s dementia
• Can try to track the progression of Alzheimer’s
• Note that we can monitor daily routine, travel, etc.
• Smartphone can also administer surveys, record video, provide voice prompts, etc.
• Besides diagnosis, can assist people suffering from these diseases
My Contact Information
• Gary Weiss
  - Fordham University, Bronx NY 10458
  - gweiss@cis.fordham.edu
  - http://storm.cis.fordham.edu/~gweiss/
• WISDM information
  - http://www.cis.fordham.edu/wisdm/
    ▪ WISDM papers available: click “About” then “Publications”
  - By the end of summer, Actitracker will allow you to track your activities via our Android app (actitracker.com)
WISDM Members
• WISDM research group
  - Current active members
    ▪ Linna AI*, Shaun Gallagher*, Andrew Grosner*, Margo Flynn, Jeff Lockhart*, Paul McHugh*, Tony Pulickal*, Greg Rivas*, Isaac Ronan*, Priscilla Twum, Bethany Wolff
  * Working full-time on the project at Fordham over the summer
References
Available from: http://www.cis.fordham.edu/wisdm/publications

Kwapisz, J. R., Weiss, G. M., and Moore, S. A. 2010. Activity recognition using cell phone accelerometers. Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data, 10-18.

Kwapisz, J. R., Weiss, G. M., and Moore, S. A. 2010. Cell phone-based biometric identification. Proceedings of the IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems.

Lockhart, J. W., Weiss, G. M., Xue, J. C., Gallagher, S. T., Grosner, A. B., and Pulickal, T. T. 2011. Design considerations for the WISDM smart phone-based sensor mining architecture. Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.

Weiss, G. M., and Lockhart, J. W. 2011. Identifying user traits by mining smart phone accelerometer data. Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data.

Weiss, G. M., and Lockhart, J. W. 2012. The impact of personalization on smartphone-based activity recognition. Proceedings of the AAAI-12 Workshop on Activity Context Representation: Techniques and Languages, Toronto, Canada.