Smart Phone-Based Sensor Mining
www.cis.fordham.edu/wisdm or wisdmproject.com
Gary M. Weiss
Comp & Info Science Dept, Fordham University
gweiss@cis.fordham.edu

What is Smart Phone Sensor Mining?
• Data mining: extraction of knowledge from data via automated methods
• Smart phone sensor mining: extraction of useful knowledge from the data generated by smart phone sensors
1/11/2012 Gary M. Weiss ICCS 2012

Smart Phone Sensors
• What sensors are found on smart phones?
  - Audio sensor (microphone)
  - Image sensor (camera, video recorder)
  - Tri-axial accelerometer
  - Location sensor (GPS, cell tower, Wi-Fi)
  - Infrared proximity sensor; light sensor
  - Magnetic compass; temperature sensor; touch sensor
  - Virtual/calculated sensors:
    ▪ Proximity (via light), gravity, orientation, gyroscope

How Does this Topic Relate to ICCS?
• Learning about smart phone users
  - Security requires understanding how devices are used
  - Main focus of talk is not on security but on what can be learned about smart phone users
• Smart phone-based biometric identification
  - Can be considered a security application
• Many news stories about abuses
  - Apps to spy on your spouse; iPhone location fiasco

WISDM Research Areas
• Activity recognition (what are you doing?)
  - Are you walking, jogging, sitting, standing, etc.?
• Biometric identification (who are you?)
  - Are you John Smith?
• Trait identification (who are you at a different level?)
  - Are you male? Are you tall? What do you weigh?

Why Learn Everything About You?
• Data miners want to learn everything about you
  - Somehow that info will be useful
  - Develop useful apps, marketing leads, etc.
  - Many positive uses
    ▪ That is why NSF provided WISDM with funding for activity recognition from its “Health and Well Being” program
  - But there are obvious issues with privacy and abuse

Data Mining: Basic Approach
• Approach to predictive data mining
  1. Collect labeled (sensor) training data
  2. Apply data mining method to build predictive model
  3. Apply predictive model to future unlabelled data

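The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WISDM code: the feature values are made up, `knn_predict` is a hypothetical helper, and a hand-rolled 3-nearest-neighbor vote stands in for the data mining method.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples."""
    # `train` is a list of (feature_vector, label) pairs.
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Step 1: labeled training data. The two features (mean and standard
# deviation of acceleration magnitude) and their values are hypothetical.
train = [
    ((9.8, 0.10), "standing"),
    ((9.9, 0.20), "standing"),
    ((9.7, 0.15), "standing"),
    ((10.5, 2.0), "walking"),
    ((10.8, 2.4), "walking"),
    ((10.2, 1.8), "walking"),
    ((12.0, 5.5), "jogging"),
    ((12.5, 6.0), "jogging"),
    ((11.8, 5.0), "jogging"),
]

# Step 2: "build" the model. For instance-based learning the model is
# simply the stored training examples.

# Step 3: apply the model to new, unlabeled examples.
unlabeled = [(9.85, 0.12), (12.1, 5.6)]
predictions = [knn_predict(train, x) for x in unlabeled]
print(predictions)
```

With real sensor data, the training set would hold one feature vector per labeled sample rather than these toy values.
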
Activity Recognition

Activity Recognition
• Why is it useful?
  - Context-sensitive applications
    ▪ Context influences handling of phone calls or music to play
  - Health applications
    ▪ Track activity levels or detect falls in elderly
• Approaches to activity recognition
  - Use multiple accelerometers
  - Use custom devices (pedometer, Fitbit)
  - Our approach: use existing smart phones

Sample Accelerometer Data
• Accelerometer data from Android phone
  - Walking, Jogging, Climbing Stairs, Lying Down, Sitting, Standing
  - Gravity included

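Before a classifier can use a raw accelerometer stream, it is typically cut into fixed-length examples. The sketch below shows one plausible way to do that. The 20 Hz sampling rate, the choice of per-axis mean and standard deviation as features, and the `window_features` helper are all assumptions for illustration; the 10-second window matches the sample length quoted in the biometrics part of the talk.

```python
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 20   # assumption: the slide does not state a sampling rate
WINDOW_SECONDS = 10   # the talk refers to 10-second samples

def window_features(readings, rate=SAMPLE_RATE_HZ, seconds=WINDOW_SECONDS):
    """Split (x, y, z) readings into non-overlapping windows and summarize each."""
    size = rate * seconds
    features = []
    # Any trailing partial window is dropped.
    for start in range(0, len(readings) - size + 1, size):
        window = readings[start:start + size]
        row = []
        for axis in range(3):
            values = [r[axis] for r in window]
            row.extend([mean(values), pstdev(values)])
        features.append(row)
    return features

# Synthetic data: a phone lying flat and still, so gravity (about 9.8 m/s^2)
# appears on the z axis because the readings include gravity.
still = [(0.0, 0.0, 9.8)] * 450
rows = window_features(still)
print(len(rows), rows[0])
```

Each returned row holds six numbers (mean and standard deviation for x, y, and z) and could serve as one example for a classifier.
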
[Figure: accelerometer data for “Walking”]
[Figure: accelerometer data for “Jogging”]
[Figure: accelerometer data for “Up Stairs”]
[Figure: accelerometer data for “Standing”]

Activity Recognition Results: Impersonal (Universal) Model
• Single model trained and used for everyone
• Data mining method: instance-based learning (WEKA IB3)
• 72.4% accuracy

Rows: actual class. Columns: predicted class.

              Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          2209       46     789        2         4           0
Jogging            45     1656     148        1         0           0
Stairs            412       54     869        3         1           0
Sitting            10        0      47      553        30         241
Standing            8        0      57        6       448           3
Lying Down          5        1       7      301        13         131

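The reported accuracy can be reproduced from the confusion matrix: it is the sum of the diagonal divided by the total count. The sketch below uses the counts from the slide; the helper names `accuracy` and `largest_confusion` are illustrative and not part of WEKA.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Rows are actual classes, columns are predicted classes (counts from the slide).
IMPERSONAL = [
    [2209, 46, 789, 2, 4, 0],
    [45, 1656, 148, 1, 0, 0],
    [412, 54, 869, 3, 1, 0],
    [10, 0, 47, 553, 30, 241],
    [8, 0, 57, 6, 448, 3],
    [5, 1, 7, 301, 13, 131],
]

def accuracy(matrix):
    """Fraction of examples on the diagonal (predicted class equals actual class)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def largest_confusion(matrix, labels):
    """Return (count, actual, predicted) for the largest off-diagonal cell."""
    return max(
        (matrix[i][j], labels[i], labels[j])
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j
    )

print(f"{accuracy(IMPERSONAL):.1%}")          # prints 72.4%
print(largest_confusion(IMPERSONAL, LABELS))  # prints (789, 'Walking', 'Stairs')
```

The largest off-diagonal cell shows where the universal model struggles most: walking samples predicted as stairs.
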
Activity Recognition Results: Personal Model
• Model built per user
• Data mining method: instance-based learning (WEKA IB3)
• 98.4% accuracy

Rows: actual class. Columns: predicted class.

              Walking  Jogging  Stairs  Sitting  Standing  Lying Down
Walking          3033        1      24        0         0           0
Jogging             4     1788       4        0         0           0
Stairs             42        4    1292        1         0           0
Sitting             0        0       4      870         2           6
Standing            5        0      11        1       509           0
Lying Down          4        0       8        7         0         442

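Overall accuracy can hide weak classes, so it helps to look at per-class recall (the share of each actual class that was predicted correctly). A small sketch using the personal-model counts from the slide; `per_class_recall` is an illustrative helper, not a WEKA function.

```python
LABELS = ["Walking", "Jogging", "Stairs", "Sitting", "Standing", "Lying Down"]

# Rows are actual classes, columns are predicted classes (counts from the slide).
PERSONAL = [
    [3033, 1, 24, 0, 0, 0],
    [4, 1788, 4, 0, 0, 0],
    [42, 4, 1292, 1, 0, 0],
    [0, 0, 4, 870, 2, 6],
    [5, 0, 11, 1, 509, 0],
    [4, 0, 8, 7, 0, 442],
]

def per_class_recall(matrix, labels):
    """Map each actual class to the fraction of its examples predicted correctly."""
    return {
        label: row[i] / sum(row)
        for i, (label, row) in enumerate(zip(labels, matrix))
    }

recall = per_class_recall(PERSONAL, LABELS)
for label, value in recall.items():
    print(f"{label:<11}{value:.1%}")
```

With these counts every class stays above 95% recall, and the lowest is Lying Down at roughly 95.9%.
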
Biometric Identification

Biometric Identification
• Identification based on physical/behavioral traits
  - Fingerprints, DNA, iris, gait, etc.
• Biometrics for everyone
  - Equipment smaller & cheaper (sensors + processing)
    ▪ Laptops currently perform face recognition
• Gait-based recognition
  - Most work is camera-based
• Some applications: device security, customization & personalization

WISDM Biometrics
• Used for identification and authentication
  - Identification means predicting identity from a pool of users (36 in initial study and 200 in recent study)
  - Authentication is a binary class prediction
    ▪ Is it you or an imposter?
• We evaluate walking and other activities as well as unclassified activities
• Predictions are made on individual 10 sec. samples, but we also combine “votes” to exploit larger samples

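Combining “votes” can be as simple as taking the most frequent per-sample prediction. The sketch below is a minimal illustration of that idea; the user IDs and the `vote` helper are hypothetical, and the actual WISDM aggregation may differ in its details.

```python
from collections import Counter

def vote(predictions):
    """Return the most frequent predicted identity across many short samples."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-sample predictions for one session of 10-second samples.
per_sample = ["user_07", "user_07", "user_12", "user_07", "user_03", "user_07"]
print(vote(per_sample))
```

Individual samples can be wrong (two of the six here) while the aggregate is still right, which is why a few minutes of data can identify a user more reliably than a single 10-second sample.
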
WISDM Biometric Prediction Results

Accuracy (%), based on 10 second test samples:

             Unclassified   Walk   Jog    Up     Down
J48                  72.2   84.0   83.0   65.8   61.0
Neural Net           69.5   90.9   92.2   63.3   54.5
Straw Man             4.3    4.2    5.0    6.5    4.7

Based on most frequent prediction for 5-10 minutes of data (Unclassified, Walk, Jog, Up, Down):

J48: 36/36, 31/32, 31/31, 28/31
Neural Net: 36/36, 32/32, 28.5/31, 25/31

• Authentication results even better (~90% with 10 sec samples)
• Recent unpublished results demonstrate 100% accuracy with 200 users!

Trait Identification

Trait Identification Applications
• Soft biometrics: traits can aid with biometrics
• As data miners we want to know everything about a person
  - Marketing applications: ads based on sex
  - Inferred weight to predict calories burned

Expanding the Definition of Trait
• We normally think about traits as being:
  - Unchanging: race, skin color, eye color, etc.
  - Slow changing: height, weight, etc.
• But we want to know everything about a person:
  - What they wear, how they feel, if they are tired, etc.
  - We have never seen this goal stated for mobile sensor mining

WISDM Trait Identification
• Work in early stages
• Data initially collected from ~70 people, now 200
  - Accelerometer and survey data
  - Survey data includes anything we could think of that might somehow be predictable
    ▪ Sex, height, weight, age, race, handedness, disability
    ▪ Shoe size, footwear type, size of heels, type of clothing
    ▪ # hours academic work, # hours exercise
  - Too few subjects to investigate all factors
    ▪ Many were not predictable (maybe with more data)

WISDM Trait Identification Results

Sex (accuracy 71.2%):
          Male   Female
Male        31        7
Female      12       16

Height (accuracy 83.3%):
          Short   Tall
Short        15      2
Tall          5     20

Weight (accuracy 78.9%):
          Light   Heavy
Light        13       2
Heavy         7      17

Results for IB3 classifier. For height and weight, middle categories removed.

Security & Privacy

Security and Privacy
• Security policies vary widely by OS & platform
  - Symbian requires properly signed keys to remove restrictions on using certain APIs
  - iPhone apps have relatively strict oversight
  - Android OS has few restrictions and Marketplace has essentially no oversight or restrictions
    ▪ WISDM project has had no problem tapping into sensors and transmitting results. Just pay $25 for an account.

Android Notifications
• Android notifies user of services
  - System permissions for WISDM SensorCollector
    ▪ Coarse location, fine location, internet access, keep from sleeping, modify/delete USB storage
• Applications routinely access sensitive services
  - Fandango: fine GPS location, read phone state & identity, modify/delete USB storage, internet access
  - Angry Birds: identical permissions!
  - Notifications probably next to useless given this!

Security and Privacy
• Even legitimate applications have to be concerned with privacy & security
  - WISDM will encrypt data in transit, encrypt on phone, include secure accounts & passwords, etc.
  - Need to ensure that any aggregated info is made public only if it cannot be traced to an individual

Security and Privacy
• Good policies:
  - Make it clear what you are monitoring and storing
  - Provide application-level control for the user
    ▪ Allow user to turn on/off monitoring of specific sensors
    ▪ If they use an option to upload the information to Facebook then little privacy!
• Since legitimate and illegitimate apps function alike, no easy way to distinguish them
  - Could try to use only certified apps, but quite limiting

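Application-level control over individual sensors could look something like the sketch below. Everything here is hypothetical (the `MonitoringSettings` class, the sensor names, the reading format); it only illustrates the policy of letting users switch specific sensors on or off, with everything off until the user opts in.

```python
from dataclasses import dataclass, field

@dataclass
class MonitoringSettings:
    """Hypothetical per-sensor switches; all sensors are off by default."""
    enabled: dict = field(default_factory=lambda: {
        "accelerometer": False,
        "location": False,
        "microphone": False,
    })

    def set_sensor(self, sensor, on):
        """Turn monitoring of one named sensor on or off."""
        if sensor not in self.enabled:
            raise KeyError(f"unknown sensor: {sensor}")
        self.enabled[sensor] = on

    def filter_readings(self, readings):
        """Keep only readings from sensors the user has switched on."""
        return [r for r in readings if self.enabled.get(r["sensor"], False)]

settings = MonitoringSettings()
settings.set_sensor("accelerometer", True)

readings = [
    {"sensor": "accelerometer", "value": (0.1, 0.2, 9.8)},
    {"sensor": "location", "value": (40.86, -73.89)},
]
kept = settings.filter_readings(readings)
print(kept)
```

Filtering at the point of collection, rather than at upload time, means data from a disabled sensor is never stored in the first place.
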
Available Soon: Actitracker
• WISDM is building & deploying the Actitracker service to track your activities in real time and display them via a web-based interface
  - Useful health information and thus supported by NSF grant & Google faculty research award
  - Actitracker.com is online and should have basic functionality shortly

Special Thanks To …
• WISDM research group
  - Current members
    ▪ Anthony Alcaro, Alex Armero, Shaun Gallagher, Andrew Grosner, Margo Flynn, Jeff Lockhart, Paul McHugh, Luigi Patruno, Tony Pulickal, Greg Rivas, Priscilla Twum, Bethany Wolff, Zach Wyhowanec, Jack Xue
  - Key former members
    ▪ Jennifer Kwapisz, Sam Moore, Shane Skowron, Alvan Wong
• Funders: NSF, Google, and Fordham

WISDM References
1. J. R. Kwapisz, G. M. Weiss, and S. A. Moore. 2010. Activity recognition using cell phone accelerometers. In Proceedings of the Fourth International Workshop on Knowledge Discovery from Sensor Data, 10-18.
2. J. R. Kwapisz, G. M. Weiss, and S. A. Moore. 2010. Cell phone-based biometric identification. In Proceedings of the IEEE Fourth International Conference on Biometrics: Theory, Applications and Systems.
3. J. W. Lockhart, G. M. Weiss, J. C. Xue, S. T. Gallagher, A. B. Grosner, and T. T. Pulickal. 2011. Design considerations for the WISDM smart phone-based sensor mining architecture. In Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.
4. G. M. Weiss and J. W. Lockhart. 2011. Identifying user traits by mining smart phone accelerometer data. In Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data, San Diego, CA.

Thank you
For more information go to wisdmproject.com
Gary Weiss, gweiss@cis.fordham.edu