Privacy-preserving of Trajectory Data: A Survey
Huo Zheng


OUTLINE
• Motivating Applications
• Privacy-preserving in Different Scenarios
• Conclusions & Future work

Motivating Applications
1. Trajectory data publication & analysis
2. LBS (location-based services)
3. ITS (intelligent transportation systems)
4. Trajectory data outsourcing


Solutions: overview
Scenarios: Data Publication, Data Outsourcing, ITS, LBS
Techniques: Suppression, Anonymization, Perturbation, Encryption
Cited work: [Terrovitis MDM’08] [Gruteser ISE’04] [Abul ICDMW’07] [Ghinita TDP’09] [Nergiz TDP’09] [Hoh MobiSys’08] [Abul ICDE’08] [Xu INFOCOM’08] [Gidofalvi MDM’07] [Divanis SIAM’09] [Hoh SecureCom’05] [Lee CIKM’09] [You PALMS’07] [Xu Proposal’10]

Scenario #1: Trajectory data publication & analysis
(Scenarios: Trajectory data publication & analysis, ITS, Trajectory data outsourcing, LBS)

Solutions #1: Overview
• Protecting trajectory data privacy against attackers involves the following aspects:
  ▫ Preventing the adversary from identifying individuals in trajectory data
  ▫ Protecting sensitive location samples in trajectory data
  ▫ Attackers may use background knowledge to infer users’ information; for example, home and workplace locations can help an adversary infer a trajectory’s owner
  ▫ Protecting data privacy while preserving the utility of the data
Trade-off: data privacy vs. data utility

Dummies
• Basic Idea
  ▫ Increase the number of possible trajectories from the adversary’s perspective
  ▫ Decrease the disclosure of the user’s trajectory
• Method
  ▫ Generate dummy trajectories that resemble human movement
  ▫ Generate dummy trajectories whose distances are larger than a predefined distance deviation
[You PALMS’07]

Dummies (cont’)
• Procedure
  ▫ Set a disclosure rate
  ▫ Generate dummies
    - Random
    - Trajectories with intersections
    - Rotate
    - Compute distance deviation
[Figure: trajectories from source to destination]
[You PALMS’07]
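The rotation step and the distance-deviation check can be sketched as follows. This is a minimal illustration, not the algorithm of [You PALMS’07]: the helper names `rotate` and `distance_deviation`, the choice of pivot, the threshold value, and the definition of distance deviation as the average point-wise Euclidean distance are all assumptions made here.

```python
import math

def rotate(trajectory, pivot, angle):
    """Rotate every point of a trajectory around a pivot by `angle` radians."""
    px, py = pivot
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (px + cos_a * (x - px) - sin_a * (y - py),
         py + sin_a * (x - px) + cos_a * (y - py))
        for x, y in trajectory
    ]

def distance_deviation(real, dummy):
    """Average point-wise Euclidean distance (an assumed definition)."""
    return sum(math.dist(p, q) for p, q in zip(real, dummy)) / len(real)

# Hypothetical real trajectory with three location samples.
real = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

# Rotating around a point of the real trajectory yields a dummy that
# intersects the real trajectory at that point.
dummy = rotate(real, pivot=real[1], angle=math.pi / 2)

deviation = distance_deviation(real, dummy)
min_deviation = 0.5  # hypothetical user-defined threshold
accepted = deviation >= min_deviation
```

A dummy that fails the threshold would be regenerated with a different pivot or angle.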

Pros and cons
• Pros
  ▫ Attackers cannot tell which trajectory is the real user trajectory, under a threshold given by the user
  ▫ Simple and easy to understand
• Cons
  ▫ High storage cost: protecting a single trajectory requires storing several dummy trajectories, which lowers data utility
  ▫ High disclosure rate against adversaries with strong background knowledge
[You PALMS’07]

Suppress locations in trajectory data publication
• Basic Idea
  ▫ Suppress location samples in a trajectory database
• Procedure
  ▫ Decide which locations to suppress
    - If the location sample is sensitive, suppress it.
    - If the location sample may reveal other information, suppress it.
  ▫ Suppress the location when publishing data

Id   Loc 1    Loc 2    Loc 3    Loc 4    Loc 5
01   (1, 3)   (1, 5)   (2, 6)   (2, 9)   (3, 10)
02   (2, 5)   (4, 8)   (5, 10)  (5, 15)  (5, 20)
03   (0, 2)   (4, 2)   (5, 4)   (5, 10)  (6, 11)
04   (2, 3)   (2, 5)   (2, 8)   (3, 9)   (3, 15)

Loc      Name
(2, 6)   Clinic
(5, 20)  Hotel
(3, 15)  Bar

[Terrovitis MDM’08]
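As a rough sketch of the first suppression rule, the code below removes every location sample that appears in the sensitive-location table before the data is published. It uses the example data from the slide; the function name `suppress` is hypothetical, and the second rule (samples that may reveal other information) is not modeled here.

```python
# Trajectory database from the slide: id -> list of location samples.
trajectories = {
    "01": [(1, 3), (1, 5), (2, 6), (2, 9), (3, 10)],
    "02": [(2, 5), (4, 8), (5, 10), (5, 15), (5, 20)],
    "03": [(0, 2), (4, 2), (5, 4), (5, 10), (6, 11)],
    "04": [(2, 3), (2, 5), (2, 8), (3, 9), (3, 15)],
}

# Sensitive locations from the slide.
sensitive = {(2, 6): "Clinic", (5, 20): "Hotel", (3, 15): "Bar"}

def suppress(db, sensitive_locations):
    """Return a copy of the database without sensitive location samples."""
    return {
        tid: [p for p in points if p not in sensitive_locations]
        for tid, points in db.items()
    }

published = suppress(trajectories, sensitive)
```

In this example trajectories 01, 02, and 04 each lose one sample, and trajectory 03 is published unchanged.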

Privacy preservation in the publication of trajectories
• Motivation
  ▫ The Octopus RFID card is commonly used by Hong Kong residents to pay for transportation and for transactions at point-of-sale services
  ▫ If the Octopus company publishes the data directly, it may cause privacy linkage, since other agencies may have partial knowledge of the same person
[Figure: example trajectory table with ID t1 and trajectory a1 -> a3]
[Terrovitis MDM’08]

An Example
[Figure: suppress locations, then compute distortion]
[Terrovitis MDM’08]

Pros and Cons
• Pros
  ▫ Protects moving objects’ privacy even when adversaries have partial knowledge
  ▫ Easy to understand, low computation cost
• Cons
  ▫ May cause serious information loss if too many location samples are suppressed

Never Walk Alone
• Motivation
  ▫ Due to the imprecision of GPS devices, a trajectory is not a polyline but a cylindrical volume, where its radius δ represents the possible location imprecision
• Key Idea
  ▫ Anonymize trajectories in the same time span under uncertainty δ
[Abul ICDE’08]

Never Walk Alone (cont’)
• Key Methods
  ▫ Preprocessing
    - Make trajectories uniform over the same time span (t1 to tn)
  ▫ Clustering
    - Greedy clustering based on the Euclidean distance
  ▫ (K, δ)-anonymity
    - Space translation
[Figure: trajectories plotted over axes x, y, and time]
[Abul ICDE’08]
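The space translation step can be pictured with the sketch below. It is an illustration under stated assumptions rather than the procedure of [Abul ICDE’08]: all trajectories in a cluster are assumed to share the same timestamps, the cylinder axis is taken to be the per-timestamp centroid of the cluster, and the function name `space_translate` is hypothetical. Each point farther than δ/2 from the axis is moved onto the cylinder boundary, so that any two trajectories end up at most δ apart at every timestamp.

```python
import math

def space_translate(cluster, delta):
    """Move points so every trajectory stays within delta/2 of the centroid."""
    radius = delta / 2
    result = [list(trajectory) for trajectory in cluster]
    for i in range(len(cluster[0])):
        # Centroid of the cluster at timestamp i (assumed cylinder axis).
        cx = sum(t[i][0] for t in cluster) / len(cluster)
        cy = sum(t[i][1] for t in cluster) / len(cluster)
        for j, trajectory in enumerate(cluster):
            x, y = trajectory[i]
            d = math.hypot(x - cx, y - cy)
            if d > radius:
                # Pull the point toward the centroid until it is on the boundary.
                scale = radius / d
                result[j][i] = (cx + (x - cx) * scale, cy + (y - cy) * scale)
    return result

# Hypothetical cluster of K = 2 trajectories with two timestamps each.
cluster = [[(0, 0), (0, 0)], [(4, 0), (0, 2)]]
delta = 2.0
anonymized = space_translate(cluster, delta)
```

Points that already lie inside the cylinder are left untouched, which is why a larger δ means less distortion.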

Pros and cons
• Pros
  ▫ It exploits the inherent uncertainty of location to reduce the amount of distortion needed to anonymize the data
  ▫ It is a simple, efficient, and effective method
• Cons
  ▫ It assumes a uniform uncertainty level, which is not suitable for some applications
  ▫ Because the uncertainty level is limited, distortion grows rapidly as K increases

Towards trajectory anonymity
• Motivation
  ▫ To improve the utility of the published data
    - Most data mining and statistical applications work on atomic trajectories
• Procedure
  ▫ Trajectory grouping
    - Logic cost metric
  ▫ K-anonymity
  ▫ Reconstruction
[Nergiz TDP’09]

An Example
[Figure: anonymization tr* of tr1 and tr2; anonymization of tr* and tr3; reconstruction by randomly selecting points; complete]
[Nergiz TDP’09]
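The anonymize-then-reconstruct idea from this example can be sketched as below. This is a simplified illustration and not the method of [Nergiz TDP’09]: it assumes the grouped trajectories have equal length and that points are matched by index, generalizes each matched set of points into a bounding box, and reconstructs atomic trajectories by picking a random point in each box. The function names and the sample coordinates are hypothetical.

```python
import random

def anonymize(group):
    """Generalize index-matched points into bounding boxes (x0, y0, x1, y1)."""
    boxes = []
    for points in zip(*group):
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

def reconstruct(boxes, count, rng):
    """Sample `count` atomic trajectories, one random point per box."""
    return [
        [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for x0, y0, x1, y1 in boxes]
        for _ in range(count)
    ]

# Hypothetical group of three trajectories (K = 3).
tr1 = [(0, 0), (2, 2), (4, 4)]
tr2 = [(1, 1), (3, 2), (4, 6)]
tr3 = [(2, 0), (2, 3), (5, 5)]

tr_star = anonymize([tr1, tr2, tr3])
released = reconstruct(tr_star, count=3, rng=random.Random(0))
```

Releasing sampled atomic trajectories instead of boxes keeps the data usable by mining tools that expect point sequences.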

Conclusions
• Privacy preservation for trajectory data in data publication has been widely studied.
• Several methods have been proposed for trajectory data privacy preservation; most of them derive from privacy preservation in data publication.
• The challenge lies in preserving privacy under high-frequency sampling while providing high data utility.

Scenario #2: LBS
(Scenarios: Trajectory data mining, LBS, ITS, Trajectory data outsourcing)

Solutions #2: Overview
• Protecting trajectory data privacy against attackers involves the following aspects:
  ▫ Protecting trajectory privacy against untrustworthy LBS servers
  ▫ Protecting users’ privacy when they request LBS services, for example when sending queries
  ▫ Protecting data privacy while providing high quality of service
Trade-off: MOB privacy vs. QoS

Navigational path privacy protection
• Motivation
  ▫ Navigational path query is one of the most popular LBS; it determines a route from a source to a destination
  ▫ Issuing path queries to untrustworthy service providers may pose privacy threats
• Example: Mr. Q asks “How to get to the psychiatrist from home?”; the query suggests that he may have a mental disorder
[Figure: the user sends queries to service providers, which return results]
[Lee CIKM’09]

Navigational path privacy protection (cont’)
• Solutions
  ▫ Landmark: replace both the source and the destination of a path query Q(s, t) with other locations, resulting in another path query Q(s’, t’)
  ▫ Cloaking: cloak both the source and the destination into locations at the same street level; the result may be irrelevant
[Lee CIKM’09]

Navigational path privacy protection (cont’)
• Solutions
  ▫ Obfuscate a path query by injecting fake sources and destinations
• Three methods
  ▫ Independent obfuscated path query
  ▫ Shared obfuscated path query
  ▫ Anti-collusion path query
[Figure: source set S containing s (Mr. Q’s home) and destination set T containing t (clinic)]
[Lee CIKM’09]

System overview
• Independent obfuscated query: obfuscate one independent path query by randomly injecting fake locations.
  S = {s_A, s_1}, T = {t_A, t_1, t_2}; Pb = 1/(2×3) = 1/6
• Shared obfuscated query: obfuscate two or more path queries together while injecting fake locations.
  S = {s_A, s_1, s_B}, T = {t_A, t_1, t_2, t_B}; Pb = 1/(3×4) = 1/12
• Anti-collusion obfuscated query: inject more fake locations to obtain a low breach probability.
  S = {s_A, s_1, s_2, s_B}, T = {t_A, t_1, t_2, t_B}; Pb_min = 1/(4×5) = 1/20; Pb_max = 1/(2×3) = 1/6
[Lee CIKM’09]
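The breach probabilities for the independent and shared cases follow from counting source-destination pairs, which the short sketch below reproduces. It assumes, as the slide’s arithmetic suggests, that the adversary guesses uniformly among all |S| × |T| pairs; the function names and the string labels for locations are hypothetical.

```python
from fractions import Fraction

def obfuscate(true_source, true_dest, fake_sources, fake_dests):
    """Build the obfuscated source set S and destination set T."""
    S = {true_source, *fake_sources}
    T = {true_dest, *fake_dests}
    return S, T

def breach_probability(S, T):
    """Chance of guessing the true pair among all |S| x |T| candidates."""
    return Fraction(1, len(S) * len(T))

# Independent obfuscated query for user A: one fake source, two fake destinations.
S, T = obfuscate("sA", "tA", ["s1"], ["t1", "t2"])
independent = breach_probability(S, T)

# Shared obfuscated query: user B's endpoints are added to the same sets.
shared = breach_probability({"sA", "s1", "sB"}, {"tA", "t1", "t2", "tB"})
```

Sharing one obfuscated query between two users enlarges both sets at no extra cost in fake locations, which is why the breach probability drops from 1/6 to 1/12.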

Pros and Cons
• Pros
  ▫ Develops a framework that obfuscates path queries to protect mobile users’ trajectory privacy
  ▫ Mixing in fake sources and destinations greatly reduces the breach probability
• Cons
  ▫ Provides weak privacy protection when the adversary has strong background knowledge

Cut-Enclose
• Motivation
  ▫ Overlapping trajectory anonymity rectangles may cause location privacy linkage
  ▫ Simple cut-and-enclose methods may cause privacy leakage at the joints of grids
[Figure: problems with existing methods over the intervals [ti-1, ti], [ti, ti+1], [ti+1, ti+2]; problems with simple cut-enclose; time delay factor]
[Gidofalvi MDM’07]

Cut-Enclose (cont’)
• Procedure
  ▫ Users set privacy levels (individual privacy level / region sensitivity level)
  ▫ Separate the 2D space into grids
  ▫ Combine grids into partitions according to the user-specified individual privacy level (CRP / IRP) or region sensitivity level (IIR)
  ▫ Anonymize trajectory pieces in each partition with a time delay factor
[Figure: Common Regular Partitioning; Individual Irregular Partitioning; anonymized trajectory]
[Gidofalvi MDM’07]

Anonymity with historical data
• Motivation
  ▫ Existing cloaking methods depend heavily on the network density
  ▫ Existing methods are not suitable for time-series sequences
    - The cloaking boxes form a trajectory that may disclose a user’s trajectory
[Xu INFOCOM’08]

Anonymity with historical data (cont’)
• Procedure
  ▫ Cloaking K-1 additive trajectories
    1. Linear: the cloaking result is considered as a new base trajectory T0
    2. Quadratic: the selection of the new trajectory is based on its distance to T, not T0
  ▫ Cloaking one additive trajectory
    1. Select a pivot for each footprint
    2. Choose the one with the smallest MBC and index number as the next pivot
    3. Repeat until all trajectory points of the base trajectory are anonymized
[Figure: base trajectory T0 with footprints c1 to c4 and cloaking circles C1 to C4; additive trajectories Ta (a1 to a8) and Tb (b1 to b7)]
[Xu INFOCOM’08]

Scenario #3: ITS
(Scenarios: Trajectory data mining, LBS, ITS, Trajectory data outsourcing)

Privacy preserving traffic monitoring
• Motivation
  ▫ GPS-equipped vehicles send their location information to a traffic monitoring center at a regular frequency
  ▫ The location traces might reveal sensitive places that drivers have visited
[Hoh MobiSys’08]

Privacy preserving traffic monitoring (cont’)
• Key Idea
  ▫ Minimizing tracking time reduces the risk that an adversary can correlate an identity with sensitive locations
• Method
  ▫ A time-to-confusion level
  ▫ An uncertainty level
[Hoh MobiSys’08]

Conclusions
• Trajectory data privacy preservation in online applications is necessary, but no dominant method exists to solve this problem.
• The challenge lies in preserving the privacy of the current trajectory without location privacy leakage while providing high-quality online services.

Scenario #4: Trajectory data outsourcing
(Scenarios: Trajectory data mining, ITS, LBS, Trajectory data outsourcing)

Solutions #4: Overview
• Motivation
  ▫ The cloud is emerging as a new form of DaaS
  ▫ More and more agencies are moving their data to the cloud, and they worry about privacy and security in the cloud
  ▫ Privacy protection in the cloud is necessary
[Figure: Dark Cloud, Green Cloud]
[Xu Proposal’10]

Privacy Threats in the Cloud
• Users’ Query Privacy
  ▫ E.g., Mr. Q wants to protect his query against the Cloud, since his query is about mental disease
• Data Privacy of the Data Owner
• Mutual Privacy
  ▫ Semi-honest model
[Figure: Data Owner, Data, Query, Cloud, Results]
[Xu Proposal’10]

Main Framework
1. The Data Owner encrypts the database R and sends it to the Cloud.
2. The Data Owner sends a shadow index E(I) and S^-1() to the client, and sends E^-1() to the Cloud for the following processing.
3. The client retrieves E(i) locally, encrypts it as Ec(E(i)), and sends it to the Cloud for decryption.
4. The Cloud decrypts Ec(E(i)) to get Ec(i) and returns it to the client.
5. If it is a leaf node, the client decrypts it with S^-1() to get the result; if it is not a leaf node, the client gets the next i.
[Xu Proposal’10]

Research issues
• Efficient Privacy-Preserving Query Processing Techniques
  ▫ Challenges lie in complex queries, especially distance-based queries; a typical example is k-nearest neighbor (kNN)
• Privacy-Aware Query Result Authentication Techniques
  ▫ If the cloud is malicious or does not follow the protocol faithfully, the client needs to authenticate the correctness of the query results
[Figure: the Cloud returns results for the query “Nearest Clinic”]
[Xu Proposal’10]


CONCLUSIONS
• This survey discussed trajectory data privacy preservation techniques
  ▫ For online trajectory data privacy preservation, the service is central; the trade-off is between QoS and privacy preservation
  ▫ For offline trajectory data privacy preservation, the data is central; the trade-off is between data quality and privacy preservation
• Most of the techniques deal with this problem in free space, and most of them are offline algorithms

FUTURE WORK
○ Complete the survey in the following aspects:
  1. Privacy preservation in time-series data
  2. Privacy preservation in outsourced data
  3. ……
○ Trajectory data protection in online applications
● Trajectory data protection in data publication / data outsourcing
  1. ITS/LBS
  2. Trajectory data outsourcing

References
• G. Gidofalvi, X. Huang, T. B. Pedersen. Privacy-Preserving Data Mining on Moving Object Trajectories. In Proceedings of MDM’07, 2007.
• J. Krumm. Inference Attacks on Location Tracks. In Proceedings of the 5th International Conference on Pervasive Computing (Pervasive 2007), May 2007.
• M. Terrovitis, N. Mamoulis. Privacy Preserving in the Publication of Trajectories. In Proceedings of MDM’08, 2008.
• A. Gkoulalas-Divanis, V. S. Verykios. A Privacy-Aware Trajectory Tracking Query Engine. In Proceedings of SIGKDD 2008.
• M. E. Nergiz, M. Atzori, Y. Saygin, B. Guc. Towards Trajectory Anonymization: a Generalization-Based Approach. IEEE Transactions on Data Privacy 2 (2009) 47-75.
• T.-H. You, W.-C. Peng, W.-C. Lee. Protecting Moving Trajectories with Dummies. In Proceedings of PALMS 2007.
• H. Kido, Y. Yanagisawa, T. Satoh. An Anonymous Communication Technique Using Dummies for Location-Based Services. In Proceedings of ICPS 2005.
• O. Abul, F. Bonchi, M. Nanni. Never Walk Alone: Uncertainty for Anonymity in Moving Objects Databases. In Proceedings of ICDE 2008.
• G. Ghinita. Private Queries and Trajectory Anonymization: a Dual Perspective on Location Privacy. Transactions on Data Privacy 2009 (3-19).
• V. Rastogi, S. Nath. Differentially Private Aggregation of Distributed Time-Series with Transformation and Encryption. In Proceedings of SIGMOD’10, 2010.
• T. Xu, Y. Cai. Exploring Historical Location Data for Anonymity Preservation in Location-Based Services. In Proceedings of INFOCOM’08, 2008.
• K. C. K. Lee, W. Lee, H. Va Leong, B. Zheng. Navigational Path Privacy Protection. In Proceedings of CIKM’09, 2009.

References (cont’)
• A. Gkoulalas-Divanis, V. S. Verykios, M. F. Mokbel. Identifying Unsafe Routes for Network-Based Trajectory Privacy. In Proceedings of SPC’09, 2009.
• O. Abul, M. Atzori, F. Bonchi, F. Giannotti. Hiding Sensitive Trajectory Patterns. In Proceedings of ICDMW’07, 2007.
• M. Gruteser, X. Liu. Protecting Privacy in Continuous Location-Tracking Applications. In IEEE Security and Privacy, 2004.
• X. Pan, X. Meng, J. Xu. Distortion-Based Anonymity for Continuous Queries in Location-Based Mobile Services. In Proceedings of SIGGIS’09, 2009.
• S. Mukherjee, Z. Chen, A. Gangopadhyay. A Privacy-Preserving Technique for Euclidean Distance-Based Mining Algorithms Using Fourier-Related Transforms. VLDB Journal (2006) 15: 293-315.
• B. Hoh, M. Gruteser, H. Xiong, A. Alrabady. Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking. In Proceedings of CCS’07, 2007.

Thanks for your time! I hope this caught your interest.
Q&A