Differentially Private Analysis of Graphs and Social Networks
- Slides: 27
Differentially Private Analysis of Graphs and Social Networks Sofya Raskhodnikova Pennsylvania State University 1
Graphs and networks Many types of data can be represented as graphs, where nodes represent individuals and edges capture relationships. Image source: Nykamp DQ, “An introduction to networks.” From Math Insight. http://mathinsight.org/network_introduction 2
Potentially sensitive information in graphs
• Social, romantic and sexual relationships
• “Friendships” in an online social network
• Financial transactions
• Phone calls and email communication
• Doctor-patient relationships
Source: Christakis, Fowler. The Spread of Obesity in a Large Social Network over 32 Years. N Engl J Med 2007; 357: 370-379.
Source: B. Aven. The effects of corruption on organizational networks and individual behavior. MIT workshop: Information and Decision in Social Networks, 2011. 3
Two conflicting goals
• Privacy: protecting information of individuals.
• Utility: drawing accurate conclusions about aggregate information. 4
“Anonymized” graphs still pose a privacy risk
• False dichotomy: personally identifying vs. non-personally identifying information.
• Links and any other information about an individual can be used for de-anonymization.
In a typical real-life network, many nodes have unique neighborhoods.
Bearman, Moody, Stovel. Chains of affection: The structure of adolescent romantic and sexual networks, American J. Sociology, 2004 5
Some published de-anonymization attacks
– Movie ratings [Narayanan, Shmatikov 08]: re-identified Netflix users based on information from a public movie database (IMDb).
– Social networks [Backstrom, Dwork, Kleinberg 07; Narayanan, Shmatikov 09; Narayanan, Shi, Rubinstein 12]: re-identified users in an online social network (anonymized Twitter) based on information from a public online social network (Flickr).
– Computer networks [Coull, Wright, Monrose, Collins, Reiter 07; Ribeiro, Chen, Miklau, Townsley 08, …]: can re-identify individuals based on external sources. 6
Who’d want to de-anonymize a social network graph?
• Government agency for surveillance.
• A phisher/spammer to write a personalized message.
• Health insurance provider to check preexisting conditions.
• Marketers to focus advertising on influential nodes.
• Stalkers, nosy neighbors, colleagues, or employers.
Image sources: Andrew Joyner, http://dukeromkey.com/ 7
What information can be released without violating privacy? 8
Differential privacy (for graph data)
[Diagram: graph G → data processing (algorithm) → data release (output)]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/ 9
Two variants of differential privacy for graphs
• Edge differential privacy: two graphs are neighbors if they differ in one edge.
• Node differential privacy: two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges. 10
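For reference, a sketch of the standard definition of ε-differential privacy that both variants instantiate. The symbols A (the randomized algorithm), ε (the privacy parameter), and S (a set of possible outputs) are conventional notation assumed here, not taken from the slides. Choosing "neighbors" to mean edge-neighbors or node-neighbors gives edge or node differential privacy, respectively.

```latex
\[
\Pr[\mathcal{A}(G) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{A}(G') \in S]
\quad \text{for all neighboring graphs } G, G' \text{ and all sets of outputs } S.
\]
```

Because deleting a node can remove many edges at once, node-neighbors can differ far more than edge-neighbors, so the node variant is the stronger guarantee.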
Some useful properties of differential privacy • 12
Is differential privacy too strong? • No weaker notion has been proposed that satisfies all three useful properties. • We can actually attain it for many useful statistics! 13
What graph statistics can be computed accurately with differential privacy? 14
Graph statistics
[Figure: fraction of nodes of degree d, plotted against degree d]
The degree of a node is the number of connections it has. 15
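As a rough illustration of releasing this statistic under edge differential privacy, the sketch below adds Laplace noise to the degree histogram. The Laplace mechanism is a swapped-in textbook technique, not necessarily what the talk's algorithms use, and the function names are hypothetical. It assumes a simple graph on nodes 0..n-1 with n public. Adding or removing one edge changes the degree of two nodes, and each of those nodes leaves one histogram bin and enters another, so the histogram changes by at most 4 in L1 norm.

```python
import random


def degree_histogram(n, edges):
    """Return counts where counts[d] is the number of nodes of degree d."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    counts = [0] * n  # in a simple graph, degrees range over 0..n-1
    for d in deg:
        counts[d] += 1
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def edge_private_degree_distribution(n, edges, epsilon, rng):
    """Noisy fraction of nodes of each degree (edge-privacy sketch).

    Assumption: L1 sensitivity of the histogram under edge changes is 4,
    so Laplace noise of scale 4/epsilon is added to every bin.
    """
    counts = degree_histogram(n, edges)
    scale = 4.0 / epsilon
    return [(c + laplace_noise(scale, rng)) / n for c in counts]


# Example: a star on 4 nodes (node 1 is the center).
example_edges = [(0, 1), (1, 2), (1, 3)]
exact = degree_histogram(4, example_edges)
noisy = edge_private_degree_distribution(4, example_edges, 1.0, random.Random(0))
```

The noisy fractions may be negative or fail to sum to 1; cleaning them up afterwards does not affect privacy, since it only postprocesses the released values. The same approach does not carry over to node privacy, where one node can change many degrees at once.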
Tools used in differentially private graph algorithms
• Smooth sensitivity
– A more nuanced notion of sensitivity than the one mentioned in the previous talk
• Sample and aggregate
• Maximum flow
• Linear and convex programming
• Random projections
• Iterative updates
• Postprocessing 16
Differentially private graph analysis A taste of techniques 17
Basic question: how to compute a statistic f
[Diagram: graph G → data processing (algorithm) → data release]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/ 18
Challenge for node privacy: high sensitivity • 19
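A small illustration of the high-sensitivity problem, using the number of edges as the statistic. The statistic and the star graph are chosen here for illustration and are assumptions, not taken from the slides. Deleting one node of a star on n nodes can change the edge count by anything from 1 to n-1, so noise calibrated to the worst case over node-neighbors would swamp the true value.

```python
def star_graph(n):
    """Edges of a star on nodes 0..n-1 with node 0 as the center."""
    return [(0, i) for i in range(1, n)]


def edge_count_without(edges, node):
    """Number of edges left after deleting `node` and its adjacent edges."""
    return sum(1 for u, v in edges if u != node and v != node)


n = 1000
star = star_graph(n)
edges_before = len(star)                           # n - 1 edges
edges_without_center = edge_count_without(star, 0)  # every edge disappears
edges_without_leaf = edge_count_without(star, 1)    # only one edge disappears
change_center = edges_before - edges_without_center
change_leaf = edges_before - edges_without_leaf
```

Under edge privacy the same statistic changes by exactly 1 between neighbors, which is why edge-private algorithms were obtained earlier than node-private ones.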
Idea: project onto graphs with low sensitivity. [Kasiviswanathan Nissim Raskhodnikova Smith 13] See also [Blocki Blum Datta Sheffet 13, Chen Zhou 13] 21
“Projections” on graphs of small degree • All graphs 22
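One simple way to map an arbitrary graph to a graph of small degree is naive truncation: delete every node whose degree exceeds a bound D. The sketch below shows only that simplification; it is not the projection used in the cited papers, and the function name and the parameter max_degree are hypothetical. Truncation is itself sensitive to a single node (removing one node can push other nodes above or below the threshold), which is part of why the cited work needs more careful tools.

```python
def truncate_to_degree(n, edges, max_degree):
    """Delete every node with degree > max_degree, plus its adjacent edges.

    Returns (kept_nodes, kept_edges). Degrees can only drop when nodes are
    deleted, so every kept node has degree <= max_degree in the result.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    kept_nodes = {v for v in range(n) if deg[v] <= max_degree}
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


# Example: node 0 has degree 4 and is removed when the bound is 2.
example_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
kept_nodes, kept_edges = truncate_to_degree(5, example_edges, 2)
```

On graphs whose degrees are all at most D, the statistic has lower node sensitivity, so less noise is needed once the projection step is handled privately.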
Lipschitz extensions • All graphs 23
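A hedged sketch of what a Lipschitz extension means in this setting. The notation is assumed for illustration and is not taken from the slides: \mathcal{G}_D is the set of graphs of maximum degree at most D, f is the statistic of interest, and \Delta_D is the node sensitivity of f restricted to \mathcal{G}_D.

```latex
\[
\Delta_D \;=\; \max_{\substack{G, G' \in \mathcal{G}_D \\ G, G' \text{ node-neighbors}}} |f(G) - f(G')| ,
\]
\[
\hat{f}(G) = f(G) \ \text{ for all } G \in \mathcal{G}_D,
\qquad
|\hat{f}(G) - \hat{f}(G')| \le \Delta_D \ \text{ for all node-neighbors } G, G' .
\]
```

In words, \hat{f} agrees with f on bounded-degree graphs but has low sensitivity on all graphs, so it can be released with noise proportional to \Delta_D rather than to the worst-case sensitivity of f.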
Summary
• Accurate subgraph counts for realistic graphs can be computed by node-private algorithms
– Use Lipschitz extensions and linear programming
– Subgraph counting is one example of the many graph statistics that node-private algorithms do well on. 24
What can’t be computed differentially privately?
• Differential privacy explicitly excludes the possibility of computing anything that depends on one person’s data:
– Is there a node in the graph that has atypical connections?
– “Suspicious communication patterns”? 25
What we are working on
• Node differentially private algorithms for releasing
– a large number of graph statistics at once
– synthetic graphs
• Exciting area of research:
– Edge-private algorithms [Nissim, Raskhodnikova, Smith 07; Hay, Rastogi, Miklau, Suciu 09; Hay, Li, Miklau, Jensen 09; Hardt, Rothblum 10; Karwa, Raskhodnikova, Smith, Yaroslavtsev 11; Karwa, Slavkovic 12; Blocki, Blum, Datta, Sheffet 12; Gupta, Roth, Ullman 12; Mir, Wright 12; Kifer, Lin 13, …]
– Node-private algorithms [Gehrke, Lui, Pass 12; Blocki, Blum, Datta, Sheffet 13; Kasiviswanathan, Nissim, Raskhodnikova, Smith 13; Chen, Zhou 13; Raskhodnikova, Smith, …] 26
Conclusions
• We are close to having edge-private and node-private algorithms that work well in practice for many basic graph statistics.
• Accurate node-private algorithms were thought to be impossible only a few years ago.
• Differential privacy is influencing other scientific disciplines
– Next talk: reducing false discovery rate. 27