GraphSC: Parallel Secure Computation Made Easy. Kartik Nayak. With Xiao Shaun Wang, Stratis Ioannidis, Udi Weinsberg, Nina Taft, Elaine Shi 1

Data Mining on User Data. Users → Data → Data Mining Engine → Model. Privacy concern! 2

Companies Computing on Private Data. One graph represents social connections; another graph represents professional connections. Compute a user’s influence in both circles. 3

Companies want to run machine learning algorithms. Users/companies do NOT want to reveal data. Can we enable this in practice? 4

Cryptography to the rescue: Secure Multiparty Computation ensures that we learn only the outcome. 5

Key Challenges 1: Generic Solutions. A lot of work goes into improving individual algorithms; the goal is a departure from the one-at-a-time approach. 6

Key Challenges 2: Convert Program to Run on Secure Computation (cost of obliviousness). 7

Key Challenges 3: Parallelizability. There’s a lot of data: maintain the benefits of parallelism that exist in the insecure setting. With cryptography, computation is expensive. 8

Key Contributions 9

Key Contributions. Challenge: Generic Solutions. Generic framework for “graph-parallel” algorithms: PageRank, risk minimization using ADMM, matrix factorization using ALS, matrix factorization using gradient descent, and many more. 10

Key Contributions. Challenge: Convert program to run on Secure Computation. Efficiently convert graph-parallel programs to oblivious programs. Total work blowup is O(log |V|). Blowup for the naïve solution: O(|V|) for sparse graphs. 11

Key Contributions. Challenge: Parallelizability. Maintain parallelizability: depth of the computation is O(log |V|). Matrix factorization with 4K ratings and 32 threads: 1.4 hours [NIWJTB’13] reduced to < 4 mins. 12

Key Contributions: 1. Generic framework for graph-parallel algorithms. 2. Efficiently convert to oblivious programs. 3. Maintain parallelizability. 13

Programmer’s favorite model vs. cryptographer’s favorite model. function bs(val, s, t) { mid = (s + t) / 2; if (val < mem[mid]) bs(val, s, mid) else bs(val, mid+1, t) } 14

Programmer’s model: programs → oblivious programs → cryptographer’s model: circuits. Intuitively, program traces should not depend on input data. 15

Programmer’s favorite model vs. cryptographer’s favorite model. function bs(val, s, t) { mid = (s + t) / 2; if (val < mem[mid]) bs(val, s, mid) else bs(val, mid+1, t) } 16

Programmer’s model: programs → (hard) → oblivious programs → (easy) → cryptographer’s model: circuits. Intuitively, program traces should not depend on input data. 17
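To make the point about program traces concrete, here is a minimal Python sketch (not from the slides) that records which memory indices each lookup reads. The function names and the trace-recording approach are hypothetical, chosen for illustration. It models only the memory access pattern; Python itself gives no constant-time guarantees.

```python
def binary_search_trace(mem, val):
    """Ordinary binary search over sorted mem.

    Returns (index, trace), where trace lists the indices that were read.
    The trace depends on val, so this program is not oblivious.
    """
    trace = []
    s, t = 0, len(mem) - 1
    while s < t:
        mid = (s + t) // 2
        trace.append(mid)
        if val <= mem[mid]:
            t = mid
        else:
            s = mid + 1
    return s, trace


def oblivious_lookup_trace(mem, val):
    """Linear scan that reads every cell in a fixed order.

    The trace is always 0..len(mem)-1 regardless of val. The selection uses
    arithmetic instead of a branch, mimicking a multiplexer in a circuit.
    """
    trace = []
    result = -1
    for i in range(len(mem)):
        trace.append(i)
        hit = int(mem[i] == val) * int(result == -1)  # 1 only on first match
        result = hit * i + (1 - hit) * result
    return result, trace


mem = [1, 3, 5, 7, 9, 11, 13, 15]
idx_a, trace_a = binary_search_trace(mem, 3)
idx_b, trace_b = binary_search_trace(mem, 13)
obl_a, otrace_a = oblivious_lookup_trace(mem, 3)
obl_b, otrace_b = oblivious_lookup_trace(mem, 13)
```

The linear scan costs O(n) instead of O(log n); that gap is one example of the cost of obliviousness that the talk sets out to reduce for graph-parallel programs.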

Achieving Parallelism. Goal: low-depth circuits. Oblivious Parallel RAM [BCP’14]: polylogarithmic blowup, not practical. GraphSC: O(log |V|) blowup. 18

“Graph-parallel” algorithms, e.g. Pregel [LGKB’10, GLGBG’12, MABDHLC’10, ZCF’10] 19

Graph-parallel Algorithms. [Figure: example graph on vertices A, B, C, D with numeric vertex and edge data.] Scatter: send data to edges. Gather: aggregate data from edges. Apply: perform some computation. 20
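The three operations can be sketched in the clear (with no security at all) as plain Python. This is an illustrative sketch under stated assumptions, not GraphSC's API: the function signatures, the directed-edge convention (data flows from source to destination), and the PageRank damping factor 0.85 are all assumptions.

```python
def scatter(vertex_data, edges, f):
    """Send data to edges: each edge (u, v) receives f(data of u)."""
    return {(u, v): f(vertex_data[u]) for (u, v) in edges}


def gather(edge_data, vertices, combine, identity):
    """Aggregate data from edges: each vertex combines its incoming edge data."""
    acc = {v: identity for v in vertices}
    for (_, v), d in edge_data.items():
        acc[v] = combine(acc[v], d)
    return acc


def apply(vertex_data, gathered, g):
    """Perform some computation at every vertex, independently."""
    return {v: g(vertex_data[v], gathered[v]) for v in vertex_data}


# One PageRank iteration on a toy graph. Vertex data is (rank, out_degree).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
vertices = ["A", "B", "C"]
n = len(vertices)
out_deg = {v: sum(1 for (u, _) in edges if u == v) for v in vertices}
data = {v: (1.0 / n, out_deg[v]) for v in vertices}

DAMPING = 0.85  # assumed value, not taken from the slides

edge_data = scatter(data, edges, lambda d: d[0] / d[1])
sums = gather(edge_data, vertices, lambda a, b: a + b, 0.0)
data = apply(data, sums, lambda d, s: ((1 - DAMPING) / n + DAMPING * s, d[1]))
ranks = {v: data[v][0] for v in vertices}
```

Every vertex can run Apply independently, and Scatter touches each edge once, which is why this model parallelizes well in the insecure setting.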

Obliviousness of Graph-parallel Algorithms. [Figure: example graph on vertices A, B, C, D.] Do not reveal edge/vertex data. Do not reveal the structure of the graph. Naïve solution: O(|V|^2). Our solution: O(|E| log |V|). 21

Oblivious Gather – Key Trick. [Figure: example with elements labeled 1 to 4.] 22

Oblivious Gather – Key Trick. Oblivious sort with key (v, isVertex), followed by a single pass. Sort: O(|E| log |V|). Single pass: O(|E|). Oblivious gather: O(|E| log |V|). Gather in clear: O(|E|). 23
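The sort-then-scan idea can be sketched as follows. This is a plaintext illustration under assumptions, not the GraphSC implementation: the tuple layout `(v, is_vertex, data)`, the function names, and summation as the aggregate are hypothetical, and Python's built-in `sorted` is only a stand-in for an oblivious sort (a real oblivious version would use a data-independent sorting network such as bitonic sort).

```python
def oblivious_gather(vertices, edges):
    """Gather-by-sum using one sort and one linear pass.

    vertices: list of vertex ids.
    edges:    list of (u, v, data); data flows to destination v.
    Returns the processed tuple list; vertex tuples carry the aggregate.
    """
    # One tuple per vertex and one per edge, keyed by destination vertex.
    tuples = [(v, 1, 0) for v in vertices]
    tuples += [(v, 0, d) for (_, v, d) in edges]

    # Sort by (v, is_vertex): every edge into v lands just before v's tuple.
    # ASSUMPTION: sorted() stands in for an oblivious sort.
    tuples = sorted(tuples, key=lambda t: (t[0], t[1]))

    out = []
    agg = 0
    for v, is_vertex, data in tuples:
        # Same arithmetic on every tuple; no branch on is_vertex.
        agg = agg + (1 - is_vertex) * data
        out.append((v, is_vertex, is_vertex * agg + (1 - is_vertex) * data))
        agg = (1 - is_vertex) * agg  # reset after a vertex tuple
    return out


def vertex_results(processed):
    """Helper for inspection only: pull the aggregates out of vertex tuples."""
    return {v: val for (v, is_vertex, val) in processed if is_vertex}


vertices = [1, 2, 3, 4]
edges = [(1, 2, 5), (3, 2, 7), (4, 1, 2), (2, 4, 1), (3, 4, 10)]
result = vertex_results(oblivious_gather(vertices, edges))
```

The pass visits the |V| + |E| tuples in a fixed order and performs identical operations on each, so its access pattern reveals only the total size, not which edges point to which vertices.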

Complexity of Our Algorithms
Columns: sequential insecure (total work); naïve oblivious (total work); parallel oblivious (total work and parallel time).
Scatter/Gather: O(|E|); O(|V|^2); O(|E| log |V|) and O(log |V|).
Apply: O(|V|), O(|E|), O(1). 24
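As a worked check of the blowup figures quoted earlier in the talk, the ratios for Scatter/Gather follow directly from these bounds. The only assumption added here is that "sparse" means |E| = Θ(|V|).

```latex
\[
\text{GraphSC blowup: } \frac{|E|\log|V|}{|E|} = \log|V|,
\qquad
\text{naïve blowup: } \frac{|V|^{2}}{|E|} = \Theta(|V|)
\quad \text{when } |E| = \Theta(|V|).
\]
```

For instance, with |V| = 2^20 the first ratio is about 20 (taking logarithms base 2), while the second is on the order of a million.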

Algorithms on GraphSC: ✓ Histogram computation; ✓ PageRank; ✓ Matrix factorization using gradient descent; ✓ Matrix factorization using alternating least squares; Bellman-Ford shortest path; bipartite matching; parallel empirical risk minimization through the alternating direction method of multipliers (ADMM). 25

Experimental Setup. Cloud 1 (Garblers) … Cloud 2 (Evaluators) … Two scenarios: 1. LAN; 2. across data centers (WAN). 26

Key Evaluation Results (parallel time on 32 processors)
Histogram: input size 1K – 0.5M; 4 sec – 34 min
PageRank (1 iteration): input size 4K – 128K; 20 sec – 15.5 min
Matrix Factorization (1 iteration) using GD: input size 1K – 32K; 47 sec – 34 min
Matrix Factorization (1 iteration) using ALS: input size 64 – 4K; 2 min – 2.35 hours 27

Running at Scale. Matrix factorization using gradient descent: 1M ratings, 6K users, 4K movies [KBV’09]. Time taken: ~13 hours (1 iteration) on a 7-machine cluster with 128 processors and 525 GB RAM. We used only 7 machines; the time can be reduced by using more machines. Comparison with [NIWJTB’13]: max 16K ratings (64x smaller data); for 4K ratings with 32 threads, 1.4 hours versus < 4 mins. 28

Across Data Centers: PageRank. Garblers: Oregon. Evaluators: N. Virginia. Bandwidth provisioned: 2 Gbps. Time reduces linearly with increasing processors. 29

Conclusion: GraphSC is a parallel secure computation framework for graph-parallel algorithms. www.oblivm.com Thank You! kartik@cs.umd.edu 30