Privacy and Fault-Tolerance in Distributed Optimization
Nitin Vaidya, University of Illinois at Urbana-Champaign
Acknowledgements: Shripad Gade, Lili Su
Applications
• fi(x) = cost for robot i to go to location x
• Minimize total cost of rendezvous: Σi fi(x)
[figure: f1(x) and f2(x) plotted over candidate locations x1, x2]
Applications
• Learning: minimize cost Σi fi(x)
[figure: agents holding f1(x), f2(x), f3(x), f4(x)]
Outline
• Distributed Optimization
• Privacy
• Fault-tolerance
Distributed Optimization
Client-Server Architecture
[figure: server connected to clients holding f1(x), f2(x), f3(x), f4(x)]
Variations
• Stochastic
• Asynchronous
• …
Peer-to-Peer Architecture
[figure: networked agents holding f1(x), f2(x), f3(x), f4(x)]
Peer-to-Peer Architecture
• Each agent maintains a local estimate x
• Consensus step with neighbors
• Apply own gradient to own estimate
(see the sketch below)
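A minimal sketch of this consensus-plus-gradient iteration, assuming illustrative scalar quadratic costs fi(x) = (x − ci)² and a fixed doubly stochastic weight matrix for a 4-agent ring (none of these specifics are from the talk):

```python
import numpy as np

# Illustrative costs: f_i(x) = (x - c_i)^2, so grad f_i(x) = 2 (x - c_i).
targets = np.array([1.0, 3.0, 4.0, 8.0])   # assumed minimizers c_i

# Doubly stochastic weights for a 4-agent ring (rows sum to 1).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

x = np.zeros(4)                            # each agent's local estimate
for t in range(1, 1001):
    x = W @ x                              # consensus step with neighbors
    x -= (1.0 / t) * 2.0 * (x - targets)   # own gradient on own estimate

print(x)   # all entries approach mean(targets) = 4.0 = argmin of sum_i f_i(x)
```

The averaging pulls the estimates together while each agent's gradient pulls toward its own minimizer; with a diminishing step size the two effects balance at the global minimizer.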
Outline
• Distributed Optimization
• Privacy
• Fault-tolerance
[figure: clients upload gradients to the server]
• Server observes gradients ⇒ privacy compromised
• Goal: achieve privacy and yet collaboratively optimize
Related Work
• Cryptographic methods (homomorphic encryption)
• Function transformation
• Differential privacy
Differential Privacy
[figure: clients and server]
• Trades off privacy against accuracy
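For reference, the differential-privacy baseline can be sketched as each client perturbing its gradient before upload; the clipping bound and noise scale below are assumed parameters, and larger noise means stronger privacy but worse accuracy:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_gradient(true_grad, clip=1.0, sigma=4.0):
    """Gaussian-mechanism-style perturbation (sketch): bound the gradient's
    norm, then add independent Gaussian noise scaled to that bound."""
    g = np.asarray(true_grad, dtype=float)
    norm = np.linalg.norm(g)
    if norm > clip:
        g = g * (clip / norm)              # clip to norm <= clip
    return g + rng.normal(scale=sigma * clip, size=g.shape)
```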
Proposed Approach
• Motivated by secret sharing
• Exploit diversity … multiple servers / neighbors
Proposed Approach
[figure: clients connected to Server 1 and Server 2]
• Privacy preserved if only a subset of the servers is adversarial
Proposed Approach
[figure: peer-to-peer network]
• Privacy preserved if only a subset of the neighbors is adversarial
Proposed Approach
• Structured noise that “cancels” over servers/neighbors
Intuition
[figure: Server 1 and Server 2 maintain estimates x1, x2; clients interact with both]
• Each client simulates multiple clients
Algorithm
[algorithm details (equations) lost in extraction]
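The slide's equations did not survive extraction, but the secret-sharing intuition can be sketched: a client's gradient is split into additive shares, one per server, so any strict subset of servers sees only masked values while the shares still sum to the true gradient. This is an illustrative sketch, not the talk's exact algorithm (which splits the cost function itself, with each client simulating multiple virtual clients):

```python
import numpy as np

rng = np.random.default_rng(1)

def gradient_shares(true_grad, n_servers=2):
    """Additive secret sharing of one client's gradient (illustrative):
    the first n-1 shares are pure random masks, and the last share is
    chosen so that all shares sum to the true gradient."""
    g = np.asarray(true_grad, dtype=float)
    shares = [rng.normal(size=g.shape) for _ in range(n_servers - 1)]
    shares.append(g - sum(shares))
    return shares

s1, s2 = gradient_shares([0.5, -1.2])
assert np.allclose(s1 + s2, [0.5, -1.2])   # masks cancel when combined
```

Because the masks cancel across servers, the combined update equals the unmasked one even though each server individually learns nothing about the client's gradient.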
Claim
• Under suitable assumptions, the servers eventually reach consensus on a minimizer of Σi fi(x)
Privacy
[figures lost in extraction: views of Server 1 and Server 2]
• Function splitting not necessarily practical
• Structured randomization as an alternative
Structured Randomization
• Multiplicative or additive noise in gradients
• Noise cancels over servers
Multiplicative Noise
[figure: each client sends differently scaled gradients to Server 1 and Server 2, chosen so the scalings cancel when combined]
• It suffices for this invariant to hold over a larger number of iterations
• Noise from client i to server j is not zero-mean
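A minimal sketch of the multiplicative variant, assuming the invariant is that each client's random scalings sum to 1 across the servers in every iteration (the slide notes the invariant may instead hold over a window of iterations):

```python
import numpy as np

rng = np.random.default_rng(2)

def scaled_shares(true_grad, n_servers=2):
    """Send w_j * gradient to server j with sum_j w_j == 1, so the
    scalings cancel when the servers' updates are combined. The
    per-server weights are random but not zero-mean, as on the slide."""
    g = np.asarray(true_grad, dtype=float)
    w = rng.uniform(-1.0, 2.0, size=n_servers)
    w += (1.0 - w.sum()) / n_servers           # enforce sum(w) == 1
    return [wj * g for wj in w]

shares = scaled_shares([0.5, -1.2])
assert np.allclose(sum(shares), [0.5, -1.2])   # noise cancels over servers
```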
Claim
• Under suitable assumptions, the servers eventually reach consensus on a minimizer of Σi fi(x)
Peer-to-Peer Architecture
Reminder …
• Each agent maintains a local estimate x
• Consensus step with neighbors
• Apply own gradient to own estimate
Proposed Approach
• Each agent shares a noisy estimate with neighbors (see the sketch below)
  • Scheme 1 – noise cancels over neighbors
  • Scheme 2 – noise cancels network-wide
[figure: agent sends x + ε1 to one neighbor and x + ε2 to another, with ε1 + ε2 = 0 (over iterations)]
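A hedged sketch of Scheme 1: each agent sends a differently perturbed copy of its estimate to each neighbor, with perturbations constructed to sum to zero so they cancel in the neighbors' aggregate (per the slide, the zero-sum constraint could equally be enforced over iterations rather than per step):

```python
import numpy as np

rng = np.random.default_rng(3)

def noisy_copies(x, n_neighbors):
    """Perturbations eps_1, ..., eps_k with eps_1 + ... + eps_k = 0:
    neighbor j receives x + eps_j, but the noise cancels in aggregate."""
    eps = rng.normal(size=n_neighbors)
    eps -= eps.mean()                  # enforce zero sum
    return [x + e for e in eps]

copies = noisy_copies(2.5, 3)
assert np.isclose(sum(copies), 3 * 2.5)   # aggregate is noise-free
```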
Peer-to-Peer Architecture
• Poster today: Shripad Gade
Outline
• Distributed Optimization
• Privacy
• Fault-tolerance
Fault-Tolerance
• Some agents may be faulty
• Need to produce “correct” output despite the faults
Byzantine Fault Model
• No constraint on misbehavior of a faulty agent
• May send bogus messages
• Faulty agents can collude
Peer-to-Peer Architecture
• fi(x) = cost for robot i to go to location x
• A faulty agent may choose an arbitrary cost function
[figure: f1(x) and f2(x) over candidate locations x1, x2]
Client-Server Architecture
[figure: server and clients]
Fault-Tolerant Optimization
• The original problem, minimize Σi fi(x), is not meaningful
• Optimize cost over only the non-faulty agents: minimize Σi∈good fi(x) … Impossible!
Fault-Tolerant Optimization
• Optimize a weighted cost over only the non-faulty agents: minimize Σi∈good αi fi(x)
• With αi as close to 1/|good| as possible
• With t Byzantine faulty agents: t weights may be 0
• With t Byzantine agents out of n total: at least n − 2t weights are guaranteed to be > 1/(2(n − t))
  (e.g., for n = 5, t = 1: at least 3 weights exceed 1/8)
Centralized Algorithm
• Of the n agents, any t may be faulty
• How to filter out the cost functions of faulty agents?
Centralized Algorithm: Scalar argument x
Define a virtual function G(x) whose gradient is obtained as follows. At a given x:
• Sort the gradients of the n local cost functions
• Discard the smallest t and the largest t gradients
• Mean of the remaining gradients = gradient of G at x
The virtual function G(x) is convex ⇒ can optimize easily (see the sketch below)
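A minimal sketch of this trimmed-mean filter on scalar gradients, with assumed quadratic costs and one planted Byzantine value:

```python
import numpy as np

def grad_G(gradients, t):
    """Gradient of the virtual function G: sort the n reported gradients,
    discard the t smallest and t largest, and average the rest."""
    g = np.sort(np.asarray(gradients, dtype=float))
    return g[t:len(g) - t].mean()

# Honest costs f_i(x) = (x - c_i)^2; one faulty agent reports garbage.
targets = np.array([1.0, 2.0, 3.0, 4.0])   # assumed honest minimizers
t, x = 1, 0.0
for k in range(1, 501):
    reports = list(2.0 * (x - targets)) + [1e9]   # Byzantine gradient
    x -= (1.0 / k) * grad_G(reports, t)

print(x)   # the huge faulty gradient is always discarded by the trim
```

Any faulty value that survives the trim is sandwiched between honest gradients, which is what keeps G well-behaved for scalar arguments.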
Peer-to-Peer Fault-Tolerant Optimization
• Gradient filtering similar to the centralized algorithm
  … requires “rich enough” connectivity
  … correlation between functions helps
• Vector case is harder
  … redundancy between functions helps
Summary
• Distributed Optimization
• Privacy
• Fault-tolerance
Thanks! disc.ece.illinois.edu
Distributed Peer-to-Peer Optimization
Each agent maintains a local estimate x. In each iteration:
• Compute weighted average with neighbors’ estimates
• Apply own gradient to own estimate
Local estimates converge to a minimizer of Σi fi(x)
RSS – Locally Balanced
RSS – Network Balanced
Convergence
Function Sharing
Function Sharing – Convergence
[backup-slide details lost in extraction]