Polylogarithmic Private Approximations and Efficient Matching
Piotr Indyk (MIT), David Woodruff (MIT, Tsinghua)
TCC 2006

Secure communication
• Alice holds a ∈ {0, 1}^n, Bob holds b ∈ {0, 1}^n
• Want to compute some function F(a, b)
• Security: protocol does not reveal anything except for the value F(a, b)
  – Semi-honest: both parties follow the protocol
  – Malicious: parties are adversarial
• Efficiency: want to exchange few bits

Secure Function Evaluation (SFE)
• [Yao, GMW]: If F is computed by a circuit C, then F can be computed securely with O~(|C|) bits of communication
• [GMW] + … + [NN]: can assume the parties are semi-honest
  – A semi-honest protocol can be compiled to give security against malicious parties
• Problem: circuit size is at least linear in n
* O~() hides factors poly(k, log n)

Secure and Efficient Function Evaluation
• Can we achieve sublinear communication?
• With sublinear communication, many interesting problems can be solved only approximately.
• What does it mean to have a private approximation?
• Efficiency: want SFE with communication comparable to the insecure case

Private Approximation
• [FIMNSW01]: A protocol computing an approximation G(a, b) of F(a, b) is private if each party can simulate its view of the protocol given the exact value F(a, b)
• Not sufficient to simulate non-private G(a, b) using SFE
• Example (Δ(a, b) is the Hamming distance, bin(·)_i is the i-th bit of the binary representation):
  – Define G(a, b):
    • bin(G(a, b))_i = bin(Δ(a, b))_i if i > 0
    • bin(G(a, b))_0 = a_0
  – G(a, b) approximates Δ(a, b) to within additive error 1, but is not private
• Popular protocols for approximating Δ(a, b), e.g., [KOR98], are not private
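The example above can be made concrete with a small sketch. The function names and the sample inputs are illustrative assumptions, not part of the slides; the point is only that the output stays within 1 of the true distance while its low bit reveals a_0.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def leaky_approx(a, b):
    """G(a, b): the Hamming distance with its lowest bit overwritten by a[0]."""
    d = hamming(a, b)
    return (d & ~1) | a[0]

# Hypothetical inputs, chosen only for illustration.
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [0, 0, 1, 0, 0, 1, 1, 1]

g = leaky_approx(a, b)
print("distance:", hamming(a, b), "G:", g, "leaked a_0:", g & 1)
```

Whoever sees G(a, b) learns a_0 exactly, which cannot be simulated from Δ(a, b) alone, so computing G inside a secure circuit does not help.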

Approximating Hamming Distance
• [FIMNSW01]: A private protocol with complexity O~(n^(1/2)/ε)
  – Δ(a, b) small: compute Δ(a, b) exactly in O~(Δ(a, b)) bits
  – Δ(a, b) high: sample O~(n/Δ(a, b)) coordinates (a-b)_i, estimate Δ(a, b)
• Our main result:
  – Complexity: O~(1/ε^2) bits
  – Works even for the L2 norm, i.e., estimates ||a-b||_2 for a, b ∈ {1…M}^n
* O~() hides factors poly(k, log n, log M, log 1/δ)

Crypto Tools
• Efficient 1-out-of-n oblivious transfer (OT):
  – P1 has A[1], …, A[n] ∈ {0, 1}^m, P2 has i ∈ [n]
  – Goal: P2 privately learns A[i], P1 learns nothing
  – Can be done using O~(m) communication [CMS99, NP99]
• Circuits with ROM [NN01] (augments [Yao86]):
  – Standard AND/OR/NOT gates
  – Lookup gates:
    • In: i
    • Out: M_gate[i]
  – Can just focus on privacy of the output
  – Communication at most O~(m|C|)

High-dimensional tools
• Random projection:
  – Take a random orthonormal n × n matrix D, that is, ||Dx|| = ||x|| for all x.
  – There exists c > 0 s.t. for any x ∈ R^n and any i = 1…n:
    Pr[ (Dx)_i^2 > k · ||Dx||^2 / n ] < e^(-ck)
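A minimal sketch of the rotation, assuming D is sampled by Gram-Schmidt orthogonalization of Gaussian vectors (one standard way to obtain a uniformly random orthonormal matrix; the slides do not say how D is generated). The helper names, the dimension, and the seed are illustrative.

```python
import math
import random

def random_orthonormal(n, rng):
    """Return n orthonormal rows obtained by Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:
            dot = sum(vi * ui for vi, ui in zip(v, u))
            v = [vi - dot * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-9:  # resample in the degenerate (probability ~0) case
            rows.append([vi / norm for vi in v])
    return rows

def apply(D, x):
    """Matrix-vector product Dx."""
    return [sum(dij * xj for dij, xj in zip(row, x)) for row in D]

def sq_norm(x):
    return sum(xi * xi for xi in x)

rng = random.Random(0)
n = 32
D = random_orthonormal(n, rng)
x = [100.0] + [0.0] * (n - 1)   # all mass on one coordinate
y = apply(D, x)

print("norm preserved:", math.isclose(sq_norm(y), sq_norm(x), rel_tol=1e-9))
print("largest share of mass in one coordinate:", max(yi * yi for yi in y) / sq_norm(y))
```

Before the rotation a single coordinate carries all of ||x||^2; afterwards the largest coordinate typically carries only a small multiple of 1/n of it, which is what the tail bound above quantifies.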

Approximating ||a-b||
• Recall:
  – Alice has a ∈ [M]^n, Bob has b ∈ [M]^n
  – Goal: privately estimate ||x||, where x = a-b
  – Suffices to estimate ||a-b||^2

Protocol Intuition
1. Alice and Bob agree upon a random orthonormal matrix D
  • Efficient by exchanging a seed of a PRG
2. Alice and Bob rotate vectors a, b, obtaining Da, Db
  • ||Da-Db|| = ||a-b||
  • D “spreads the mass” of the difference vector uniformly across the n coordinates.
  • Can now try obliviously sampling coordinates as in [FIMNSW01]

Protocol Intuition Cont'd
1. Alice and Bob agree upon random orthonormal D
2. Alice and Bob rotate a, b, obtaining Da, Db
3. Use secure circuit with ROMs Da and Db to:
  i. Circuit obtains (Da)_i and (Db)_i for many random indices i
Problem: Now what? Samples leak a lot of info!
Fix:
  – Suppose you know an upper bound T with T ≥ ||a-b||^2
  – Flip a coin z with heads probability n((Da)_i - (Db)_i)^2/(kT)
  – Then E[z] = n||Da-Db||^2/(nkT) = ||a-b||^2/(kT)
  – E[z] only depends on ||a-b||, and z only depends on E[z]!
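The expectation in the fix can be checked numerically. In this sketch the vector y stands for the rotated difference Da-Db, and its entries, k, and T are made-up values; averaging the per-coordinate heads probability over a uniformly random index i gives ||y||^2/(kT).

```python
import math

def coin_biases(y, k, T):
    """Heads probability n*y_i^2/(k*T) for each coordinate of y = Da - Db."""
    n = len(y)
    return [n * yi * yi / (k * T) for yi in y]

# Hypothetical rotated difference vector and parameters (assumptions for illustration).
y = [3.0, -1.0, 2.0, 0.5, -2.5, 1.5, 0.0, -1.0]
sq_norm_y = sum(yi * yi for yi in y)
k = 10.0
T = 2 * sq_norm_y          # any upper bound T >= ||a-b||^2 would do

biases = coin_biases(y, k, T)
# E[z] over a uniformly random index i is the average of the biases.
expected_z = sum(biases) / len(biases)
print(expected_z, sq_norm_y / (k * T))
```

The individual biases still depend on the coordinates, which is why the protocol outputs only the coin z and never the sampled values themselves.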

Protocol Intuition Cont'd
1. Alice and Bob agree upon random orthonormal D
2. Alice and Bob rotate a, b, obtaining Da, Db
3. Use secure circuit with ROMs Da, Db to:
  i. Obtain (Da)_i and (Db)_i for L random i
  ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
  iii. Output kT Σ_i z_i / L
Privacy: View only depends on ||a-b||
Problem: Correctness! The a priori bound is T = M^2 n, but ||a-b||^2 may be Θ(1), so Ω(n) samples are required.
Fix: Private binary search on T
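The estimator in step iii, and the correctness problem with a loose bound T, can be simulated directly. This is a sketch outside any secure computation: the coins are drawn from their mean ||a-b||^2/(kT), and the parameter values and seed are assumptions chosen for illustration.

```python
import random

def estimate(sq_norm, k, T, L, rng):
    """Draw L Bernoulli coins with mean sq_norm/(k*T) and return k*T*sum(z)/L."""
    p = sq_norm / (k * T)
    hits = sum(rng.random() < p for _ in range(L))
    return k * T * hits / L

rng = random.Random(1)

# T within a factor 2 of ||a-b||^2: the estimate concentrates around 400.
tight = estimate(sq_norm=400.0, k=10, T=800.0, L=20000, rng=rng)

# A priori bound T = M^2 * n with M = 100, n = 1000 while ||a-b||^2 = 1:
# each coin has mean 1e-8, so almost every run sees no heads at all.
loose = estimate(sq_norm=1.0, k=10, T=100**2 * 1000, L=20000, rng=rng)

print("tight T:", tight, "loose T:", loose)
```

With the loose bound the output is almost always 0 unless L grows with kT/||a-b||^2, which motivates searching for a T close to ||a-b||^2.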

Protocol Intuition Cont'd
…
3. Use secure circuit with ROMs Da, Db to:
  i. Obtain (Da)_i and (Db)_i for L random i
  ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
  iii. Output kT Σ_i z_i / L
Fix:
  – Private binary search on T
  – If many z_i = 0, then intuitively can replace T with T/2
  – Eventually T = Θ~(||a-b||^2)
  – We will show: final choice of T is simulatable!

One last detail
• Want to show final choice of T is simulatable
• Estimate is kT Σ_i z_i / L, and we stop when “many” z_i = 1
• Recall E[z_i] = ||a-b||^2/(kT)
Key Observation: Since the orthonormal D is uniformly random, can guarantee that if many z_i = 0, then ||a-b||^2 << T.
Note:
  – Suppose we didn’t use D, and a = (M, 0, …, 0), b = (0, …, 0)
  – Then ||a-b||^2 = M^2 is large, but almost always z_i = 0, so you’ll choose T < ||a-b||^2.
  – Not simulatable, since T depends on the structure of a, b
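The counterexample in the note can be worked out exactly. The sketch assumes the coin bias n(a_i-b_i)^2/(kT) is capped at 1 when it exceeds 1 (a detail the slides do not spell out) and uses made-up values for n, M, and k.

```python
import math

def heads_probability_without_rotation(n, M, k, T):
    """Pr[z = 1] for a = (M, 0, ..., 0), b = 0 when coordinates are sampled directly.

    Only index 0 has a nonzero difference; its bias n*M^2/(k*T) is capped at 1
    (the cap is an assumption of this sketch).
    """
    return (1.0 / n) * min(1.0, n * M * M / (k * T))

def heads_probability_intended(M, k, T):
    """The value ||a-b||^2/(k*T) that the coin is supposed to have."""
    return M * M / (k * T)

n, M, k = 10**6, 100, 10
T = M * M                      # T already equals ||a-b||^2
actual = heads_probability_without_rotation(n, M, k, T)
intended = heads_probability_intended(M, k, T)
print("without D:", actual, "intended:", intended)
```

Even with T equal to ||a-b||^2 the unrotated coins come up heads with probability only 1/n instead of 1/k, so the search would keep halving T well below ||a-b||^2, and where it stops reveals how concentrated a-b is.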

Algorithm vs. Simulation
• Repeat
  – Generate L independent bits z_i such that
    • ALGORITHM: Pr[z_i = 1] = ||D(a-b)||^2/(Tk)
    • SIMULATION: Pr[z_i = 1] = ||a-b||^2/(Tk)
  – T = T/2
• Until Σ_i z_i ≥ (L/k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||a-b||^2
Recall: ||D(a-b)|| = ||a-b||
Communication = O~(L) = O~(1/ε^2)
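The simulation side of the loop can be sketched as follows. It receives only ||a-b||^2, which is the point of the comparison: the halving schedule and the output are determined by that value alone. The stopping threshold L/k, the requirement that ||a-b||^2 > 0, the cap on the coin bias, and the parameter values are assumptions of this sketch.

```python
import random

def simulate_estimate(sq_norm, T0, k, L, rng):
    """Halve T until at least L/k coins come up heads, then rescale.

    Assumes sq_norm > 0; otherwise the loop would never stop.
    """
    if sq_norm <= 0:
        raise ValueError("sketch assumes ||a-b||^2 > 0")
    T = T0
    while True:
        p = min(1.0, sq_norm / (T * k))   # cap is an assumption of this sketch
        hits = sum(rng.random() < p for _ in range(L))
        T = T / 2
        if hits >= L / k:
            # T was already halved, so 2*T is the value the coins were drawn with.
            return hits / L * 2 * T * k

rng = random.Random(2)
M, n = 10, 1000
estimate = simulate_estimate(sq_norm=400.0, T0=M * M * n, k=10, L=20000, rng=rng)
print("estimate of ||a-b||^2:", estimate)
```

Starting from the a priori bound T0 = M^2 n, the loop needs only about log(T0/||a-b||^2) rounds of L coins each before the threshold is met.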

Other Results
• Use homomorphic encryption tricks to get better upper bounds for private nearest neighbor and private all-pairs nearest neighbors.
• Define the private approximate nearest neighbor problem:
  – Requires a new definition of private approximations for functionalities that can return sets of values.
  – Achieve small communication in this setting.