Fast Searchable Encryption with Tunable Locality
Ioannis Demertzis, University of Maryland, yannis@umd.edu
Charalampos Papamanthou, University of Maryland, cpap@umd.edu
Cloud Computing
Pros:
+ Near-infinite scalability for big data analytics
+ Easy and ubiquitous access on solid data
+ Cost reduction through the use of shared infrastructure
+ Affordable for small and medium businesses
Cons:
- Serious security and privacy concerns about outsourcing and querying private company or personal data
Solution: Privacy Preserving DBMS
Obstacles to Overcome (2009 -> 2015 -> 2017)
Gartner says the worldwide cloud services market is forecast to reach $383 billion in 2020
IDEAL SOLUTION: Privacy Preserving DBMS
Client -> Untrusted Cloud: Encrypt(DB), kept as the Encrypted Database
Later: Client -> Untrusted Cloud: Encrypted(query)
Untrusted Cloud -> Client: Encrypted(results)
Solutions for Encrypted Search
Demertzis, Papadopoulos, Papapetrou, Deligiannakis, Garofalakis, “Practical Private Range Search Revisited”, SIGMOD 2016
[Figure: efficiency (low to high) versus security (low to high). Labels shown: CryptDB, Cipherbase, OPE, DET, MONOMI, SSE, Google BigQuery, Microsoft SQL 2016 Always Encrypted, Functional Enc, Oblivious RAM, FHE. Region labels: Efficient, Secure, Secure & Efficient.]
Not all points are explained in depth (feel free to ask me during the poster session!)
Our Contribution
In this work:
§ A new scalable Searchable Encryption (SE) scheme with good locality
§ 12x more efficient than the state-of-the-art in-memory SE
§ Up to 2-3 orders of magnitude fewer false positives than the external-memory SE
§ Space, read efficiency, locality, parallelism and bandwidth can be tuned to achieve optimal performance
§ Formal proof based on widely adopted CRYPTO security definitions
What is Searchable Encryption?
The client sends a search query (a keyword) to the untrusted cloud.
Leakage is the amount of information that the untrusted cloud learns.
Searchable Encryption (SE) schemes
[Figure: client and untrusted cloud. The cloud holds an encrypted index over the keywords k1, k2, k3 and the encrypted files F1 to F6.]
Searchable Encryption (SE) schemes
L1 leakage: total leakage prior to query execution, e.g. the size of each encrypted file and the size of the encrypted index
[Figure: client and untrusted cloud holding the encrypted index over k1, k2, k3 and the encrypted files F1 to F6.]
Searchable Encryption (SE) schemes
L2 leakage (leakage during query execution):
Search pattern: whether a search query is repeated
Access pattern: encrypted document ids and files that satisfy the search query
[Figure: the client derives a token for k1 with PRFsk() and sends it to the untrusted cloud, which uses it on the encrypted index.]
Searchable Encryption (SE) schemes
L2 leakage (leakage during query execution):
Search pattern: whether a search query is repeated
Result size
[Figure: the client sends the token for k1 to the untrusted cloud; the indexed records are tuples such as T1 = (John Smith, CMU, 27, $3,000), T2 = (Alice Lu, UCLA, 28, $4,000), …, TN = (Bruce William, UMD, 30, $2,000).]
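The token-based search and its leakage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes HMAC-SHA256 as the PRF, a toy XOR pad for the stored ids, and hypothetical names (`prf`, `make_token`, `build_index`, `search`) and a hypothetical keyword-to-file mapping chosen only for illustration.

```python
import hashlib
import hmac


def prf(key: bytes, message: bytes) -> bytes:
    """PRF instantiated with HMAC-SHA256 (an assumption of this sketch)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_token(sk: bytes, keyword: str) -> bytes:
    """Client side: deterministic per-keyword token, PRF_sk(keyword)."""
    return prf(sk, keyword.encode())


def build_index(sk: bytes, inverted_index: dict) -> dict:
    """Client side: encrypted dictionary mapping pseudorandom labels to masked ids."""
    index = {}
    for keyword, file_ids in inverted_index.items():
        token = make_token(sk, keyword)
        for counter, file_id in enumerate(file_ids):
            c = counter.to_bytes(4, "big")
            raw = file_id.encode()
            assert len(raw) <= 32, "toy pad covers ids of at most 32 bytes"
            pad = prf(token, b"pad" + c)
            masked = bytes(x ^ y for x, y in zip(raw.ljust(32, b"\0"), pad))
            index[prf(token, b"label" + c)] = masked
    return index


def search(index: dict, token: bytes) -> list:
    """Server side: walk the counter until a label is missing, unmasking each id."""
    results = []
    counter = 0
    while True:
        c = counter.to_bytes(4, "big")
        masked = index.get(prf(token, b"label" + c))
        if masked is None:
            return results
        pad = prf(token, b"pad" + c)
        plain = bytes(x ^ y for x, y in zip(masked, pad))
        results.append(plain.rstrip(b"\0").decode())
        counter += 1


sk = b"k" * 32  # hypothetical client secret key
index = build_index(sk, {"k1": ["F4", "F2"], "k2": ["F3", "F6", "F4"], "k3": ["F5"]})
print(search(index, make_token(sk, "k1")))
```

In this sketch the server sees the index size before any query (L1 leakage), sees identical tokens for repeated queries (search pattern), and learns which ids match and how many there are (access pattern and result size).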
Searchable Encryption – Locality and Read Efficiency
Locality: number of non-contiguous reads for each query.
Read efficiency: number of memory locations read per result item.
[Figure: PiBas layout for k1: locality = 3 & read efficiency = 1. Second layout padded with false positives (marked X): locality = 1 & read efficiency = O(N).]
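The two metrics can be made concrete with a small sketch. The function names and the example memory positions are hypothetical; the code only counts contiguous runs and cells read for a given set of positions.

```python
def locality(positions_read):
    """Number of non-contiguous reads: maximal runs of consecutive positions."""
    positions = sorted(set(positions_read))
    if not positions:
        return 0
    runs = 1
    for previous, current in zip(positions, positions[1:]):
        if current != previous + 1:
            runs += 1
    return runs


def read_efficiency(positions_read, result_size):
    """Memory locations read per result item."""
    return len(set(positions_read)) / result_size


# Three results scattered across memory: one read per item.
scattered = [2, 7, 11]
print(locality(scattered), read_efficiency(scattered, 3))

# The same three results found by scanning one contiguous region of 12 cells.
scan = list(range(12))
print(locality(scan), read_efficiency(scan, 3))
```

The scattered layout reads nothing superfluous but jumps three times, while the scan is a single sequential read that touches cells holding false positives.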
Searchable Encryption – Lower Bound
Cash and Tessaro, Eurocrypt 2014: O(1) locality and O(1) read efficiency require ω(N) space.
[Figure: layout with locality = 1 & read efficiency = 1.]
With k distinct result sizes: O(1) locality and O(1) read efficiency require O(kN) space.
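Security Game
[Figure: the adversary interacts either with the real scheme or with a simulator. Real scheme: the encrypted input together with the tokens token1, …, tokenN for the queries w1, …, wN. Simulator: receives only the L1 leakage of the encrypted input and, for each query w1, …, wN, the leakage L2(w1), …, L2(wN).]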
Searchable Encryption - Related Work
Scheme                                      Locality       Read Efficiency      Space
1st generation of SE schemes (PiBas)        Θ(|result|)    O(1)                 O(N)
Asharov et al. STOC 2016, N log N scheme    O(1)           O(1)                 O(N log N)
Asharov et al. STOC 2016, OneChoiceAlloc    O(1)           Θ(log N loglog N)    O(N)
Our scheme with optimal locality            O(1)           O(N^(1/(s+1)))       O(sN)
Our scheme with O(L) locality               O(L)           O(N^(1/s)/L)         O(sN)
Our scheme with O(R) read efficiency        O(N^(1/s)/R)   O(R)                 O(sN)
Cash et al. EUROCRYPT 2014, lower bound     O(1)           O(1)                 ω(N)
Asharov et al. STOC 2016 – OneChoiceAlloc Scheme
[Figure: the lists of keywords k1, k2, k3, … are allocated to bins. Labels shown: 3 log N loglog N; M = N / (log N loglog N).]
O(N) space, O(1) locality and Θ(log N loglog N) read efficiency
Optimal Locality Scheme and Read Efficiency
With k distinct result sizes: O(1) locality and O(1) read efficiency require O(kN) space
Dataset: N=16, stored in log N + 1 encrypted arrays, one for each of |k| = 16, 8, 4, 2, 1
O(N log N) space, O(1) locality and O(1) read efficiency
Optimal Locality Scheme and Read Efficiency
With k distinct result sizes: O(1) locality and O(1) read efficiency require O(kN) space
Dataset: N=16, stored in log N + 1 encrypted arrays, one for each of |k| = 16, 8, 4, 2, 1
Level i has N/2^i buckets of size 2^i
O(N log N) space, O(1) locality and O(1) read efficiency
Optimal Locality Scheme and Read Efficiency
Input dataset, N=16, stored in log N + 1 encrypted arrays
[Figure: example keywords k1 to k5 shown next to the arrays for |k| = 16, 8, 4, 2, 1.]
O(N log N) space, O(1) locality and O(1) read efficiency
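The level layout for N=16 can be sketched as follows. The function names are hypothetical, and rounding a result size up to the next power of two is an assumption of this sketch, since the slides only show sizes that are already powers of two.

```python
def levels(n):
    """(level, number of buckets, bucket size) for levels 0..log2(n), n a power of two."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    return [(i, n >> i, 1 << i) for i in range(n.bit_length())]


def level_for(result_size):
    """Smallest level whose bucket size 2^i can hold the whole result list."""
    assert result_size >= 1
    return (result_size - 1).bit_length()


layout = levels(16)
for level, num_buckets, bucket_size in layout:
    print(f"level {level}: {num_buckets} buckets of size {bucket_size}")

# Every level holds N cells, so log N + 1 levels take N * (log N + 1) cells.
total_cells = sum(num_buckets * bucket_size for _, num_buckets, bucket_size in layout)
print(total_cells)
```

Because a list always sits inside a single bucket of its own level, one contiguous read returns it, which is where the O(1) locality comes from.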
Our Approach – Optimal Locality Scheme
Keep only s=3 of the encrypted arrays for |k| = 16, 8, 4, 2, 1
O(sN) space, O(1) locality and O(N^(1/s)) read efficiency
Our Approach – Optimal Locality Scheme
Keep only s=3 of the encrypted arrays for |k| = 16, 8, 4, 2, 1
Legend: Not stored; Stored but empty
O(sN) space, O(1) locality and O(N^(1/s)) read efficiency
Our Approach – Optimal Locality Scheme
Keep only s=1 encrypted array
Read efficiency per result size: |k|=16: 1; |k|=8: 2; |k|=4: 4; |k|=2: 8
Legend: Not stored; Stored but empty
Each stored level i requires 2N + 2^i space to avoid potential overflows
O(sN) space, O(1) locality and O(N^(1/s)) read efficiency
Our Approach – Optimal Locality Scheme
Keep only s=3 encrypted arrays, chosen as s evenly distributed levels
Maximum gap between stored levels is O(log N / s)
The worst-case read efficiency is O(2^(log N / s)) = O(N^(1/s))
O(sN) space, O(1) locality and O(N^(1/s)) read efficiency
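The effect of keeping only s levels can be checked numerically. This is a sketch under assumptions: the rule for picking the evenly spaced levels (`stored_levels`) and the rule of placing a list in the nearest stored level that can hold it are illustrative choices, not taken from the slides.

```python
def stored_levels(n, s):
    """Pick s evenly spaced levels out of 0..log2(n), always keeping the top one."""
    log_n = n.bit_length() - 1
    # Integer ceiling of j * log_n / s for j = 1..s.
    return sorted({-(-j * log_n // s) for j in range(1, s + 1)})


def read_efficiency_at(n, s, result_size):
    """Cells read per result item when the list lives in the nearest stored level."""
    needed = (result_size - 1).bit_length()
    level = next(l for l in stored_levels(n, s) if l >= needed)
    return (1 << level) / result_size


# N=16, s=1: only the level with buckets of size 16 is kept.
for size in (16, 8, 4, 2):
    print(size, read_efficiency_at(16, 1, size))

# N=2^12, s=3: stored levels are 4, 8 and 12, so the gap is log N / s = 4.
worst = max(read_efficiency_at(4096, 3, size) for size in range(1, 4097))
print(stored_levels(4096, 3), worst)
```

With s=1 and N=16 the sketch gives 1, 2, 4 and 8 for result sizes 16, 8, 4 and 2, and for N=2^12 with s=3 the worst case equals N^(1/s) = 16, reached by a list of size 1.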
Our Approach – Optimal Read Efficiency
Keep only s=3 of the encrypted arrays for |k| = 16, 8, 4, 2, 1
O(N) space, O(N^(1/s)) locality and O(1) read efficiency
Our Approach – Constant Locality O(L)
Keep only s=1 of the encrypted arrays for |k| = 16, 8, 4, 2, 1 and tune L=4
O(N) space, O(L) locality and O(N^(1/s)/L) read efficiency
Our Approach – Constant Locality O(L)
Keep only s=1 of the encrypted arrays for |k| = 16, 8, 4, 2, 1 and tune L=4
Choose L = number of parallel processing units (servers)
O(N) space, O(L) locality and O(N^(1/s)/L) read efficiency
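One way to read the s=1, L=4 example is sketched below. It is an interpretation rather than the authors' construction: it assumes the single stored array uses buckets of size N/L and that a longer list is split across up to L buckets, each of which could be fetched by a separate processing unit. The function name `split_read` is hypothetical.

```python
def split_read(n, max_reads, result_size):
    """Return (buckets read, read efficiency) when buckets have size n / max_reads."""
    assert n % max_reads == 0, "sketch assumes L divides N"
    assert 1 <= result_size <= n
    bucket_size = n // max_reads
    buckets_read = -(-result_size // bucket_size)  # integer ceiling
    cells_read = buckets_read * bucket_size
    return buckets_read, cells_read / result_size


# N=16, L=4: buckets of size 4.
for size in (16, 5, 1):
    print(size, split_read(16, 4, size))
```

Under these assumptions the largest list needs L=4 reads with no false positives, and the smallest list needs one read of N/L = 4 cells, which matches the O(N^(1/s)/L) read efficiency stated for s=1.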
Our Approach – The full protocol
[Figure: the client queries k3; the untrusted cloud holds the Encrypted Dictionary and the Encrypted Arrays.]
Server filters out the false positives: minimizes the bandwidth, #PRFs = |result| * N^(1/s)
Client filters out the false positives: only 2 PRFs (e.g. level=2, offset=0), more bandwidth
Experiments
• 1 real dataset with 6,123,276 records used for in-memory evaluation
  • Query attribute: location description (173 distinct keywords)
• Synthetic dataset used for external memory evaluation
  • N = 2^47 - 1 records (~1 petabyte), |k| = 1, 2, 4, …, 2^46
• Java implementation:
  • Our scheme
  • PiBas, state-of-the-art for in-memory settings
  • OneChoiceAlloc, state-of-the-art for external memory
• 64-bit machine with Intel Xeon E5-2676 v3 and 64 GB RAM
Experiments – Index Costs (In-memory)
Experiments – Search Costs (In-memory)
[Figure: End-to-End Search Time; annotated factors: 12x, 340x.]
Experiments – False Positives
[Figure: False Positives for Different Sizes; annotated factors: 86x, 580x.]
Experiments – Search Time (External Memory)
[Figure: s=4; annotated factors: 86x, 580x.]
Experiments – Search Time (Real Dataset)
Conclusion – Future Work
In this work:
§ Formal proof based on widely adopted CRYPTO security definitions
§ 12x more efficient than the state-of-the-art in-memory SE
§ Up to 2-3 orders of magnitude fewer false positives than the external-memory SE
§ Our scheme provides various trade-offs between
  § Space
  § Read efficiency (false positives)
  § Locality
  § Parallelism
  § Bandwidth
  § Number of crypto operations
Thank you! Questions?
12x in-memory, 580x external memory
Tunable for arbitrary architectures
[Figure: efficiency (low to high) versus security (low to high). Labels shown: PPE, OPE, DET, SSE, Efficient Func/Pred Enc, ORAM, FHE. Region labels: Efficient, Secure, Secure & Efficient.]