Searchable Encryption with Optimal Locality Achieving Sublogarithmic Read

Searchable Encryption with Optimal Locality: Achieving Sublogarithmic Read Efficiency
Ioannis Demertzis, University of Maryland, yannis@umd.edu
Dimitris Papadopoulos, Hong Kong UST, dipapado@cse.ust.hk
Charalampos Papamanthou, University of Maryland, cpap@umd.edu

What is Searchable Encryption (SE)?
[Figure: client, search query (keyword), untrusted cloud]
Setup leakage: total leakage prior to query execution, e.g. the size of each encrypted file and the size of the encrypted index
Search pattern: whether a search query is repeated
Access pattern: encrypted document ids and files that satisfy the search query
Security (informal): the adversary does not learn anything beyond the above leakages!
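
To make the search-pattern leakage concrete, here is a minimal sketch. It assumes (the slide does not say this) that the client derives a deterministic search token from the keyword with HMAC-SHA256; the function name `search_token` and the key are hypothetical.

```python
import hashlib
import hmac

def search_token(key: bytes, keyword: str) -> bytes:
    """Deterministic token sent instead of the plaintext keyword (assumed construction)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()

key = b"hypothetical-client-secret-key"
first = search_token(key, "cloud")
repeat = search_token(key, "cloud")
other = search_token(key, "storage")

# The server never sees the keyword, but equal tokens reveal a repeated query.
print(first == repeat)  # True: this is the search pattern
print(first == other)   # False: different keywords give unrelated tokens
```

The access pattern is what the server additionally observes when it returns the matching encrypted entries for a token.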

Searchable Encryption – Locality and Read Efficiency
Scalable SE requires low locality and low read efficiency. Locality is an important efficiency dimension ([CJS+14], [DP 17], …).
Locality: number of non-contiguous reads for each query
Read Efficiency: number of memory locations read per result item
[Figure: two index layouts for the search query "keyword". In the first, the matching ids are read from three separate places: locality = 3 & read efficiency = 1. In the second, one sequential read also returns false positives (marked X): locality = 1 & read efficiency = O(N).]
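
The two measures can be computed from the set of memory positions a query touches. The sketch below is illustrative only: the function name and the example positions are hypothetical, and read efficiency is taken as locations read divided by result size, following the definition on the slide.

```python
def locality_and_read_efficiency(positions_read, result_size):
    """Return (number of contiguous runs read, locations read per result item)."""
    positions = sorted(set(positions_read))
    runs = 0
    previous = None
    for p in positions:
        # A new run starts whenever the position does not extend the previous one.
        if previous is None or p != previous + 1:
            runs += 1
        previous = p
    return runs, len(positions) / result_size

# Three scattered reads for three results: locality 3, read efficiency 1.
print(locality_and_read_efficiency([2, 7, 12], 3))   # (3, 1.0)
# One scan over 12 locations for three results: locality 1, read efficiency 4.
print(locality_and_read_efficiency(range(12), 3))    # (1, 4.0)
```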

Previous Works & Our Result
"Cash and Tessaro, Eurocrypt 2014"*: Locality (L): O(1) and Read Efficiency (R): O(1) requires Space (S): ω(N)
* under some assumptions for the SE scheme
General schemes:
[ANS+16] – N log N scheme: L: O(1), R: O(1), S: O(N log N)
[DP 17] – ReadOpt: L: O(N^(1/s)), R: O(1), S: O(sN)
[ANS+16] – OneChoiceAlloc: L: O(1), R: Õ(log N), S: O(N)
Schemes with a limitation on the maximum keyword-list size:
[ANS+16] – TwoChoiceAlloc*: L: O(1), R: Õ(log log N), S: O(N)
* keyword lists in the dataset have size less than N^(1-1/log log N)
[ASS 18]**: L: O(1), R: O(ω(1)·ε^(-1)(n) + log log log N) for n = N^(1-ε(N)), S: O(N)
** keyword lists in the dataset have size less than N/log^3 N
Our Approach: L: O(1), R: O(log^γ N), S: O(N), for γ > 2/3

Searchable Encryption – Naïve Approach 1
[Figure: keyword lists k1, k2, k3, with labels "<=3" and "<=4"]
locality = 1 & read efficiency = 1 & optimal space

Searchable Encryption – Naïve Approach 2
[Figure: keyword lists k1, k2, k3, followed by "?"]

[ANS+16] – OneChoiceAllocation
O(N) space, O(1) locality and Õ(log N) read efficiency
[Figure: keyword lists k1, k2, k3 placed into M = N / (log N log log N) bins, each of capacity 3 log N log log N]
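
A toy sketch of the one-choice idea, under my reading of [ANS+16] rather than anything stated on the slide: each keyword list picks one random starting bin and spreads its entries over consecutive bins, one entry per bin. The names, the tiny dataset and the bin count are hypothetical; a real scheme would also pad every bin to its capacity, encrypt it, and derive the starting bin pseudorandomly from the keyword.

```python
import random

def one_choice_alloc(lists, num_bins, rng):
    """Place each list in consecutive bins (wrapping around) from one random start."""
    bins = [[] for _ in range(num_bins)]
    starts = {}
    for keyword, doc_ids in lists.items():
        start = rng.randrange(num_bins)
        starts[keyword] = start
        for j, doc_id in enumerate(doc_ids):
            bins[(start + j) % num_bins].append((keyword, doc_id))
    return bins, starts

def search(bins, starts, keyword, list_len):
    """Read list_len consecutive bins in one pass, then drop other keywords' entries."""
    m = len(bins)
    read = [bins[(starts[keyword] + j) % m] for j in range(list_len)]
    return [doc_id for b in read for (kw, doc_id) in b if kw == keyword]

# Hypothetical dataset; every list is assumed to be no longer than num_bins.
lists = {"k1": [1, 4], "k2": [2, 4, 5], "k3": [1, 2, 3, 6]}
bins, starts = one_choice_alloc(lists, num_bins=4, rng=random.Random(0))
print(sorted(search(bins, starts, "k2", 3)))  # [2, 4, 5]
```

The read is a single contiguous range (up to wrap-around), which is where the O(1) locality comes from; the entries of other keywords in the same bins are the read-efficiency overhead.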

[ANS+16] – TwoChoiceAllocation
O(N) space, O(1) locality and Õ(log log N) read efficiency
** Assuming all the keyword lists in the dataset have size less than N^(1-1/log log N) **
[Figure: keyword lists k1, k2, k3 placed into M = N / (log log N log^2 log log N) bins, each of capacity c log log N log^2 log log N]
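
A simplified sketch of the two-choice idea, again under my reading of [ANS+16] and not taken from the slide: each list draws two candidate ranges of bins and goes to the one whose fullest bin is emptier. The assumptions here are that list sizes are powers of two dividing the number of bins, that candidate ranges are aligned to the list size, and that larger lists are placed first; all names and data are hypothetical.

```python
import random

def two_choice_alloc(lists, num_bins, rng):
    """Place each list in the less loaded of two random aligned ranges of bins."""
    bins = [[] for _ in range(num_bins)]
    starts = {}
    # Larger lists first (assumed ordering).
    for keyword, doc_ids in sorted(lists.items(), key=lambda kv: -len(kv[1])):
        size = len(doc_ids)
        num_ranges = num_bins // size
        candidates = [rng.randrange(num_ranges) * size for _ in range(2)]

        def max_load(start):
            return max(len(bins[start + j]) for j in range(size))

        start = min(candidates, key=max_load)
        starts[keyword] = start
        for j, doc_id in enumerate(doc_ids):
            bins[start + j].append((keyword, doc_id))
    return bins, starts

# Hypothetical dataset with power-of-two list sizes.
lists = {"k1": [1, 2, 3, 4], "k2": [2, 5], "k3": [4, 6], "k4": [7]}
bins, starts = two_choice_alloc(lists, num_bins=8, rng=random.Random(1))
print(max(len(b) for b in bins))  # maximum bin load for this toy run
```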

Our Approach
O(N) space, O(1) locality and O(log^γ N) read efficiency, for γ > 2/3
[Figure: read efficiency (y-axis) against keyword-list size (x-axis). y-axis levels: Õ(log log N), O(log^γ N), Õ(log N). x-axis marks: N^(1-1/log log N), N^(1-1/log^(1-γ) N), N/log^2 N, N/log^γ N, N. Reference curves: [ANS+16] OneChoiceAlloc and [ANS+16] TwoChoiceAlloc.]
Small: [ANS+16] TwoChoiceAlloc
Medium: focus of this talk!
Large: multi-level keyword-size compression
Huge: sequential scan
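
The size ranges can be sketched as a small classifier. This assumes that the last four x-axis marks are the range boundaries (small up to N^(1-1/log^(1-γ) N), medium up to N/log^2 N, large up to N/log^γ N, huge above that) and that logarithms are base 2; neither assumption is confirmed by the slide, and the function name is hypothetical.

```python
import math

def classify(size, n_total, gamma):
    """Map a keyword-list size to a size range (boundaries are assumed, see above)."""
    log_n = math.log2(n_total)
    small_max = n_total ** (1 - 1 / log_n ** (1 - gamma))
    medium_max = n_total / log_n ** 2
    large_max = n_total / log_n ** gamma
    if size <= small_max:
        return "small"
    if size <= medium_max:
        return "medium"
    if size <= large_max:
        return "large"
    return "huge"

N = 2 ** 20
for size in (1000, 2000, 50000, 500000):
    print(size, classify(size, N, gamma=0.75))
```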

Starting Point: Offline Two Choice Allocation (OTA) – [SEK 03]
OfflineTwoChoiceAlloc for m balls and n bins: MaxFlow( … )
[Figure: n bins]

Starting Point: Offline Two Choice Allocation (OTA) – [SEK 03]
OfflineTwoChoiceAlloc for m balls and n bins: max load <= ⌈m/n⌉ + 1 with probability at least 1 – O(1/n)
Key IDEA: One OTA per size and then Merge!!
[Figure: n bins]
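
A minimal sketch of offline two-choice allocation, assuming each ball comes with two candidate bins fixed in advance. Instead of a general max-flow solver it uses a simple augmenting-path search (the capacitated bipartite-matching special case of max flow); the function name and the example are hypothetical.

```python
import math

def offline_two_choice(choices, num_bins):
    """Assign every ball to one of its two bins with max load <= ceil(m/n) + 1.

    Returns a list mapping ball -> bin, or None if no such assignment exists.
    """
    m = len(choices)
    cap = math.ceil(m / num_bins) + 1
    bins = [set() for _ in range(num_bins)]
    assign = [None] * m

    def try_place(ball, visited):
        for b in choices[ball]:
            if b in visited:
                continue
            visited.add(b)
            if len(bins[b]) < cap:
                bins[b].add(ball)
                assign[ball] = b
                return True
            # Bin is full: try to move one occupant to its other candidate bin.
            for other in list(bins[b]):
                if try_place(other, visited):
                    bins[b].discard(other)
                    bins[b].add(ball)
                    assign[ball] = b
                    return True
        return False

    for ball in range(m):
        if not try_place(ball, set()):
            return None
    return assign

# Six balls, three bins: the load cap is ceil(6/3) + 1 = 3.
choices = [(0, 1)] * 4 + [(0, 2)] * 2
assignment = offline_two_choice(choices, 3)
print(assignment)
```

Because all choices are known up front, a full bin can be relieved by moving an earlier ball to its other candidate, which an online allocation cannot do.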

Our Approach: OTA per size + Merge
k_s: number of keyword lists with size s
b_s = M/s: number of superbuckets
Overflow probability = O(1/b_s). See Lemma 4 in our paper.
Σ_s (⌈k_s/b_s⌉ + 1) = O(N/M + log^γ N), which is O(log^γ N) for M = N/log^γ N
[Figure: arrays A_s, A_2s, A_4s, … over M buckets, merged into buckets of load O(log^γ N)]
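
The merged load bound can be checked on a toy instance. The sketch below simply evaluates the sum Σ_s (⌈k_s/b_s⌉ + 1) with b_s = M/s; the function name and the numbers are hypothetical, and sizes are assumed to be powers of two that divide M.

```python
def merged_bucket_load(lists_per_size, num_buckets):
    """Sum over sizes s of ceil(k_s / b_s) + 1, where b_s = M / s superbuckets."""
    total = 0
    for s, k_s in lists_per_size.items():
        b_s = num_buckets // s
        total += (k_s + b_s - 1) // b_s + 1  # integer ceiling, plus one
    return total

# 32 lists of size 1, 16 of size 2, 8 of size 4: N = 96 entries, M = 16 buckets.
lists_per_size = {1: 32, 2: 16, 4: 8}
print(merged_bucket_load(lists_per_size, 16))  # 9
```

Here N/M = 6 and there are three distinct sizes, so the bound N/M plus the number of sizes is met exactly.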

Our Approach: Accessing keyword lists
**Novel analysis for OTA**: the probability that more than O(log^2 N) lists of size s overflow is negligible! See Lemma 5 in our paper.
[Figure: arrays A_s, A_2s, A_4s, … over M buckets of load O(log^γ N), with stashes B_s, B_2s, B_4s, …; keyword list k3 is highlighted]

Our Approach: New locality-aware ORAM
O(n^(1/3) log^2 n) bandwidth and O(1) locality
We need an ORAM with the following properties:
1. O(1) locality: existing ORAMs with polylog n bandwidth have log n locality
2. Zero failure probability, since it will be applied on only log^2 n elements
3. o(√n) bandwidth, in order to achieve sublogarithmic read efficiency: o(√(log^2 n)) = o(log n)
[Figure: arrays A, B, C with size labels n + n^(2/3), n^(2/3) + n^(1/3) and n^(1/3); permutations π_a: [n_a] and π_b: [n_b]; labels "Square Root ORAM" and "Hierarchical ORAM"]
De-amortization techniques from Goodrich et al. [GMO+11]
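
As a worked step, the bandwidth bound can be instantiated for a stash of log^2 N elements. Here N denotes the dataset size and n the number of elements stored in the ORAM; reading this as the source of the condition γ > 2/3 is my interpretation, not a statement from the slide.

```latex
\[
  n = \log^{2} N
  \;\Longrightarrow\;
  n^{1/3}\log^{2} n
  = \bigl(\log^{2} N\bigr)^{1/3}\,\bigl(\log \log^{2} N\bigr)^{2}
  = \log^{2/3} N \cdot \bigl(2\log\log N\bigr)^{2}
  = 4\,\log^{2/3} N \,\log^{2}\log N ,
\]
\[
  4\,\log^{2/3} N \,\log^{2}\log N = o\bigl(\log^{\gamma} N\bigr)
  \quad \text{for every constant } \gamma > 2/3 .
\]
```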

Our Approach: OTA Stashes
[Figure: arrays A_s, A_2s, A_4s, …, A_max over M buckets, with stashes B_s, B_2s, B_4s, …, B_max]
Important: max ≤ N/log^2 N for maintaining O(N) index size

Conclusion – Future Work?
New ORAM: O(n^(1/3) log^2 n) bandwidth, O(1) locality
New probability bounds for OTA
Future work: locality-aware dynamic SE
Open question: read efficiency closer to the lower bound?
[Figure: read efficiency against keyword-list size. y-axis levels: Õ(log log N), O(log^γ N), Õ(log N). x-axis marks: N^(1-1/log log N), N^(1-1/log^(1-γ) N), N/log^2 N, N/log^γ N, N. Reference curves: [ANS+16] OneChoiceAlloc and [ANS+16] TwoChoiceAlloc. Size ranges: Small; Medium (OTA-based approach); Large (multi-level keyword-size compression); Huge (sequential scan).]

Thank You!
https://eprint.iacr.org/2017/749

[ASS 18] in CRYPTO
O(N) space, O(1) locality and ω(1)·ε(n)^(-1) + O(log log log N) read efficiency, where n = N^(1-ε(n))
[Figure: read efficiency against keyword-list size. y-axis levels: O(log log log N), Õ(log log N), O(log^γ N), O(log N / log log N), O(log N). x-axis marks: N^(1-1/log log N), N^(1-1/log^(1-γ) N), N/log^3 N, N/log^2 N, N/log^γ N, N. Reference curves: [ANS+16] OneChoiceAlloc and [ANS+16] TwoChoiceAlloc. Size ranges: Small, Medium, Large, Huge.]

Studying locality for HDD
Access cost = (seek time) + (rotational delay) + (transfer time)
Random I/O cost: ~4-12 ms
Sequential I/O cost: ~10 μs for 1 byte
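
A rough cost model makes the trade-off concrete. This is a sketch under stated assumptions: each non-contiguous read pays one random I/O (8 ms here, a value picked from the 4-12 ms range on the slide), each byte read then pays the sequential cost quoted on the slide, and the function name and workloads are hypothetical.

```python
def hdd_query_cost_us(locality, bytes_read, random_io_us=8000, seq_us_per_byte=10):
    """Estimated query cost in microseconds: seeks plus sequential transfer."""
    return locality * random_io_us + bytes_read * seq_us_per_byte

# Three scattered reads of one byte each versus one scan over 12 bytes.
print(hdd_query_cost_us(locality=3, bytes_read=3))   # 24030
print(hdd_query_cost_us(locality=1, bytes_read=12))  # 8120
```

Under these numbers the seek cost dominates, so reading a few extra locations sequentially is cheaper than adding another random access.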

Studying locality for SSD
Samsung 960 Pro M.2 NVMe SSD
Access pattern                           Read             Write            Locality
Sequential transfer, page size = 2 MB    2222.93 MB/sec   1786.72 MB/sec   High
Random transfer, page size = 2 MB        1339.76 MB/sec   1237.57 MB/sec
Random transfer, page size = 2 KB        34.30 MB/sec     150.83 MB/sec    Low
More detailed analysis: http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review

Studying locality for RAM
[Figure: the client sends the search query for "keyword" to the untrusted cloud as token Tw. Two index layouts are shown, one keyed by tokens Tw1, Tw2, Tw3, both storing document ids id1 to id6.]