Storage Encryption: A Cryptographer’s View
Shai Halevi, IBM Research
11/2/2020, SCN 2008

Motivation
- “You’re working on storage encryption? It must be the most boring thing in the world…” (Anonymous)
- Encryption is the most basic task in crypto
  - We know what secure encryption means
    - CCA-security, authenticated encryption, …
  - We have provably-secure schemes
    - Even efficient ones
- What is left to research?

Cryptographically interesting problems with storage encryption
- Choosing the encryption scheme
  - “Transparent” vs. authenticated encryption
- Managing keys and nonces
  - Avoiding nonce re-use, wrapping keys, …
- Outside the model
  - Circular encryption

Typical Storage Architecture
[Figure: Clients, Network, Storage devices]
- My focus: encryption at the network/devices
  - Typical threat model: data belongs to the organization, encrypted to prevent unauthorized disclosure / modification
  - E.g., encrypting tapes, lest they fall off the truck
    - Several such high-profile incidents in 2005

Two Types of Encryption
- “Transparent” (length-preserving)
  - Used to add encryption to existing data-paths
  - E.g., software hard-disk encryption, or a bump-in-the-wire encryption box
- Authenticated (length-increasing)
  - Used when the “storage medium” allows records of flexible length
  - E.g., tape encryption, client-side encryption, etc.

Transparent encryption
[Figure: Clients, Encryption Module with Key Mngmnt, and Storage, marked (partially) trusted / untrusted]
- Storage units (“sectors”): just a pile of bits, no semantics
- Inputs: keys, plaintext, location in storage
- Output: ciphertext

Inherent limitations
- Random access
  - Each “sector” encrypted separately
  - Can mix and match
    - C1 C2 … Cm is encryption of P1 P2 … Pm
    - C1’ C2’ … Cm’ is encryption of P1’ P2’ … Pm’
    - ⇒ C1 C2’ … Cm is encryption of P1 P2’ … Pm
- Length preserving ⇒ Deterministic
  - When re-encrypting a file, we can see what sectors have changed
- Length preserving ⇒ No authentication
  - Any ciphertext sector is decrypted as “something”

The best we can do: Tweakable Encryption [LRW02]
- Enciphering/deciphering routines: ciphertext = E(key, tweak, plaintext), plaintext = D(key, tweak, ciphertext)
  - ciphertext-length = plaintext-length
  - key is fixed and secret
  - tweak is arbitrary (even adversarially chosen)
- Should look like
  - A block cipher with block-size = plaintext-length
  - Different tweaks look like independent keys

Narrow vs. Wide Blocks
- Narrow-block
  - Each 16-byte block is encrypted separately (think ECB)
- Wide-block
  - The entire sector is encrypted together
  - A change anywhere affects the entire ciphertext
- Quantitative, not qualitative difference
  - They are the same if you use 16-byte sectors

Some Wide-Block Modes

CMC [HR03]
[Figure: CMC applied to blocks P1..P4 with tweak T; intermediate values PPP1..PPP4 and CCC1..CCC4; outputs C1..C4]
- EK = AES with key K
- T = tweak
- M = 2(PPP1 ⊕ PPP4) = 2(CCC1 ⊕ CCC4)
  - Mult. in GF(2^128)

EME [HR04]
[Figure: EME applied to blocks P1..P4 with tweak T; masks L, 2L, 4L, 8L; intermediate values PPP1..PPP4, SPPP, MP, MC, M, 2M, 4M, 8M, SCCC, CCC1..CCC4; outputs C1..C4]
- EK = AES with key K
- L = another key
- T = tweak
- M = MP ⊕ MC
- EME* [H04] is an extension for sectors longer than 2KB

XCB [MF04], HCTR [WFW05], HCH [CS06]
[Figure: three constructions on inputs A, B and tweak T (HCH uses T, len), each built from Hash, E and CTR components; HCH also uses a PRF]

Naor-Reingold Modes: TET [H07], HEH [S07]
[Figure: p1 p2 … pm-1 pm → invertible “universal hashing” → ECB encryption → invertible “universal hashing” → c1 c2 … cm-1 cm]
- “Universal hashing” ensures no collisions in the input to the ECB layer

Microsoft BitLocker [F06]
[Figure: ad-hoc “mixing” layer together with EK’ and EK blocks]
- Not quite an AES mode of operation
- “Block-cipher-like” mixing
  - Detailed analysis of resistance to attacks, but no reduction to the security of AES

Some Narrow-Block Modes

LRW Mode [LRW02]
[Figure: P1 is masked with L⊗T, encrypted with EK and masked again to give C1; P2 likewise with L⊗(T+1) to give C2]
- EK = AES with key K
- L = another key
- L⊗T in GF(2^n)
- A handy optimization:
  - Think about using tweaks T, T+1, T+2, …
  - Once L⊗T is computed, easy to compute L⊗(T+1), L⊗(T+2), …
- IEEE 1619 intended to standardize this mode

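One way the “handy optimization” could be realized is sketched below in Python. This is only an illustration under stated assumptions: tweaks and masks are modeled as Python integers, GF(2^128) is taken with the reduction polynomial x^128 + x^7 + x^2 + x + 1, and the names gf_mul and lrw_masks are hypothetical (they do not come from the slides or from IEEE 1619). The sketch relies on the fact that T xor (T+1) always has the form 2^j - 1, so 128 precomputed products of L are enough to step from one mask to the next with a single XOR.

```python
MASK128 = (1 << 128) - 1
POLY = 0x87  # x^7 + x^2 + x + 1, the low part of the assumed reduction polynomial


def gf_double(a):
    """Multiply a by x in GF(2^128)."""
    a <<= 1
    if a >> 128:
        a = (a & MASK128) ^ POLY
    return a


def gf_mul(a, b):
    """Shift-and-add multiplication in GF(2^128)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = gf_double(a)
        b >>= 1
    return result


def trailing_ones(t):
    """Number of consecutive 1 bits at the low end of t."""
    count = 0
    while t & 1:
        t >>= 1
        count += 1
    return count


def lrw_masks(L, T, count):
    """Return L*T, L*(T+1), ..., using one XOR per step after the first.

    Assumes T + count stays below 2^128.
    """
    # table[j] = L * (2^(j+1) - 1); note T xor (T+1) = 2^(j+1) - 1
    # where j is the number of trailing 1 bits of T.
    table = [gf_mul(L, (1 << (j + 1)) - 1) for j in range(128)]
    mask = gf_mul(L, T)
    masks = []
    for i in range(count):
        masks.append(mask)
        mask ^= table[trailing_ones(T + i)]
    return masks
```

The table costs 128 field multiplications once per key L; after that each sequential tweak costs a table lookup and an XOR instead of a full multiplication.
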
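What’s Wrong with LRW?
- Fails when “encrypting its own key”
[Figure: block 1 has plaintext L and mask L⊗T; block 2 has plaintext 0 and mask L⊗(T+1); in both blocks the input to EK is L⊗(T+1), so both get the same output X]
  - C1 = X + L⊗T
  - C2 = X + L⊗(T+1)
- Extract L = C1 - C2
  - Here + and - are both XOR in GF(2^n), and T is even, so T+1 differs from T only in the last bit
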
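The attack can be replayed in a few lines of Python. This is a sketch under loud assumptions: toy_block_cipher is a 4-round Feistel network built from SHA-256 that merely stands in for AES (it is not secure and not from the slides), blocks are modeled as 128-bit integers, GF(2^128) uses the polynomial x^128 + x^7 + x^2 + x + 1, and all function names are hypothetical. The attack itself does not depend on which block cipher is used.

```python
import hashlib

MASK128 = (1 << 128) - 1
MASK64 = (1 << 64) - 1


def gf_mul(a, b):
    """Shift-and-add multiplication in GF(2^128) (assumed polynomial 0x87)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> 128:
            a = (a & MASK128) ^ 0x87
        b >>= 1
    return result


def toy_block_cipher(key, block):
    """Toy 128-bit permutation (4-round Feistel); a stand-in for AES."""
    left, right = block >> 64, block & MASK64
    for i in range(4):
        digest = hashlib.sha256(key + bytes([i]) + right.to_bytes(8, "big")).digest()
        left, right = right, left ^ int.from_bytes(digest[:8], "big")
    return (left << 64) | right


def lrw_encrypt_block(K, L, T, P):
    """C = E_K(P xor L*T) xor L*T."""
    mask = gf_mul(L, T)
    return toy_block_cipher(K, P ^ mask) ^ mask


def recover_L(K, L, T):
    """Simulate a sector whose first block holds L and whose second block is 0."""
    assert T % 2 == 0, "the sketch assumes an even tweak, so T+1 = T xor 1"
    c1 = lrw_encrypt_block(K, L, T, L)      # E_K input: L xor L*T = L*(T+1)
    c2 = lrw_encrypt_block(K, L, T + 1, 0)  # E_K input: 0 xor L*(T+1)
    return c1 ^ c2                          # the X terms cancel, leaving L
```

An observer who sees only c1 and c2 computes the same XOR, which is why storing the key material L on the encrypted volume is fatal here.
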
Is This a Problem in Practice?
- Lively argument in the 1619 mailing list
  - “No one in their right mind will ever do that”
- Turns out that “encrypting own key” can happen, e.g., in Windows Vista™
  - A driver does sector-level encryption
  - On hibernate, driver itself stored to disk
- So a different mode (based on Rogaway’s XEX) was chosen for the standard

XTS Mode [Ro04]
[Figure: T* = EK’(T); blocks P1, P2, P3 are masked with T*, 2T*, 4T*, encrypted with EK and masked again to give C1, C2, C3]
- Tweak is (T, i)
  - T* = EK’(T), Ti* = 2^i ⊗ T*
  - C = Ti* ⊕ EK(P ⊕ Ti*)
- Similar handy optimization
  - (T, 0), (T, 1), (T, 2), … for sequential blocks
- About as efficient as LRW
- The attack from before does not work
  - How do we know that there aren’t other attacks in this vein?
  - We’ll talk later about circular security

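The formulas T* = EK’(T), Ti* = 2^i ⊗ T* and C = Ti* ⊕ EK(P ⊕ Ti*) can be sketched as follows. Assumptions, stated plainly: the block cipher is a toy Feistel stand-in for AES, blocks and tweaks are 128-bit Python integers (the byte-ordering conventions of the actual IEEE 1619 standard are not modeled), the field polynomial is x^128 + x^7 + x^2 + x + 1, and every function name is hypothetical.

```python
import hashlib

MASK128 = (1 << 128) - 1
MASK64 = (1 << 64) - 1


def _round(key, i, half):
    digest = hashlib.sha256(key + bytes([i]) + half.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def toy_encrypt(key, block):
    """Toy 128-bit Feistel permutation; a stand-in for AES, not secure."""
    left, right = block >> 64, block & MASK64
    for i in range(4):
        left, right = right, left ^ _round(key, i, right)
    return (left << 64) | right


def toy_decrypt(key, block):
    """Inverse of toy_encrypt: undo the rounds in reverse order."""
    left, right = block >> 64, block & MASK64
    for i in reversed(range(4)):
        left, right = right ^ _round(key, i, left), left
    return (left << 64) | right


def gf_double(a):
    """Multiply by 2 (that is, by x) in GF(2^128)."""
    a <<= 1
    if a >> 128:
        a = (a & MASK128) ^ 0x87
    return a


def xts_encrypt_sector(K, K2, T, blocks):
    mask = toy_encrypt(K2, T)              # T* = E_K'(T)
    out = []
    for p in blocks:
        out.append(toy_encrypt(K, p ^ mask) ^ mask)
        mask = gf_double(mask)             # T*_{i+1} = 2 * T*_i
    return out


def xts_decrypt_sector(K, K2, T, blocks):
    mask = toy_encrypt(K2, T)
    out = []
    for c in blocks:
        out.append(toy_decrypt(K, c ^ mask) ^ mask)
        mask = gf_double(mask)
    return out
```

Stepping from one block to the next costs one doubling, which is the “similar handy optimization” for sequential blocks.
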
Remaining problems
- Narrow vs. wide-block in practice
  - Wide-block is 2-3 times more expensive
  - Limits attacker to more coarse granularity
    - Traffic-analysis/malleability of whole sectors, rather than each 16-byte block
  - Does this add security in practice?
- Security beyond the birthday bound
  - With big disk-arrays in the petabytes, q^2/2^128 may get too close for comfort

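To put numbers on q^2/2^128, here is a small worked calculation in Python. It assumes that q counts 16-byte cipher blocks and that “petabyte” means 2^50 bytes; the function name is hypothetical.

```python
import math


def birthday_bound_log2(total_bytes, block_bytes=16, block_bits=128):
    """Return log2 of q^2 / 2^block_bits, where q = number of blocks stored."""
    q = total_bytes // block_bytes
    return 2 * math.log2(q) - block_bits


one_pib = birthday_bound_log2(2**50)   # 2^46 blocks
one_eib = birthday_bound_log2(2**60)   # 2^56 blocks
```

Under these assumptions one petabyte gives a bound of 2^-36 and one exabyte gives 2^-16, so every factor of 1024 in capacity costs 20 bits of margin.
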
Authenticated Encryption
- Each record is stored with a nonce (IV), and an authentication tag
  - EncK(P) = <IV, C, tag>
  - DecK(IV, C, tag) = P / fail
- IVs must be “fresh”
  - Encrypting the same plaintext twice results in different ciphertexts

Many “standard” Encryption Modes
- Two-Pass Modes
  - Encrypt-then-authenticate (e.g., GCM [MV05])
    - Choose IV, C = EK(IV, P), tag = MACK’(IV, C)
    - E: AES-based encryption, MAC: HMAC or others
  - Authenticate-then-encrypt (e.g., CCM [WHF03])
    - Choose IV, t = MACK’(IV, P), C = EK(IV, P, t)
- One-Pass Modes (IAPM [J01], OCB [R01], …)
  - Compute CTXT & MAC together, more efficient
  - None is used in practice today
  - Due to patent issues

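The encrypt-then-authenticate recipe (choose IV, C = EK(IV, P), tag = MACK’(IV, C)) might look roughly like this in Python. It is a toy sketch, not GCM: the keystream is derived from HMAC-SHA256 in counter mode purely so that the example runs with the standard library, the tag is a plain HMAC, and all names and sizes are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def _keystream(key, iv, length):
    """Toy CTR-style keystream from HMAC-SHA256 (a stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt(enc_key, mac_key, plaintext):
    """Enc_K(P) = <IV, C, tag>, with the tag computed over IV and C."""
    iv = os.urandom(12)
    ciphertext = _xor(plaintext, _keystream(enc_key, iv, len(plaintext)))
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv, ciphertext, tag


def decrypt(enc_key, mac_key, iv, ciphertext, tag):
    """Dec_K(IV, C, tag) = P, or None for 'fail'."""
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return _xor(ciphertext, _keystream(enc_key, iv, len(ciphertext)))
```

Checking the tag before touching the ciphertext is the point of this ordering: a modified record is rejected without any decryption taking place.
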
Whence Cometh thy Nonce?
- Re-using the same (key, IV) pair to encrypt different records is a security violation
  - Especially in schemes based on CTR mode
    - Re-using (key, IV) is the same as two-time-pad
  - Especially² in GCM mode
    - Re-using (key, IV) may leak the authentication key
- Avoiding nonce re-use may be tricky

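The two-time-pad effect is easy to reproduce. The sketch below uses a toy counter-mode keystream built from SHA-256 (an assumption made so the example needs only the standard library; it is not AES-CTR) and hypothetical sample plaintexts. With the same (key, IV), the keystream cancels and the XOR of the two ciphertexts equals the XOR of the two plaintexts.

```python
import hashlib


def keystream(key, iv, length):
    """Toy CTR-style keystream; a stand-in for a real CTR mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(key, iv, data):
    return xor(data, keystream(key, iv, len(data)))


key, iv = b"k" * 16, b"n" * 12       # the same (key, IV) is used twice
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = ctr_encrypt(key, iv, p1)
c2 = ctr_encrypt(key, iv, p2)
leak = xor(c1, c2)                    # equals p1 xor p2, no key needed
```

Anyone who knows or guesses one of the plaintexts can then read the other directly from the leak.
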
Common Tape-Encryption Setting
[Figure: Clients send Data to several Encryption Modules, which receive Keys from Key Mngmnt and write to tapes]
- Same key can be served to several encryption modules
- They must avoid using the same (key, IV) pair
- Without much coordination

Random Nonces?
- Some modes have 96-bit nonces (GCM)
  - Is this enough?
- How many times can the same key be served?
- What if you use just one key for all your corporate tapes?

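A rough way to approach “is this enough?” is the standard birthday approximation: with q records encrypted under one key using uniformly random n-bit nonces, the chance of some nonce repeating is about q^2 / 2^(n+1). The Python sketch below inverts that formula; the function name and the target probabilities are illustrative assumptions, not figures from the talk.

```python
def max_records_log2(nonce_bits, target_log2):
    """log2 of the record count q at which q^2 / 2^(nonce_bits + 1)
    reaches the collision probability 2^target_log2."""
    return (nonce_bits + 1 + target_log2) / 2


# With 96-bit random nonces and one key shared across all modules:
budget_for_2_minus_32 = max_records_log2(96, -32)   # about 2^32.5 records
budget_for_2_minus_64 = max_records_log2(96, -64)   # about 2^16.5 records
```

The budget applies to the total number of records encrypted under the key by all modules together, which is why serving one key to many modules, or to every corporate tape, eats into it quickly.
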
Systematic Nonces?
- E.g., use the module serial # in the nonce
  - Reduces the IV space further
  - Sensitive to mis-configuration
  - Module must remember “the current nonce”
    - Through reset, power-failures, crashes, …
- Using encryption modules from several different manufacturers?
  - An organization may have two drives from IBM, one from HP, one from SUN, etc.

Better: Wrapped Keys
- The served key (from key-management) is only used as a key-encrypting-key (KEK)
  - Module generates a “fresh” data key (DK)
  - Use KEK to encrypt DK, store ciphertext on tape
  - Use DK to encrypt data
- David Wheeler: All problems in computer science can be solved by another level of indirection…
- … but that usually will create another problem.

How to Wrap Keys?
- Using standard encryption (symmetric/pkey)
  - Need to worry again about fresh IVs / randomness
- Using “deterministic encryption”
  - E.g., ANS X9.102 draft standard
- [RS06]: Deterministic Authenticated Encryption
  - Essentially “the strongest security possible with deterministic encryption”
    - Similar to strong PRP, but need not be a bijection
  - SIV mode: IV = PRFk1(DK), C = CTRk2(IV, DK)

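The SIV formula IV = PRFk1(DK), C = CTRk2(IV, DK) can be sketched as below. Assumptions: HMAC-SHA256 (truncated to 16 bytes) plays the PRF, a toy HMAC-based counter keystream plays CTR mode, and the function names are hypothetical. This illustrates the shape of the construction only; it is not the SIV mode as specified in [RS06].

```python
import hashlib
import hmac


def prf(key, data):
    """PRF_k1: HMAC-SHA256 truncated to 16 bytes (an assumed instantiation)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:16]


def ctr(key, iv, data):
    """Toy CTR_k2: XOR data with an HMAC-derived keystream seeded by iv."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def wrap(k1, k2, dk):
    """Deterministic wrap: the IV is derived from the data key itself."""
    iv = prf(k1, dk)
    return iv, ctr(k2, iv, dk)


def unwrap(k1, k2, iv, c):
    """Decrypt, then recompute the IV; a mismatch means the wrap was modified."""
    dk = ctr(k2, iv, c)
    if not hmac.compare_digest(prf(k1, dk), iv):
        return None
    return dk
```

Because the IV is a function of DK, no fresh randomness or nonce bookkeeping is needed at wrap time, and the IV doubles as the authentication tag when unwrapping.
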
More on Key-Wrapping [GH08]
- Some “secure schemes” are not DAE
  - DAE an overkill for wrapping encryption keys
- Secure key-wrap is just like secure encryption, except the plaintext is random
  - Rather than adversarially chosen
- Hash-then-Encrypt: “SIV-like” constructions
  - IV = Hash(DK), C = ENC(IV, DK)
  - Hash either keyed or not
  - ENC any “standard encryption mode”

Hash-then-Encrypt
[Table: Hash types (XOR, Linear, Universal, 2nd preimage) against Encrypt modes (CTR, ECB, CBC, Masked ECB/CBC, XEX); apart from isolated “*” and “?” marks, the cell entries did not survive extraction]

Remaining Problems
- Authenticated Encryption does not solve:
  - “Replay attacks”: replace current record on medium with a previous one
  - Re-ordering of records
- No good crypto solutions to either problem
  - Merkle trees work, but they are too expensive
  - Not clear that one can do better [DNRV08]

Back to “Key-Dependent Security”
- Adversary sees encryptions of the secret key
  - Maybe even some functions of this key
- How to define security in this case?
- How to achieve it?
- Aside:
  - The definitional issue was noted already in [GM84], but explicitly scoped out
  - [CL01] had a “key-dependent-secure” public-key encryption in the ROM

[BRS01] Definitions
- Start from the “usual notions”
[Figure: the attacker sends queries q1, q2, … to a party holding k and receives Answerk(q1), Answerk(q2), …; it also sends a function g and receives Answerk(g(k))]
- Let the attacker specify a function of the key

[BRS01] Construction
- Textbook scheme: Enck(m) = <r, fk(r) ⊕ m>
- With fk(x) = H(k|x) and H a random oracle, this is “key-dependent-secure”
- As usual: in lieu of a true random oracle, we can use, e.g., SHA1
  - fk(x) = SHA1-Compression(IV=k, M=x)
  - This should be safe…

[HK07] Insecurity in Standard Model
- SHA1 follows the Davies-Meyer approach
  - Roughly Compression(IV, M) = EM(IV) ⊕ IV
  - E is a “block cipher” (easily invertible given M)
  - SHA1 actually uses + rather than ⊕
    - But we will ignore that fact
- We get Enck(m) = <r, Er(k) ⊕ k ⊕ m>
  - In particular Enck(k) = <r, Er(k) ⊕ k ⊕ k> = <r, Er(k)>
  - Given <r, c> recover k = Er^-1(c)

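The key-recovery step can be replayed with a toy cipher. In the Python sketch below, E is a 4-round Feistel permutation keyed by r that stands in for the block cipher inside the compression function (it is not SHA1’s actual cipher), f follows the Davies-Meyer shape with XOR as on the slide, keys and messages are 128-bit integers, and all names are hypothetical.

```python
import hashlib
import os

MASK64 = (1 << 64) - 1


def _round(key, i, half):
    digest = hashlib.sha256(key + bytes([i]) + half.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def E(key, block):
    """Toy 128-bit block cipher keyed by `key` (bytes); easily invertible."""
    left, right = block >> 64, block & MASK64
    for i in range(4):
        left, right = right, left ^ _round(key, i, right)
    return (left << 64) | right


def E_inv(key, block):
    left, right = block >> 64, block & MASK64
    for i in reversed(range(4)):
        left, right = right ^ _round(key, i, left), left
    return (left << 64) | right


def f(k, r):
    """Davies-Meyer shape: f_k(r) = Compression(IV=k, M=r) = E_r(k) xor k."""
    return E(r, k) ^ k


def enc(k, m):
    """Enc_k(m) = <r, f_k(r) xor m>."""
    r = os.urandom(16)
    return r, f(k, r) ^ m


def dec(k, r, c):
    return f(k, r) ^ c


def recover_key(r, c):
    """Given an encryption <r, c> of k itself, c = E_r(k), so invert E_r."""
    return E_inv(r, c)
```

The scheme still decrypts ordinary messages correctly; it only collapses when the message equals the key, which is exactly the case outside the usual security model.
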
Key-dependent security w/o ROM?
- [HH’08]: Unlikely from “general assumptions”
- [BHHO’08]: But possible from DDH
- Think ElGamal Encryption:
  - pk = (v, w = v^a), sk = a, Encpk(m) = <v^r, m·w^r>
  - So Encpk(sk) = <v^r, a·v^(ar)>
    - Security unlikely to follow from DDH
  - What if we use sk = u^a (u ≠ v)?
    - We get security from DDH, but cannot decrypt…

Decrypting with “sk in the exponent”?
- Use single bits in the exponent for secret key
  - Can recover b from v^b
- pk = (v1, v2, …, vm, w = Π vi^bi)
- sk = (u^b1, u^b2, …, u^bm)
- Encpk(m) = (v1^r, v2^r, …, vm^r, m·w^r)
  - So Encpk(u^bi) = (v1^r, v2^r, …, vm^r, u^bi·w^r)
- Thm: This is CPA-secure against encryptions of any affine function of the secret key
  - [CCS08] build on this to get CCA-security

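The mechanics of this scheme (not its security) can be checked with a toy group. The Python sketch below follows the formulas on the slide, with loud assumptions: the group is the order-1019 subgroup of squares modulo the tiny safe prime 2039, u = 4 is a fixed element of that subgroup, and the function names are hypothetical. A group this small offers no security at all; the point is only that key-dependent messages u^bi encrypt and decrypt correctly. It needs Python 3.8 or later for pow(x, -1, p).

```python
import secrets

P = 2039   # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 1019   # prime order of the subgroup of squares modulo P
U = 4      # fixed non-identity square, used for "sk in the exponent"


def random_element():
    """Random non-identity element of the order-Q subgroup."""
    return pow(secrets.randbelow(P - 3) + 2, 2, P)


def keygen(m):
    bits = [secrets.randbelow(2) for _ in range(m)]
    vs = [random_element() for _ in range(m)]
    w = 1
    for v, b in zip(vs, bits):
        w = w * pow(v, b, P) % P            # w = product of v_i^b_i
    sk = [pow(U, b, P) for b in bits]       # sk = (u^b_1, ..., u^b_m)
    return (vs, w), sk


def encrypt(pk, msg):
    """Enc_pk(m) = (v_1^r, ..., v_m^r, m * w^r) for a group element m."""
    vs, w = pk
    r = secrets.randbelow(Q)
    return [pow(v, r, P) for v in vs], msg * pow(w, r, P) % P


def decrypt(sk, ciphertext):
    cs, c = ciphertext
    bits = [0 if s == 1 else 1 for s in sk]  # recover b_i from u^b_i
    mask = 1
    for ci, b in zip(cs, bits):
        mask = mask * pow(ci, b, P) % P      # product of (v_i^r)^b_i = w^r
    return c * pow(mask, -1, P) % P
```

Decryption works because each secret-key component hides a single bit, so the bit can be read off from u^bi directly, which is what the ElGamal variant with sk = u^a could not do.
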
Morals to take away
- Applying crypto to real-world systems is fun
  - Can even find interesting questions to look at
- 1st law of commercial crypto: “cryptosystems will be (ab)used beyond their security model”
- We still do not know everything there is to know about encryption
- Storage encryption is (a little) special
  - Mostly: harder to get synchronization between encryptor and decryptor

Thank you