Reducing Sandbox Overhead in WebAssembly
Dimitar Bounov

What is WebAssembly (WASM)?
- A new low-level language for the web (coming Q1 2017)
- Collaborative effort between Mozilla, Google, Microsoft, and Apple
- Efficient target for C, C++ (upcoming: Java, Rust…)

In Firefox Aurora now

Why another language?
- Need for dynamic content on the web (long live Flash…)
- NaCl (2009-now): LLVM IR + SFI
- asm.js (2012-now): restricted JS subset
And the winner is...
[Chart not preserved; source: Facebook (10/2016)]

Why not asm.js?
- Complex JS semantics
    int fib(int n) {
      if (n < 3) return 1;
      return fib(n-1) + fib(n-2);
    }

Why not asm.js?
- Complex JS semantics
    function fib(n) {
      n = n|0;              // cast to int
      if (n >>> 0 < 3)      // cast to positive int
        return 1|0;
      return (fib((n-1)|0) + fib((n-2)|0))|0;
    }

Why not asm.js?
- Complex JS semantics
- Large code size
- Slower compilation

WASM Design
- RISC-like virtual ISA (<255 opcodes)
- Structured control flow (blocks, if/else, loops, break)
- Separate stack and heap (enforced)
  - stack non-aliasable
  - stack frame size statically known
- Deterministic* semantics across x86, x64, ARM

WASM Example: Fibonacci
    (func $fib (param $0 i32) (result i32)   ;; function signature
      (local $1 i32)                         ;; locals
      (if i32 (i32.lt_s $0 3)                ;; if is an expression
        1
        (block                               ;; block is an expression
          (set_local $1 (call $fib (i32.add $0 -1)))
          (i32.add $1 (call $fib (i32.add $0 -2))))))

WASM Sandbox
Control Flow
- No computed jump instruction
- Indirect function calls go through a lookup table
- Returns are safe due to the separate stack
Memory Accesses
- Heap loads/stores are bounds checked (BC-ed)

WASM Sandbox
Control Flow
- No computed jump instruction
- Indirect function calls go through a lookup table
- Returns are safe due to the separate stack
Memory Accesses
- Heap loads/stores are bounds checked (BC-ed)
- Safe stack makes many BCs redundant
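
The indirect-call rule on this slide can be sketched in C roughly as follows. This is an illustration only, not the engine's implementation: `table`, `TABLE_SIZE`, `call_indirect`, and the single `int (*)(int)` signature are hypothetical names chosen for the example. The point is that sandboxed code only ever holds a table index, never a raw code address.

```c
#include <stdint.h>

typedef int (*func_t)(int);

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }

/* Hypothetical function table: the sandboxed program sees indices only. */
#define TABLE_SIZE 2u
static const func_t table[TABLE_SIZE] = { inc, dbl };

/* Returns 0 and stores the callee's result on success,
 * or -1 (standing in for a trap) when the index is out of range. */
int call_indirect(uint32_t index, int arg, int *result) {
    if (index >= TABLE_SIZE) return -1;   /* check the index, not an address */
    *result = table[index](arg);
    return 0;
}
```

For example, `call_indirect(0, 41, &r)` succeeds and stores 42 in `r`, while `call_indirect(7, 1, &r)` is rejected before any jump happens.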

Memory Accesses
    uint32_t addr = …
    int x = *(base + addr);
[Diagram: HEAP spans base to base+L; the access lands at base + addr]

Bounds Checks
    uint32_t addr = …
    assert(0 <= addr <= L);   // BC
    int x = *(base + addr);
    . . .
    assert(0 <= addr <= L);   // BC  Redundant!
    int y = *(base + addr);   // Same Address
- base, addr are safe due to the separate stack
- 10% of dynamic BCs on AngryBots
*Global Value Numbering could handle this too.
[Diagram: HEAP spans base to base+L; the access lands at base + addr]
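
A minimal C sketch of the same-address case above, under stated assumptions: `HEAP_LEN` stands in for L, the check accounts for a 4-byte access (the slide's `assert` leaves the access size implicit), and `checks_run`, `bc`, `load32`, and both `sum_twice_*` functions are hypothetical names introduced only for the illustration.

```c
#include <stdint.h>
#include <string.h>

#define HEAP_LEN 64u               /* L on the slide; the value is arbitrary */
static uint8_t heap[HEAP_LEN];     /* "base" is &heap[0] */
static unsigned checks_run;        /* counts executed bounds checks */

/* Bounds check for a 4-byte access; returns 1 when in bounds. */
static int bc(uint32_t addr) {
    checks_run++;
    return addr <= HEAP_LEN - 4u;
}

static int32_t load32(uint32_t addr) {
    int32_t v;
    memcpy(&v, heap + addr, sizeof v);
    return v;
}

/* One check per access. Returns 0 on success, -1 on a failed check. */
int sum_twice_naive(uint32_t addr, int32_t *out) {
    if (!bc(addr)) return -1;
    int32_t x = load32(addr);
    if (!bc(addr)) return -1;      /* redundant: addr has not changed */
    int32_t y = load32(addr);
    *out = x + y;
    return 0;
}

/* Second check removed: addr lives in a local that heap stores cannot alias. */
int sum_twice_opt(uint32_t addr, int32_t *out) {
    if (!bc(addr)) return -1;
    int32_t x = load32(addr);
    int32_t y = load32(addr);
    *out = x + y;
    return 0;
}
```

Both versions return the same result for every input; the optimized one simply executes one check instead of two.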

Bounds Checks - Constants
    assert(0 <= 0xABCD <= L);   // BC  Redundant!
    int x = *(0xABCD);          // Constant Address
- ~22% of dynamic BCs on AngryBots
*Global Value Numbering should have handled this.

Bounds Checks - Constant Offsets
    uint32_t addr = …
    assert(0 <= addr+8 <= L);     // BC
    int x = *(base + addr + 8);   // Constant Offset
    . . .
    assert(0 <= addr+4 <= L);     // BC  Redundant*!
    int y = *(base + addr + 4);   // Smaller Offset
- ~35% of dynamic BCs on AngryBots
*ABCD (Bodik et al.) can handle this

Bounds Checks - Constant Offsets 2
    uint32_t addr = …
    assert(0 <= addr+4 <= L);     // BC
    int x = *(base + addr + 4);   // Constant Offset
    . . .
    assert(0 <= addr+8 <= L);     // BC  Not Quite Redundant :(
    int y = *(base + addr + 8);   // Bigger Offset

Bounds Checks + Slop
    uint32_t addr = …
    assert(0 <= addr+4 <= L);     // BC
    int x = *(base + addr + 4);   // Constant Offset
    . . .
    assert(0 <= addr+8 <= L);     // BC
    int y = *(base + addr + 8);   // Bigger Offset
[Diagram: HEAP spans base to base+L, followed by SLOP up to base+L+S; accesses land at base + addr+4 and base + addr+8]

Bounds Checks + Slop
    uint32_t addr = …
    assert(0 <= addr <= L);       // BC
    int x = *(base + addr + 4);   // Constant Offset
    . . .
    assert(0 <= addr <= L);       // BC  Redundant!
    int y = *(base + addr + 8);   // Bigger Offset
- ~50% of dynamic BCs on AngryBots
[Diagram: HEAP spans base to base+L, followed by SLOP up to base+L+S; accesses land at base + addr+4 and base + addr+8]
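
The slop idea can be sketched as below: check `addr` alone, and let a region of S extra bytes behind the heap absorb any small constant offset. `HEAP_LEN`, `SLOP`, `mem`, and `load_pair` are hypothetical names and values for the illustration. One labeled assumption: here the slop is ordinary zeroed memory, so an access that spills into it just reads zeros, whereas an engine would presumably make the slop inaccessible so that such an access faults.

```c
#include <stdint.h>
#include <string.h>

#define HEAP_LEN 64u                   /* L on the slide */
#define SLOP     16u                   /* S: assumed size of the slop region */
static uint8_t mem[HEAP_LEN + SLOP];   /* heap followed directly by the slop */

/* A single check on addr covers every 4-byte access at addr + k
 * as long as k + 4 <= SLOP. Returns 0 on success, -1 on a failed check. */
int load_pair(uint32_t addr, int32_t *x, int32_t *y) {
    if (addr > HEAP_LEN) return -1;            /* the only BC, on addr itself */
    memcpy(x, mem + addr + 4, sizeof *x);      /* constant offset 4 */
    memcpy(y, mem + addr + 8, sizeof *y);      /* bigger offset 8, no new BC */
    return 0;
}
```

Because both checks now test the same expression, the second one is an exact duplicate of the first and can be dropped like the same-address case.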

Bounds Checks + Loops*
    uint32_t addr = …
    for (; addr < end; addr++) {   // constant stride
      assert(0 <= addr <= L);
      *(base + addr) = 0;
    }
- Constant stride + slop allow BC hoisting
* ongoing work

Bounds Checks + Loops*
    uint32_t addr = …
    assert(0 <= addr <= L);        // hoisted out of the loop
    for (; addr < end; addr++)
      *(base + addr) = 0;
- Constant stride + slop allow BC hoisting
* ongoing work
[Diagram: HEAP spans base to base+L, followed by SLOP up to base+L+S; the access lands at base + addr]
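
A rough C sketch of hoisting a check out of a constant-stride loop. Note the swap: this is not the slop-based scheme from the slide but a simpler variant that checks the whole range `[addr, end)` once up front, since a faulting slop region cannot be shown portably. `HEAP_LEN`, `checks_run`, and both `zero_range_*` functions are hypothetical names for the illustration.

```c
#include <stdint.h>

#define HEAP_LEN 64u               /* L on the slide; the value is arbitrary */
static uint8_t heap[HEAP_LEN];
static unsigned checks_run;        /* counts executed bounds checks */

/* Unoptimized loop: one check per iteration. Returns -1 on a failed check. */
int zero_range_naive(uint32_t addr, uint32_t end) {
    for (; addr < end; addr++) {
        checks_run++;
        if (addr >= HEAP_LEN) return -1;
        heap[addr] = 0;
    }
    return 0;
}

/* Hoisted variant: one check covering the whole range before the loop. */
int zero_range_hoisted(uint32_t addr, uint32_t end) {
    checks_run++;
    if (addr < end && end > HEAP_LEN) return -1;
    for (; addr < end; addr++)
        heap[addr] = 0;
    return 0;
}
```

The two versions differ on a failing range: the naive loop zeroes the in-bounds prefix before failing, while the hoisted one fails before writing anything. That difference is the kind of precise-exception question raised under Future Work.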

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= L);
    grow_memory()
    assert(0 <= addr <= L);
[Diagram: HEAP starts at base, followed by SLOP up to base+L+S]

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= L);
    grow_memory()
    assert(0 <= addr <= L);
[Diagram: HEAP starts at base and now extends to base+L1; SLOP ends at base+L+S]

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= L1);
    grow_memory()
    assert(0 <= addr <= L1);
- Bounds checks need dynamic updates
[Diagram: HEAP starts at base and extends to base+L1; SLOP ends at base+L+S]

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= L1);
    grow_memory()
    assert(0 <= addr <= L1);
Option 1: Patching
- Difficult with threads
- Bad for code caching
[Diagram: HEAP starts at base and extends to base+L1; SLOP ends at base+L+S]

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= *LIMIT);
    grow_memory()   // LIMIT = L1
    assert(0 <= addr <= *LIMIT);
Option 2: Indirection
- Extra memory load
[Diagram: HEAP starts at base and extends to base+L1; SLOP ends at base+L+S]
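
Option 2 might look like this in C. It is a sketch only: `limit`, `in_bounds`, and the one-argument `grow_memory` signature are assumptions made for the example, with `LIMIT` playing the role of the pointer on the slide.

```c
#include <stdint.h>

static uint32_t limit = 64;                    /* current heap length (L) */
static const uint32_t *const LIMIT = &limit;   /* checks read through this */

/* Every bounds check pays one extra load of *LIMIT. */
int in_bounds(uint32_t addr) {
    return addr <= *LIMIT;
}

/* Growing only updates the stored limit; generated checks stay untouched. */
void grow_memory(uint32_t new_limit) {
    if (new_limit > limit) limit = new_limit;
}
```

After `grow_memory(128)`, an address such as 100 that previously failed `in_bounds` passes, with no code patching involved.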

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= MAX);
    grow_memory()
    assert(0 <= addr <= MAX);
Option 3: Pre-declared MAX
[Diagram: HEAP starts at base and extends to base+L1; SLOP ends at base+L+S]

Bounds Checks + Growing Memory
    ….
    assert(0 <= addr <= MAX);
    grow_memory()
    assert(0 <= addr <= MAX);
Option 3: Pre-declared MAX
- mmap up to MAX
- allocate pages on grow
[Diagram: HEAP starts at base and extends to base+L1; the reservation runs past base+L up to MAX, with SLOP behind it]
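
One way to picture Option 3 on a POSIX system is sketched below: reserve address space up to MAX with `mmap` and no access rights, then enable pages with `mprotect` on grow. `heap_init`, `heap_grow`, and the globals are hypothetical names, `PAGE` assumes the 64 KiB WebAssembly page size, and the real engine may well do this differently. Accesses between the current length and MAX would presumably fault on the still-protected pages, which is why the check constant never has to change.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE 65536u                /* assumed 64 KiB WebAssembly page */

static uint8_t *base;              /* start of the reservation; never moves */
static size_t committed;           /* bytes currently readable/writable */
static size_t max_bytes;           /* pre-declared MAX */

/* Make [committed, new_len) accessible. Returns 0 on success, -1 on error. */
int heap_grow(size_t new_len) {
    if (new_len > max_bytes || new_len < committed || new_len % PAGE != 0)
        return -1;
    if (new_len > committed &&
        mprotect(base + committed, new_len - committed,
                 PROT_READ | PROT_WRITE) != 0)
        return -1;
    committed = new_len;
    return 0;
}

/* Reserve max bytes of address space up front, then commit the initial part. */
int heap_init(size_t initial, size_t max) {
    void *p = mmap(NULL, max, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    base = p;
    max_bytes = max;
    committed = 0;
    return heap_grow(initial);
}
```

Since `base` stays fixed across `heap_grow`, neither patching nor an extra limit load is needed; the cost moves to reserving address space up front.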

Future Work
- Bounds check elimination with loops
- Preserving precise exception semantics with BC hoisting
- Eliminating BCs for function calls

Lesson
- Safe stacks provide opportunities for reducing the overhead of SFI

Lesson
- Safe stacks provide opportunities for reducing the overhead of safe stacks
[1] Control-Flow Bending: On the Effectiveness of Control-Flow Integrity, Carlini et al., USENIX Sec '15
[2] Losing Control: On the Effectiveness of Control-Flow Integrity under Stack Attacks, Conti et al., CCS '15
[3] The Performance Cost of Shadow Stacks and Stack Canaries, Dang et al., CCS '15

Thank you! Q&A

WASM Example
    (func $fib (param $0 i32) (result i32)
      (local $1 i32)
      (return
        (if i32 (i32.lt_s $0 3)
          1
          (block
            (set_local $1 (call $fib (i32.add $0 -1)))
            (i32.add $1 (call $fib (i32.add $0 -2)))))))

WebAssembly - In Stores Now!
- In Firefox Nightly!

WebAssembly - In Stores Now!
- In Firefox Nightly!
- Enable javascript.options.wasm in about:config

WebAssembly - In Stores Now!
- In Firefox Nightly!
- Enable javascript.options.wasm in about:config
- Test it out at webassembly.github.io/demo

Why another language?
- Portable (thanks to precise semantics)
- Fast & safe (close to actual hardware ISAs, lightweight sandbox)
- Small (carefully designed binary format)

WASM Tool emscripten

The Sandbox
- Memory accesses
- Jumps/function calls

The Sandbox - Memory Accesses
- Memory accesses require runtime checks
- AngryBots: ~60M accesses/sec

The Sandbox - Memory Accesses
- Memory accesses require runtime checks
- AngryBots: ~60M accesses/sec
- Do we really need all runtime checks?

Shoutouts!
- Dan Gohman, Luke Wagner, Michael Bebenita
- Wider JS/MozResearch Team (Benjamin, Alon, Jakob…)
- Interns (London Rocked!)
- Academic Recruiting team (You guys Rocked!)
- Mozillians Community

Summary
- Sandboxing efficiently across 3 architectures x 3 OSs is hard
  - Many corner cases
  - Difficult to test
  - Security critical
- BCE landing in next Aurora (after grow/resize memory 1287967)
- WebAssembly is coming! It will be AWESOME! Get involved!

Bounds Checks + Loops
    uint32_t addr = …
    for (; addr < end; addr++) {
      assert(0 <= addr <= L);
      *(base + addr) = 0;
    }

Bounds Checks + Loops*
    uint32_t addr = …
    for (; addr < end; addr++) {   // constant stride
      assert(0 <= addr <= L);
      *(base + addr) = 0;
    }
- Constant stride + slop allow BC hoisting
* ongoing work
[Diagram: HEAP starts at base, followed by SLOP up to base+L+S]