Looking Inside the Virtualization Layer for Performance, Security and Software Fault-Tolerance
Sorav Bansal, IIT Delhi

Virtualization Software
• VMware Workstation/ESX Server
• Citrix XenServer
• Microsoft Hyper-V
• Virtual Iron
• Parallels Desktop
• …

Classification of Virtual Machine Monitors
• Binary Translation
– VMware (1998)
• Hardware-Assisted Virtualization
– VMware, Hyper-V, XenServer, Virtual Iron, …
• Para-virtualization
– XenServer

Missing Features
• Optimize code
• Security
• Bug-tolerance

What are we doing?
• A virtualization layer for x86, built from the ground up
– Runs an unmodified OS
– Can dynamically optimize code (binary translation)
– Can specify security policies enforceable at instruction-level granularity
– Can record and replay an execution
– Can be installed on an existing OS
– Transparent to the user
– Simple

Traditional Picture
[Diagram: Application 1 and Application 2 run on the OS, which runs on the Hardware.]

Virtualized Picture
[Diagram: Application 1 and Application 2 run on the OS, which runs on the Optimizing VMM.]

Translation Blocks
• Divide code into “translation blocks”
– A translation block ends when either
• a control-flow instruction is reached, or
• MAX_INSNS instructions have been translated
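
The block-ending rule above can be sketched in a few lines. This is only an illustration, not the talk's implementation: the value of MAX_INSNS, the opcode names in CONTROL_FLOW_OPCODES, and the string representation of instructions are all assumptions, and a real translator would decode x86 starting from the current program counter rather than scan a linear list.

```python
MAX_INSNS = 4  # assumed limit; the slides do not give a value
CONTROL_FLOW_OPCODES = {"jmp", "jcc", "call", "ret"}  # hypothetical opcode names


def split_into_translation_blocks(insns, max_insns=MAX_INSNS):
    """Split a linear list of instruction strings into translation blocks."""
    blocks, current = [], []
    for insn in insns:
        current.append(insn)
        opcode = insn.split()[0]
        # A block ends at a control-flow instruction or at the size limit.
        if opcode in CONTROL_FLOW_OPCODES or len(current) == max_insns:
            blocks.append(current)
            current = []
    if current:  # trailing instructions that hit neither condition
        blocks.append(current)
    return blocks


code = ["ld M, r0", "add r0, r1", "jmp L1",
        "ld A, r0", "ld B, r1", "ld C, r2", "ld D, r3",
        "add r0, r1", "ret"]
for block in split_into_translation_blocks(code):
    print(block)
```

With the assumed limit of 4, the nine instructions fall into three blocks: one ended by jmp, one ended by the size limit, and one ended by ret.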

A Simple Scheme
[Diagram: x (original code fragment) → Binary Translator → tx (translated code fragment)]

Use a Cache
[Diagram: look up x (original code fragment) in the Translation Cache. If found, use the cached tx (translated code fragment). If not found, run the Binary Translator on x to produce tx and save it in the Translation Cache.]
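
A minimal sketch of the cached scheme, assuming the cache is a dictionary keyed by the original address x. The class name TranslationCache, the misses counter, and the stand-in translate function are hypothetical and exist only for illustration.

```python
class TranslationCache:
    """Maps an original code address x to its translated fragment tx."""

    def __init__(self, translate):
        self._translate = translate  # stand-in for the binary translator
        self._cache = {}
        self.misses = 0              # counts how often the translator ran

    def lookup(self, x):
        tx = self._cache.get(x)
        if tx is None:               # not found: translate and save
            self.misses += 1
            tx = self._translate(x)
            self._cache[x] = tx
        return tx                    # found (or just saved): reuse tx


cache = TranslationCache(lambda x: "T%x" % x)
print(cache.lookup(0x1000), cache.lookup(0x1000), cache.misses)
```

The second lookup of the same address returns the saved fragment without invoking the translator again.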
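
Direct Jump Chaining
[Diagram: original blocks a, b, c, d and their translations Ta, Tb, Tc, Td. The exits of the translated blocks are labeled lookup(b), lookup(c) and lookup(d) and lead to Tb, Tc and Td.]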
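
One way to picture chaining: the first time a translated block takes an exit, it goes through lookup(); the result is then recorded in the block so later executions of that exit skip the lookup. The sketch below models this with a per-block dictionary instead of patching jump instructions in place, and every name in it (TBlock, Translator, next_block) is an assumption for illustration.

```python
class TBlock:
    """A translated block with lazily chained exits."""

    def __init__(self, name):
        self.name = name
        self.chained = {}  # original target address -> TBlock


class Translator:
    def __init__(self):
        self.cache = {}
        self.lookups = 0   # counts trips through the slow lookup path

    def lookup(self, addr):
        self.lookups += 1
        if addr not in self.cache:
            self.cache[addr] = TBlock("T" + addr)
        return self.cache[addr]

    def next_block(self, block, target):
        nxt = block.chained.get(target)
        if nxt is None:                  # exit not chained yet
            nxt = self.lookup(target)
            block.chained[target] = nxt  # chain: later jumps go direct
        return nxt


t = Translator()
ta = t.lookup("a")
t.next_block(ta, "b")  # first jump a -> b goes through lookup(b)
t.next_block(ta, "b")  # second jump uses the chained link
print(t.lookups)
```

In a real translator the "chain" step would overwrite the jump at the end of Ta so that it targets Tb directly.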
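
Indirect Jumps
[Diagram: block a calls function f, which returns to b; the translations are Ta, Tf and Tb.]
• call is translated to:
push b
jmp Tf
• ret is translated to:
pop retaddr
tmp = JTABLE[retaddr & MASK]
if (tmp.src == retaddr) goto tmp.dst
lookup(retaddr)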
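
The JTABLE check on this slide behaves like a small direct-mapped cache in front of lookup(). The sketch below follows the slide's pseudocode; the table size (MASK = 0xF), the policy of filling an entry after a miss, and the slow_lookups counter are assumptions that the slide does not state.

```python
MASK = 0xF  # assumed: a 16-entry table


class JumpTable:
    """Direct-mapped table from original return addresses to translations."""

    def __init__(self, lookup):
        self.entries = [None] * (MASK + 1)  # each entry is (src, dst)
        self.lookup = lookup                # slow path, e.g. the translation cache
        self.slow_lookups = 0

    def resolve(self, retaddr):
        tmp = self.entries[retaddr & MASK]
        if tmp is not None and tmp[0] == retaddr:
            return tmp[1]                   # fast path: goto tmp.dst
        self.slow_lookups += 1
        dst = self.lookup(retaddr)          # slow path: lookup(retaddr)
        self.entries[retaddr & MASK] = (retaddr, dst)  # assumed fill policy
        return dst


jt = JumpTable(lambda addr: "T%x" % addr)
print(jt.resolve(0x1234), jt.resolve(0x1234), jt.slow_lookups)
```

Two return addresses that share the same low bits compete for one entry, so the src comparison is what keeps the fast path correct.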

[Bar chart: Overhead (Percentage of Native), lower is better, for the "default" and "no jumptable" configurations on bubsort, emptyloop, fibo_iter, hanoi1, hanoi2, hanoi3 and printf. Bar labels range from -9.0% to 14.5%.]

[Bar chart: Overhead (Percentage of Native), lower is better, for the "default" and "no jumptable" configurations on euclid, fibo_rec and erastothenes. Bar labels: 37.4%, 114.3%, 156.7%, 11x, 36x, 46x.]

Overheads
• fibo_rec (logarithmic scale): no chaining 710x, no jumptable 45x, default 1.1x
• printf: no chaining 58%, no jumptable -6%, default -9%

Effect of Maximum Size of Translation Block
[Plots: Overhead versus Max Size of Translation Block (1 to 24), one panel each for bubsort, emptyloop, euclid, fibo_iter, fibo_rec, hanoi1, hanoi2, hanoi3 and printf.]

Effect of Translation Cache Size
[Plots: Overhead versus Number of 4k pages in Translation Cache (16 to 128), with legend entries "clock" and "random", one panel each for bubsort, emptyloop, erastothenes, euclid, fibo_iter, hanoi1, hanoi2, hanoi3 and printf.]

Optimizations
• Peephole Optimizations
• Trace Optimizations
• Cross-layer optimizations

An Example
• Before optimization:
ld M, r1
ld M, r0
• After optimization:
ld M, r0
mov r0, r1
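
A peephole pass for this kind of pattern might look like the sketch below. It assumes instructions are (opcode, source, destination) tuples, that M is ordinary memory (not memory-mapped I/O), and that the address does not depend on the first destination register. Unlike the slide, it keeps the first load in place and rewrites the second, producing ld M, r1; mov r1, r0, which leaves both registers holding the value at M just as the slide's version does.

```python
def peephole(insns):
    """Replace a repeated load from the same location with a register move."""
    out = []
    for insn in insns:
        prev = out[-1] if out else None
        if (prev is not None
                and insn[0] == "ld" and prev[0] == "ld"
                and insn[1] == prev[1]      # same memory operand
                and insn[2] != prev[2]):    # different destination register
            # The value is already in prev's destination register.
            out.append(("mov", prev[2], insn[2]))
        else:
            out.append(insn)
    return out


print(peephole([("ld", "M", "r1"), ("ld", "M", "r0")]))
```

The rewrite trades a memory access for a register-to-register move.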

Interrupts
• Before optimization:
ld M, r1
ld M, r0
• After optimization:
ld M, r0
mov r0, r1
• Delay interrupt delivery until the end of the current translation
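
Deferred delivery can be modeled as a pending queue that is drained only at translation boundaries. The VCPU class, its log, and the irq_after parameter below are hypothetical names for a toy simulation, not the talk's mechanism.

```python
class VCPU:
    """Toy model: interrupts are queued and delivered between translations."""

    def __init__(self):
        self.pending = []
        self.log = []

    def raise_interrupt(self, irq):
        self.pending.append(irq)  # only mark as pending; do not deliver yet

    def run_translation(self, name, insns, irq_after=None, irq=0):
        for i, insn in enumerate(insns):
            self.log.append("%s: %s" % (name, insn))
            if irq_after == i:    # simulate an interrupt arriving mid-translation
                self.raise_interrupt(irq)
        while self.pending:       # end of translation: deliver what is pending
            self.log.append("deliver irq %d" % self.pending.pop(0))


cpu = VCPU()
cpu.run_translation("T1", ["ld M, r0", "mov r0, r1"], irq_after=0, irq=5)
print(cpu.log)
```

The interrupt that arrives after the first instruction is delivered only once both translated instructions have run, so the handler never observes a state between the two.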

Precise Exceptions
• ret is translated to:
ld (sp), t0
add $4, sp
…
jmp t0
• On a page fault, rollback code runs before the page fault handler:
sub $4, sp
restore t0

Security: A Simple Scheme to Prevent Stack-Overflows
• call is translated to:
…
push ra, shadow
…
• ret is translated to:
…
ra = pop
ra1 = pop shadow
if (ra != ra1) error
…
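
The shadow-stack check can be simulated with two lists. This is a sketch under simple assumptions: the Machine class, the StackSmashed exception, and the way the overflow is simulated (overwriting the top of the regular stack) are illustrative only.

```python
class StackSmashed(Exception):
    """Raised when the return address differs from its shadow copy."""


class Machine:
    def __init__(self):
        self.stack = []   # regular stack, writable by the guest program
        self.shadow = []  # shadow stack, maintained only by translated code

    def call(self, ra):
        self.stack.append(ra)
        self.shadow.append(ra)  # push ra, shadow

    def ret(self):
        ra = self.stack.pop()
        ra1 = self.shadow.pop()  # pop shadow
        if ra != ra1:
            raise StackSmashed("return address %#x != shadow %#x" % (ra, ra1))
        return ra


m = Machine()
m.call(0x400100)
m.stack[-1] = 0xDEADBEEF  # simulate an overflow overwriting the return address
try:
    m.ret()
except StackSmashed as err:
    print("detected:", err)
```

An untampered call/ret pair passes the check and returns the original address.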

Record-Replay
• Record
– Direct I/O (in instructions)
– Interrupts
– Memory-mapped I/O
• Can use this to tolerate certain classes of bugs
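
The idea can be illustrated for one of the listed inputs, port reads via in instructions: while recording, every value read from a device is logged; while replaying, the log supplies the values and the device is not touched. The Recorder and Replayer classes, the io_in method, and the toy guest function are hypothetical. Interrupts and memory-mapped I/O would also need their timing logged, which this sketch does not attempt.

```python
class Recorder:
    """Forwards port reads to the device and logs every value returned."""

    def __init__(self, device_read):
        self.device_read = device_read
        self.log = []

    def io_in(self, port):
        value = self.device_read(port)
        self.log.append((port, value))
        return value


class Replayer:
    """Serves port reads from a recorded log instead of the device."""

    def __init__(self, log):
        self._entries = iter(log)

    def io_in(self, port):
        logged_port, value = next(self._entries)
        if logged_port != port:
            raise RuntimeError("replay diverged from the recorded execution")
        return value


def guest(io):
    # Toy guest: its result depends only on the values it reads.
    return io.io_in(0x60) + io.io_in(0x60)


device = iter([3, 4])
rec = Recorder(lambda port: next(device))
recorded = guest(rec)
replayed = guest(Replayer(rec.log))
print(recorded, replayed)
```

Because the guest is deterministic apart from its inputs, replaying the log reproduces the recorded result.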

Slowdowns with Record/Replay
Program      Slowdown
bubsort      216x
emptyloop    507x
euclid       320x
fibo_iter    282x
fibo_rec     309x
hanoi1       236x
hanoi2       182x
hanoi3       233x
printf       7x

Conclusions
• The virtualization layer is a good place to do many interesting things
• Can we make the virtual machine appear _________ than the real machine?
– faster
– more secure
– more reliable