Virtualization Part 1 – Concepts & XEN

Virtualization Concepts - References and Sources

• James Smith, Ravi Nair, “The Architectures of Virtual Machines,” IEEE Computer, May 2005, pp. 32-38.
• Mendel Rosenblum, Tal Garfinkel, “Virtual Machine Monitors: Current Technology and Future Trends,” IEEE Computer, May 2005, pp. 39-47.
• L. H. Seawright, R. A. MacKinnon, “VM/370 – a study of multiplicity and usefulness,” IBM Systems Journal, vol. 18, no. 1, 1979, pp. 4-17.
• S. T. King, G. W. Dunlap, P. M. Chen, “Operating System Support for Virtual Machines,” Proceedings of the 2003 USENIX Technical Conference, June 9-14, 2003, San Antonio TX, pp. 71-84.
• A. Whitaker, R. S. Cox, M. Shaw, S. D. Gribble, “Rethinking the Design of Virtual Machine Monitors,” IEEE Computer, May 2005, pp. 57-62.
• G. J. Popek and R. P. Goldberg, “Formal requirements for virtualizable third generation architectures,” CACM, vol. 17, no. 7, 1974, pp. 412-421.

CS 5204 – Fall, 2008

Definitions

• Virtualization
  - A layer mapping its visible interface and resources onto the interface and resources of the underlying layer or system on which it is implemented
  - Purposes
    - Abstraction – to simplify the use of the underlying resource (e.g., by removing details of the resource’s structure)
    - Replication – to create multiple instances of the resource (e.g., to simplify management or allocation)
    - Isolation – to separate the uses which clients make of the underlying resources (e.g., to improve security)
• Virtual Machine Monitor (VMM)
  - A virtualization system that partitions a single physical “machine” into multiple virtual machines
  - Terminology
    - Host – the machine and/or software on which the VMM is implemented
    - Guest – the OS which executes under the control of the VMM

Origins - Principles

“an efficient, isolated duplicate of the real machine”

• Efficiency
  - Innocuous instructions should execute directly on the hardware
• Resource control
  - Executed programs may not affect the system resources
• Equivalence
  - The behavior of a program executing under the VMM should be the same as if the program were executed directly on the hardware (except possibly for timing and resource availability)

Source: Communications of the ACM, vol. 17, no. 7, 1974, pp. 412-421

Origins - Principles

Instruction types
• Privileged – an instruction that traps in unprivileged (user) mode but not in privileged (supervisor) mode
• Sensitive
  - Control sensitive – attempts to change the memory allocation or privilege mode
  - Behavior sensitive
    - Location sensitive – execution behavior depends on location in memory
    - Mode sensitive – execution behavior depends on the privilege mode
• Innocuous – an instruction that is not sensitive

Theorem: For any conventional third generation computer, a virtual machine monitor may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions.

Significance: The IA-32/x86 architecture is not virtualizable in this sense, because some of its sensitive instructions are not privileged.
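
The theorem's condition is a plain subset test, which can be sketched in a few lines of Python. The instruction sets below are a small illustrative subset chosen for this sketch, not an exhaustive classification of x86, and the function names are hypothetical.

```python
# Illustrative subset of x86 instructions. The grouping is an assumption made
# for this sketch; it is not a complete classification of the architecture.
PRIVILEGED = {"LGDT", "LIDT", "HLT", "MOV_TO_CR3"}
SENSITIVE = {"LGDT", "LIDT", "MOV_TO_CR3", "POPF", "SGDT", "SMSW"}


def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: every sensitive instruction is privileged."""
    return sensitive <= privileged


def offending(sensitive, privileged):
    """Sensitive instructions that would execute without trapping to the VMM."""
    return sorted(sensitive - privileged)


if __name__ == "__main__":
    print(is_virtualizable(SENSITIVE, PRIVILEGED))
    print(offending(SENSITIVE, PRIVILEGED))
```

Any instruction returned by `offending` is one a deprivileged guest could execute without the VMM noticing, which is exactly the case the theorem rules out.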

Origins - Technology

Source: IBM Systems Journal, vol. 18, no. 1, 1979, pp. 4-17.

• Concurrent execution of multiple production operating systems
• Testing and development of experimental systems
• Adoption of new systems with continued use of legacy systems
• Ability to accommodate applications requiring special-purpose OS
• Introduced notions of “handshake” and “virtual-equals-real mode” to allow sharing of resource control information with CP
• Leveraged ability to co-design hardware, VMM, and guest OS

VMMs Rediscovered

[Figure: layered stack of Application, Guest OS, Virtual Machine, VMM, Real Machine]

• Server/workload consolidation (reduces “server sprawl”)
• Compatible with evolving multi-core architectures
• Simplifies software distributions for complex environments
• “Whole system” (workload) migration
• Improved data-center management and efficiency
• Additional services (workload isolation) added “underneath” the OS
  - security (intrusion detection, sandboxing, …)
  - fault-tolerance (checkpointing, roll-back/recovery)

Architecture & Interfaces

• Architecture: formal specification of a system’s interface and the logical behavior of its visible resources.

[Figure: layers Applications, Libraries, Operating System, Hardware; interfaces labeled API, ABI, System Calls, ISA, User ISA]

• API – application programming interface
• ABI – application binary interface
• ISA – instruction set architecture

VMM Types

• System
  - Provides ABI interface
  - Efficient execution
  - Can add OS-independent services (e.g., migration, intrusion detection)
• Process
  - Provides API interface
  - Easier installation
  - Leverage OS services (e.g., device drivers)
  - Execution overhead (possibly mitigated by just-in-time compilation)

System-level Design Approaches

• Full virtualization (direct execution)
  - Exact hardware exposed to OS
  - Efficient execution
  - OS runs unchanged
  - Requires a “virtualizable” architecture
  - Example: VMware
• Paravirtualization
  - OS modified to execute under VMM
  - Requires porting OS code
  - Execution overhead
  - Necessary for some (popular) architectures (e.g., x86)
  - Examples: Xen, Denali

Design Space (level vs. ISA)

[Figure: virtualization design space by level and ISA, with regions labeled API interface and ABI interface]

• Variety of techniques and approaches available
• Critical technology space highlighted

System VMMs

[Figure: structure of Type 1 and Type 2 VMMs]

• Structure
  - Type 1: runs directly on host hardware
  - Type 2: runs on HostOS
• Primary goals
  - Type 1: High performance
  - Type 2: Ease of construction/installation/acceptability
• Examples
  - Type 1: VMware ESX Server, Xen, OS/370
  - Type 2: User-mode Linux

Hosted VMMs

• Structure
  - Hybrid between Type 1 and Type 2
  - Core VMM executes directly on hardware
  - I/O services provided by code running on HostOS
• Goals
  - Improve performance overall
  - Leverages I/O device support on the HostOS
• Disadvantages
  - Incurs overhead on I/O operations
  - Lacks performance isolation and performance guarantees
• Example: VMware (Workstation)

Whole-system VMMs

• Challenge: GuestOS ISA differs from HostOS ISA
• Requires full emulation of GuestOS and its applications
• Example: VirtualPC

Strategies

[Figure: a privileged instruction in the GuestOS traps to the VMM, which emulates the change to the resource]

• De-privileging
  - VMM emulates the effect on system/hardware resources of privileged instructions whose execution traps into the VMM
  - aka trap-and-emulate
  - Typically achieved by running GuestOS at a lower hardware priority level than the VMM
  - Problematic on some architectures where privileged instructions do not trap when executed at deprivileged priority
• Primary/shadow structures
  - VMM maintains “shadow” copies of critical structures whose “primary” versions are manipulated by the GuestOS
  - e.g., page tables
  - Primary copies needed to ensure correct environment visible to GuestOS
• Memory traces
  - Controlling access to memory so that the shadow and primary structures remain coherent
  - Common strategy: write-protect primary copies so that update operations cause page faults which can be caught, interpreted, and emulated.
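
A minimal sketch of de-privileging (trap-and-emulate), assuming a toy machine in which only `CLI` and `STI` are privileged. The classes and the `execute_deprivileged` function are hypothetical names invented for this illustration; a real VMM handles hardware traps, not Python exceptions.

```python
class Trap(Exception):
    """Models the hardware trap raised by a privileged instruction."""


class VirtualCPU:
    """The guest's view of a privileged resource, kept by the VMM."""

    def __init__(self):
        self.interrupts_enabled = True


def execute_deprivileged(instr):
    """Toy hardware model: privileged instructions trap, the rest run directly."""
    if instr in ("CLI", "STI"):
        raise Trap(instr)
    return "ran directly"


class VMM:
    def __init__(self):
        self.vcpu = VirtualCPU()
        self.traps = 0

    def run(self, instr):
        """Run one guest instruction, emulating it only if it traps."""
        try:
            return execute_deprivileged(instr)
        except Trap:
            self.traps += 1
            return self.emulate(instr)

    def emulate(self, instr):
        """Apply the instruction's effect to the virtual resource only."""
        if instr == "CLI":
            self.vcpu.interrupts_enabled = False
        elif instr == "STI":
            self.vcpu.interrupts_enabled = True
        return "emulated"


if __name__ == "__main__":
    vmm = VMM()
    for instr in ["ADD", "CLI", "MOV", "STI"]:
        print(instr, vmm.run(instr))
```

Innocuous instructions never enter the VMM, which is what keeps the approach efficient; the scheme breaks down as soon as some sensitive instruction fails to raise the trap.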

Virtualizing the IA-32 (x86) architecture

• Architecture has protection rings 0..3 with OS normally in ring 0 and applications in ring 3…
• …and VMM must run in ring 0 to maintain its integrity and control
• …but GuestOS not running in ring 0 is problematic:
  - Some privileged instructions execute only in ring 0 but do not fault when executed outside ring 0 (remember privileged vs. sensitive?)
  - Instructions for low latency system calls (SYSENTER/SYSEXIT) always transition to ring 0, forcing the VMM into unwanted emulation or overhead
  - For the Itanium architecture, interrupt registers are only accessible in ring 0; forcing the VMM to intercept each device driver access to these registers has severe performance consequences
  - Masking interrupts can only be done in ring 0
  - Ring compression: paging does not distinguish privilege levels 0-2, so GuestOS must run in ring 3 but is then not protected from its applications also running in ring 3
  - Cannot be used for 64-bit guests on IA-32
  - The fact that it is not running in ring 0 can be detected (is this important?)
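
The first problem in the list can be illustrated with `POPF`, a commonly cited example of an x86 instruction that is sensitive but does not fault outside ring 0. The sketch below is a deliberately simplified model (it tracks only the interrupt-enable flag and ignores IOPL updates); the function name and signature are assumptions made for illustration.

```python
IF_FLAG = 1 << 9  # interrupt-enable bit (bit 9) of EFLAGS


def popf(eflags, new_value, cpl, iopl=0):
    """Simplified POPF: the IF bit changes only when CPL <= IOPL.

    When the caller is less privileged, the requested change to IF is
    silently dropped and no fault is raised, so a VMM never sees it.
    """
    if cpl <= iopl:
        return new_value
    # Deprivileged: keep the old IF bit, accept the other bits.
    return (new_value & ~IF_FLAG) | (eflags & IF_FLAG)


if __name__ == "__main__":
    enabled = IF_FLAG
    print(popf(enabled, 0, cpl=0) & IF_FLAG)  # ring 0: IF is cleared
    print(popf(enabled, 0, cpl=3) & IF_FLAG)  # ring 3: IF is left set
```

A guest kernel that believes it has disabled interrupts, but has not, is an instance of the equivalence property failing without any trap for the VMM to act on.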

Memory Management

entity     address space
VMM        machine
OS         physical
process    virtual

[Figure: GuestOS page tables and VMM “shadow” page tables]

• Isolation/protection of Guest OS address spaces
• Efficient MM address translation
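
One way to picture the three address spaces is as two mappings that the VMM composes into a shadow table. The sketch below assumes page-granular dictionaries; `build_shadow`, `guest_pt`, and `p2m` are hypothetical names, not part of any real VMM.

```python
def build_shadow(guest_pt, p2m):
    """Compose guest virtual->physical with VMM physical->machine.

    guest_pt: page table maintained by the GuestOS (virtual -> physical)
    p2m:      mapping maintained by the VMM (physical -> machine)
    Returns the shadow table (virtual -> machine) the hardware MMU would use.
    Entries whose physical page is not backed by the VMM are left out, so an
    access to them would fault into the VMM.
    """
    shadow = {}
    for vpage, ppage in guest_pt.items():
        if ppage in p2m:
            shadow[vpage] = p2m[ppage]
    return shadow


if __name__ == "__main__":
    guest_pt = {0: 5, 1: 6, 2: 9}   # guest believes it owns physical 5, 6, 9
    p2m = {5: 105, 6: 230}          # VMM has backed only physical 5 and 6
    print(build_shadow(guest_pt, p2m))
```

Because the hardware walks only the shadow table, translation costs the same as on a native system, and the guest can never name a machine page the VMM has not assigned to it.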

XEN: paravirtualization - References and Sources

• Paul Barham, et al., “Xen and the Art of Virtualization,” Symposium on Operating Systems Principles 2003 (SOSP’03), October 19-22, 2003, Bolton Landing, New York.
• Presentation by Ian Pratt available at http://www.cl.cam.ac.uk/netos/papers/2005-xen-may.ppt

Xen - Structure

• Employs paravirtualization strategy
  - Deals with machine architectures that cannot be virtualized
  - Requires modifications to guest OS
  - Allows optimizations
• “Domain 0”
  - Has special access to control interface for platform management
  - Has back-end device drivers
• Xen VMM
  - Entirely event driven
  - No internal threads

[Figure: Xen 3.0 Architecture]

MMU Virtualization: Shadow-Mode

[Figure: the Guest OS reads and writes its virtual → physical page tables; the VMM propagates updates into the virtual → machine tables used by the hardware MMU, and accessed & dirty bits are passed back to the guest]

MMU Virtualization: Direct-Mode

[Figure: guest reads and guest writes go to virtual → machine page tables; layers shown are Guest OS, Xen VMM, and the hardware MMU]

Writeable Page Tables: 1 – write fault

[Figure: guest reads of the virtual → machine page table proceed normally; the first guest write causes a page fault into the Xen VMM]

Writeable Page Tables: 2 - Unhook

[Figure: the virtual → machine page table page is unhooked (marked X) from the hardware MMU; guest reads and guest writes to it now proceed]

Writeable Page Tables: 3 - First Use

[Figure: the first use of the unhooked (X) virtual → machine page table page causes a page fault into the Xen VMM]

Writeable Page Tables: 4 – Re-hook

[Figure: the Xen VMM validates the guest's writes and re-hooks the virtual → machine page table page into the hardware MMU]
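
The four steps above (write fault, unhook, first use, re-hook) can be read as a small state machine. The sketch below is an interpretation of the figures, not Xen's implementation: the class, its methods, and the validation rule (a guest may only map machine frames it owns) are assumptions made for illustration.

```python
class WritablePageTable:
    """Toy model of one page-table page shared by a guest and the VMM."""

    def __init__(self, owned_frames):
        self.entries = {}               # virtual page -> machine frame
        self.owned = set(owned_frames)  # frames the guest may map (assumed rule)
        self.hooked = True              # True: in use by the MMU, read-only
        self.log = []

    def guest_write(self, vpage, frame):
        if self.hooked:
            # Steps 1 and 2: the first write faults, and the VMM unhooks the
            # page so that later writes proceed without further traps.
            self.log.append("write fault")
            self.hooked = False
            self.log.append("unhook")
        self.entries[vpage] = frame

    def guest_access(self, vpage):
        if not self.hooked:
            # Steps 3 and 4: the first use through the unhooked page faults;
            # the VMM validates the batched writes and re-hooks the page.
            self.log.append("page fault")
            self._validate()
            self.hooked = True
            self.log.append("re-hook")
        return self.entries[vpage]

    def _validate(self):
        bad = {v: f for v, f in self.entries.items() if f not in self.owned}
        if bad:
            raise PermissionError(f"guest mapped frames it does not own: {bad}")
        self.log.append("validate")


if __name__ == "__main__":
    pt = WritablePageTable(owned_frames={10, 11})
    pt.guest_write(0, 10)
    pt.guest_write(1, 11)
    print(pt.guest_access(0))
    print(pt.log)
```

The point of the unhooked state is batching: many guest writes cost one fault and one validation pass instead of one trap per update.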

I/O

• Safe hardware interfaces
  - I/O Spaces
    - Restricts access to I/O registers
  - Isolated Device Driver
    - Driver isolated from VMM in its own “domain” (i.e., VM)
    - Communication between domains via device channels
• Unified interfaces
  - Common interface for group of similar devices
  - Exposes raw device interface (e.g., for specialized devices like sound/video)
• Separate request/response from event notification
• I/O descriptor rings
  - Used to communicate I/O requests and responses
  - For bulk data transfer devices (DMA, network), buffer space allocated out of band by GuestOS
  - Descriptor contains unique identifier to allow out of order processing
  - Multiple requests can be added before hypercall made to begin processing
  - Event notification can be masked by GuestOS for its convenience
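
A minimal sketch of an I/O descriptor ring, assuming a fixed-size circular buffer with a producer index advanced by the guest and a consumer index advanced by the back end. For brevity, responses are kept in a dictionary keyed by request identifier rather than in the ring itself; all names here are hypothetical and the layout is not Xen's actual ring format.

```python
class DescriptorRing:
    """Toy request ring shared by a guest (producer) and a back end (consumer)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0    # advanced by the guest
        self.req_cons = 0    # advanced by the back end
        self.responses = {}  # request id -> status (simplification)

    def put_request(self, req_id, buffer_ref):
        """Guest side: queue a descriptor; data lives in an out-of-band buffer."""
        if self.req_prod - self.req_cons == self.size:
            raise BufferError("ring full")
        self.slots[self.req_prod % self.size] = (req_id, buffer_ref)
        self.req_prod += 1

    def take_requests(self):
        """Back-end side: drain every request queued since the last call."""
        batch = []
        while self.req_cons < self.req_prod:
            batch.append(self.slots[self.req_cons % self.size])
            self.req_cons += 1
        return batch

    def put_response(self, req_id, status):
        """Back-end side: responses may complete in any order."""
        self.responses[req_id] = status


if __name__ == "__main__":
    ring = DescriptorRing(size=4)
    for req_id in (1, 2, 3):          # several requests, one notification
        ring.put_request(req_id, buffer_ref=f"page-{req_id}")
    batch = ring.take_requests()
    for req_id, _ in reversed(batch):  # complete out of order
        ring.put_response(req_id, "ok")
    print(ring.responses)
```

The unique identifier is what lets the guest match a response to its request when the back end reorders work, and queuing several descriptors before notifying amortizes the cost of the hypercall.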

Device Channels

• Connects “front end” device drivers in GuestOS with “native” device driver
• Is an I/O descriptor ring
• Buffer page(s) allocated by GuestOS and “granted” to Xen
• Buffer page(s) is/are pinned to prevent page-out during I/O operation
• Pinning allows zero-copy data transfer

System Performance

[Figure: bar chart, y-axis 0.0 to 1.1, with bars L, X, V, U for SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s), and SPEC WEB99 (score). Benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U)]

• Benchmark suites
  - SPEC INT2000: compute intensive workload
  - Linux build time: extensive file I/O, scheduling, memory management
  - OSDB-OLTP: transaction processing workload, extensive synchronous disk I/O
  - SPEC WEB99: web-like workload (file and network traffic)
• Fair comparison?

I/O Performance

• Systems
  - L: Linux
  - IO-S: Xen using IO-Space access
  - IDD: Xen using isolated device driver
• Benchmarks
  - Linux build time: file I/O, scheduling, memory management
  - PM: file system benchmark
  - OSDB-OLTP: transaction processing workload, extensive synchronous disk I/O
  - httperf: static document retrieval
  - SPEC WEB99: web-like workload (file and network traffic)