Virtualization and the Cloud
Chapter 7
Tanenbaum & Bos, Modern Operating Systems: 4th ed., (c) 2013 Prentice-Hall, Inc. All rights reserved.
Virtualization
• Virtualization: a general term for presenting an abstract view of system resources
• We will look at
  – Process virtualization
  – Virtualization inside the OS
  – Machine virtualization
April 27, 2015 © 2013-2015 Paul Krzyzanowski
Process Virtualization
Process Virtualization
• Java
  – Java virtual machine (JVM)
  – Java runtime environment (JRE)
• Android
  – Dalvik virtual machine
  – Android programs written in Java
  – Compiled to bytecode for JVM
  – Translated to Dalvik bytecode
  – Successor: Android Runtime (ART), which uses the same bytecode
Process Virtual Machines
• CPU interpreter running as a process
• Pseudo-machine with interpreted instructions
  – 1966: O-code for BCPL
  – 1973: P-code for Pascal
  – 1995: Java Virtual Machine (JIT compilation added)
  – 2002: Microsoft .NET CLR (pre-compilation)
  – 2003: QEMU (dynamic binary translation)
  – 2008: Dalvik VM for Android
  – 2014: Android Runtime (ART) – ahead-of-time compilation
• Advantage: run anywhere, sandboxing capability
• No ability to even pretend to access the system hardware
  – Just function calls to access system functions
  – Or “generic” hardware
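To make the idea of a pseudo-machine with interpreted instructions concrete, here is a minimal sketch of a stack-based interpreter in Python. The instruction set (PUSH, ADD, MUL, PRINT) and the name run_bytecode are hypothetical and are not taken from any of the virtual machines listed above. Note that the interpreted program can only reach the outside world through a function the interpreter chooses to expose.

```python
def run_bytecode(program, emit=print):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine.

    The opcodes are made up for illustration. The guest program never
    touches hardware: its only outlet is the `emit` function that the
    interpreter supplies.
    """
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif opcode == "PRINT":
            emit(stack[-1])
        else:
            # Anything outside the instruction set is rejected (sandboxing).
            raise ValueError(f"illegal opcode: {opcode}")
    return stack


# Computes (2 + 3) * 4 and prints the top of the stack.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("PRINT", None)]
result = run_bytecode(program)
```

The same program list would run unchanged on any host that has the interpreter, which is the "run anywhere" advantage in miniature.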
Virtualization inside the OS
Virtualization inside the OS
• I/O / file virtualization
  – Abstract interface to physical connections
• Storage virtualization
  – Logical view of disks “connected” to a machine
  – External pool of storage
• CPU/machine virtualization
  – Each process feels like it has its own CPU
  – Created by OS preemption and scheduler
• Memory/CPU virtualization
  – Process feels like it has its own address space
  – Created by MMU, configured by OS
I/O Virtualizations Wikipedia
Logical Volume Management
• Physical disk
  – Divided into one or more Physical Volumes
• Logical partitions
  – Volume Groups
  – Created by combining Physical Volumes
    • May span multiple physical disks
  – Can be resized
  – Each can hold a file system
Mapping Logical to Physical Data
• Storage on physical volumes is divided into fixed-size chunks called extents (a misnomer: they are really clusters)
• A logical volume is defined and managed by a mapping of logical extents to physical extents
• The Logical Volume Manager (LVM) takes care of this mapping
LVM Mapping
Concatenate multiple physical disks to create a larger disk
[Figure: physical volumes PV 0, PV 1, and PV 2 concatenated into logical volume LV 0]
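The extent mapping behind a concatenated volume can be sketched in a few lines of Python. This is an illustration only, not how any real LVM stores its metadata: the 4 MiB extent size, the extent counts per physical volume, and the function names are all assumptions.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # assumed extent size (4 MiB)


def build_linear_map(pv_extent_counts):
    """Concatenate physical volumes.

    Returns a list in which entry i says where logical extent i lives:
    (physical volume index, physical extent number on that volume).
    """
    mapping = []
    for pv_index, count in enumerate(pv_extent_counts):
        for physical_extent in range(count):
            mapping.append((pv_index, physical_extent))
    return mapping


def translate(mapping, logical_offset):
    """Translate a byte offset in the logical volume to
    (physical volume index, byte offset on that physical volume)."""
    logical_extent, within = divmod(logical_offset, EXTENT_SIZE)
    pv_index, physical_extent = mapping[logical_extent]
    return pv_index, physical_extent * EXTENT_SIZE + within


# LV 0 built from PV 0, PV 1, and PV 2 holding 3, 2, and 4 extents.
lv0 = build_linear_map([3, 2, 4])
```

Because the file system only ever sees logical offsets, relocating data or growing the volume amounts to editing entries in the mapping.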
Advantages
• Logical disks can be resized while mounted
  – Some file systems (e.g., ext3 on Linux or NTFS) support dynamic resizing
• Data can be relocated from one disk to another
• Improved performance (through disk striping)
• Improved redundancy (disk mirroring)
• Snapshots
  – Save the state of the volume at some point in time
  – Allow backups to proceed while the file system is being modified
Storage Virtualization
• Dissociate knowledge of physical disks
  – The computer system does not manage physical disks
• Software between the computer and the disks manages the view of storage
• Virtualization software translates read-block / write-block requests for logical devices to read-block / write-block requests for physical devices
Storage Virtualization
• Logical view of disks “connected” to a machine
• Separate logical view from physical storage
• External pool of storage
[Figure: Host 1, Host 2, ..., Host n; Fibre Channel or iSCSI switch; virtualization appliance (replication, snapshots, pooling, partitioning)]
Virtual CPUs (sort of)
What time-sharing operating systems give us
• Each process feels like it has its own CPU & memory
  – But cannot execute privileged instructions (e.g., modify the MMU or the interval timer, halt the processor, access I/O)
• Illusion created by OS preemption, scheduler, and MMU
• User software has to “ask the OS” to do system-related functions
Machine Virtualization
Machine Virtualization
Normally all hardware and I/O are managed by one operating system
• Machine virtualization
  – Abstract (virtualize) control of hardware and I/O from the OS
  – Partition a physical computer to act like several real machines
    • Manipulate memory mappings
    • Set system timers
    • Access devices
  – Migrate an entire OS & its applications from one machine to another
• 1972: IBM System/370
Requirements for Virtualization
Hypervisors should score well in three dimensions:
1. Safety: the hypervisor should have full control of virtualized resources.
2. Fidelity: the behavior of a program on a virtual machine should be identical to that of the same program running on bare hardware.
3. Efficiency: much of the code in the virtual machine should run without intervention by the hypervisor.
A matter of privilege
An OS works in two modes
• Kernel mode, which can run privileged instructions
• User mode, which cannot
Now we want a guest OS to run in user mode.
Machine Virtualization
An OS is just a bunch of code!
• Privileged vs. unprivileged instructions
• Regular applications use unprivileged instructions
  – Easy to virtualize
• If regular applications execute privileged instructions, they trap
• VMM catches the trap and emulates the instruction
  – Trap & Emulate
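Trap & emulate can be sketched as an exception handler wrapped around guest execution. The Python below is a toy model under stated assumptions: the instruction names (ADD, SET_TIMER, HALT), the dictionary used as virtual CPU state, and all function names are hypothetical, and a Python exception stands in for the hardware trap.

```python
class PrivilegedInstructionTrap(Exception):
    """Raised when deprivileged code attempts a privileged instruction."""

    def __init__(self, instruction, operand):
        super().__init__(instruction)
        self.instruction = instruction
        self.operand = operand


PRIVILEGED = {"SET_TIMER", "HALT"}  # hypothetical privileged instructions


def cpu_execute(instruction, operand, vcpu):
    """Stand-in for hardware running guest code in user mode."""
    if instruction in PRIVILEGED:
        raise PrivilegedInstructionTrap(instruction, operand)
    if instruction == "ADD":
        vcpu["acc"] += operand  # unprivileged: runs directly


def vmm_emulate(trap, vcpu):
    """Apply the instruction's effect to the guest's virtual state only."""
    if trap.instruction == "SET_TIMER":
        vcpu["virtual_timer"] = trap.operand
    elif trap.instruction == "HALT":
        vcpu["halted"] = True  # halts this guest, not the real machine


def run_guest(code):
    vcpu = {"acc": 0, "virtual_timer": None, "halted": False}
    for instruction, operand in code:
        if vcpu["halted"]:
            break
        try:
            cpu_execute(instruction, operand, vcpu)
        except PrivilegedInstructionTrap as trap:
            vmm_emulate(trap, vcpu)  # trap & emulate
    return vcpu


state = run_guest([("ADD", 5), ("SET_TIMER", 100), ("ADD", 2),
                   ("HALT", None), ("ADD", 1)])
```

The guest's HALT stops only its own virtual CPU, which is why the final ADD never runs while the real machine keeps going.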
Hypervisor
• Hypervisor: program in charge of virtualization
  – Aka Virtual Machine Monitor
  – Provides the illusion that the OS has full access to the hardware
  – Arbitrates access to physical resources
  – Presents a set of virtual device interfaces to each guest
Hypervisor
Application or guest OS runs until:
  – Privileged instruction traps
  – System interrupts
  – Exceptions (page faults)
  – Explicit call: VMCALL (Intel) or VMMCALL (AMD)
[Figure: unprivileged operating system & applications above the privileged hypervisor (virtual machine monitor); labels: page fault, instruction fault, virtual IRQ, MMU emulation, CPU or device emulation, I/O emulation]
Hypervisor architecture
Three Approaches to Running VMs
• Native VM (hypervisor model)
• Hosted VM
• Paravirtualization
Native Virtual Machine
Native VM (or Type 1 or Bare Metal)
  – No primary OS
  – Hypervisor is in charge of access to the devices and scheduling
  – OS runs in “kernel mode” but does not run with full privileges
Example: VMware ESX
[Figure: applications on several guest OSes, above the virtual machine monitor (hypervisor) with its device drivers, above the physical machine]
Hosted Virtual Machine
Hosted VM
  – VMM runs without special privileges
  – Primary OS responsible for access to the raw machine
    • Lets you use all the drivers available for that primary OS
  – Guest operating systems run under a VMM
  – VMM invoked by host OS
    • Serves as a proxy to the host OS for access to devices
Example: VMware Workstation
[Figure: applications on a guest OS inside a VM, with the VMM and device emulation, next to applications on the host OS with its device drivers, above the physical machine]
Virtual Machines Rediscovered
Paravirtualization
Techniques for Efficient Virtualization
When the operating system in a virtual machine executes a kernel-only instruction, it traps to the hypervisor if virtualization technology is present.
Virtualizing the Unvirtualizable
The binary translator rewrites the guest operating system running in ring 1, while the hypervisor runs in ring 0.
Virtualizing the Unvirtualizable
Binary Translation
Virtualization Technology
2005 – Intel CPUs introduced Virtualization Technology (VT)
Containers are created in which virtual machines can be run.
I/O instructions, etc., trap to the hypervisor, which does the work on behalf of the virtual machine.
Architectural Support
• Intel Virtualization Technology
• AMD Opteron
• Guest mode execution: can run privileged instructions directly
  – E.g., a system call does not need to go to the VMM
  – Certain privileged instructions are intercepted as VM exits to the VMM
  – Exceptions, faults, and external interrupts are intercepted as VM exits
  – Virtualized exceptions/faults are injected as VM entries
Hardware support for virtualization
Root mode (Intel example)
  – Layer of execution more privileged than the kernel
[Figure: privilege rings 0 through 3. Without virtualization: apps in ring 3, OS in ring 0, above the hardware. With virtualization: apps (ring 3) and guest OS (ring 0) at non-root (guest mode) privilege levels, VMM at root mode privilege level; system calls go to the guest OS, and OS requests trap to the VMM]
Memory Virtualization
The hypervisor creates a shadow page table that maps the virtual pages used by the virtual machine to the actual pages the hypervisor gave it.
Virtualizing Memory
• Similar to OS-based virtual memory
  – An OS sees a contiguous address space
  – But it is not necessarily tied to physical memory
• Need to virtualize MMU
  – Two levels of translation: shadow page tables
    • Host allocates virtual memory for guest
      – Guest treats that as physical memory
    • Guest OS cannot access real page tables
      – Access attempts are trapped and emulated
    • VMM maps guest “physical memory” settings to actual memory
  – Second-level address translation (SLAT) = nested page tables
    • Hardware support in MMU – similar to multilevel page tables
    • Performance enhancement over shadow page tables
    • A guest’s physical address is treated as a virtual address
[Figure: guest page table, shadow page table, system memory]
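The two levels of translation can be illustrated with single-level page tables modeled as Python dictionaries. This is a deliberately simplified sketch: real page tables are multilevel structures walked by the MMU, and the table contents, the 4 KiB page size, and the function names here are assumptions chosen for the example.

```python
PAGE_SIZE = 4096  # assumed page size


def translate(table, address):
    """Single-level lookup: page number -> frame number, offset preserved.
    A missing entry raises KeyError, standing in for a page fault."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def build_shadow_table(guest_table, vmm_table):
    """Compose guest-virtual -> guest-physical with
    guest-physical -> host-physical into one table."""
    return {guest_virtual_page: vmm_table[guest_physical_page]
            for guest_virtual_page, guest_physical_page in guest_table.items()
            if guest_physical_page in vmm_table}


guest_table = {0: 7, 1: 3}   # maintained by the guest OS
vmm_table = {3: 42, 7: 19}   # maintained by the VMM
shadow = build_shadow_table(guest_table, vmm_table)

guest_virtual = 1 * PAGE_SIZE + 0x10

# Nested style: the guest's "physical" address is itself translated.
two_step = translate(vmm_table, translate(guest_table, guest_virtual))

# Shadow style: one lookup in the table the VMM precomputed.
one_step = translate(shadow, guest_virtual)
```

Both paths reach the same host address. The trade-off is where the work goes: the shadow table must be rebuilt whenever the guest edits its page table, which is why those edits have to be trapped.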
Hardware Support For Nested Page Tables
Extended/nested page tables are walked every time a guest physical address is accessed, including the accesses for each level of the guest’s page tables.
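A short calculation shows why this matters. The sketch below counts the memory references of one fully uncached translation, assuming no TLB or page-walk cache hits; the function name is hypothetical, and the four-level figures are an assumption corresponding to common x86-64 configurations rather than something stated in the slides.

```python
def nested_walk_references(guest_levels, nested_levels):
    """Memory references for one uncached guest-virtual translation."""
    # For each guest page-table level, the guest physical address of the
    # table entry must first be translated (nested_levels references),
    # and then the entry itself is read (1 reference).
    per_guest_level = nested_levels + 1
    # The final guest physical address of the data needs one more nested walk.
    return guest_levels * per_guest_level + nested_levels


refs_4x4 = nested_walk_references(4, 4)      # nested: 4 * 5 + 4
refs_native = 4                              # a plain 4-level walk, for comparison
```

With four levels on each side the walk costs 24 references instead of 4, so the benefit of nested paging over shadow page tables depends heavily on TLB hit rates.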
Memory Virtualization
Reclaiming memory
Hypervisor pretends that the total memory for all VMs combined is more than the actual memory.
• Deduplication: pages with the same content are shared.
• Ballooning: a small balloon module is loaded in the VM as a pseudo device driver that talks to the hypervisor.
  – Balloon inflates: memory scarcity on guest increases
  – Balloon deflates: more memory becomes available for guest
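Deduplication can be sketched as a table keyed by page contents. This is a minimal illustration, not a description of any particular hypervisor: the function name and the (VM name, page number) keys are hypothetical, and a Python dictionary keyed on the page bytes stands in for the hash-then-compare scan a real implementation would use.

```python
PAGE_SIZE = 4096  # assumed page size


def deduplicate(pages):
    """Assign machine frames so that identical pages share one frame.

    pages: dict mapping (vm_name, page_number) -> bytes contents.
    Returns (mapping from page key to frame id, number of frames used).
    """
    frame_of_content = {}
    mapping = {}
    for key, content in pages.items():
        # Reuse the frame already holding these contents, else allocate
        # the next free frame id.
        frame = frame_of_content.setdefault(content, len(frame_of_content))
        mapping[key] = frame
    return mapping, len(frame_of_content)


ZERO_PAGE = bytes(PAGE_SIZE)
pages = {
    ("vm1", 0): ZERO_PAGE,
    ("vm1", 1): b"A" * PAGE_SIZE,
    ("vm2", 0): ZERO_PAGE,          # same contents as vm1's page 0
    ("vm2", 1): b"B" * PAGE_SIZE,
}
mapping, frames_used = deduplicate(pages)
```

Four guest pages end up in three frames. A shared frame would have to be treated as copy-on-write, since a write by one VM must not become visible to the other.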
Scheduling VMs
• Each VM competes for a physical CPU
  – Typically # VMs > # CPUs
• VMs need to get scheduled
  – Each VM gets a time slice
    • Often round robin scheduler – or minor variations
  – Allocate CPU to a single-CPU VM
  – Allocate multiple CPUs to multi-CPU VMs: co-scheduling
    • Strict co-scheduler: VM with two virtual CPUs gets two real CPUs
    • Relaxed co-scheduler: if two CPUs are not available, use one
  – CPU affinity: try to run the VM on the same CPU
• VM scheduler controls the level of multiprogramming of VMs
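The difference between strict and relaxed co-scheduling shows up in a small round-robin simulation. The sketch below is one possible reading of the bullets above, not a description of a real hypervisor scheduler: the function name, the skip-ahead policy in strict mode, and the example VMs are assumptions.

```python
from collections import deque


def schedule(vms, num_cpus, slices, strict=True):
    """Round-robin VM scheduling with co-scheduling.

    vms: dict mapping VM name -> number of virtual CPUs.
    Returns one list per time slice of (VM name, physical CPUs granted).
    """
    queue = deque(vms)
    timeline = []
    for _ in range(slices):
        free = num_cpus
        ran = []
        for name in list(queue):        # iterate over a snapshot
            if free == 0:
                break
            want = vms[name]
            if want <= free:
                grant = want
            elif not strict:
                grant = free            # relaxed: run on fewer CPUs
            else:
                continue                # strict: wait for all CPUs at once
            free -= grant
            ran.append((name, grant))
            queue.remove(name)
            queue.append(name)          # rotate to the back of the queue
        timeline.append(ran)
    return timeline


vms = {"A": 2, "B": 1, "C": 2}
strict_timeline = schedule(vms, num_cpus=2, slices=3, strict=True)
relaxed_timeline = schedule(vms, num_cpus=2, slices=3, strict=False)
```

In the strict run one physical CPU sits idle whenever the single-CPU VM is scheduled, while the relaxed run fills it by giving a two-CPU VM just one. Note that under this strict policy a VM with more virtual CPUs than the machine has physical CPUs would never run.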
I/O Virtualization
Problem: each guest thinks it owns an entire disk partition.
• Hypervisor creates a file or region and gives it to the OS
Problem: the disk the guest OS is using is different from the real one.
• Hypervisor converts disk commands to drive the real disk
• Allows upgrades to hardware without changing software
Problem: networking link for each guest OS
• Each VM has its own MAC address
Virtualizing Drivers & Events
• Operating systems running as guests cannot interact directly with I/O devices
• Device drivers
  – VMM has to multiplex physical devices & create network bridges
  – Virtualize network interfaces (e.g., MAC addresses)
  – Guest OS gets device drivers that interface to an abstract device implementation provided by the VMM
• VMM gets all system interrupts and exceptions
  – Needs to figure out which OS gets a simulated interrupt
  – Simulate those events on the guest OS
[Figure: virtual LANs spanning Host A and Host B through bridges and a switch. VM addresses shown include 10.0.1.1/24, 10.0.1.3/24, 10.0.2.1/24, and 10.0.2.2/24, each with a MAC address beginning 02:01:0A:00; one VM (02:01:93:60:51:f1, 157.96.81.241/24) has public access to the Internet]
BSD Jails
  – Directory subtree: root of namespace. Process cannot escape from this subtree
  – Hostname that will be used within the jail
  – IP address used for a process within the jail
  – Command that will be run within the jail
Some Popular VM Platforms
• Native VMs
  – Microsoft Hyper-V
  – VMware ESX Server
  – IBM z/VM (mainframe)
  – XenServer
    • Runs under an OS and provides virtual containers for running other operating systems. Runs a subset of x86. Routes all hardware accesses to the host OS.
    • Non-modified OS support for processors that support x86 virtualization
  – Sun xVM Server
• Hosted VMs
  – VMware Workstation
  – VirtualBox
  – Parallels
Security Threats
• Hypervisor-based rootkits
• A system with no virtualization software installed but with hardware-assisted virtualization can have a hypervisor-based rootkit installed.
• Rootkit runs at a higher privilege level than the OS.
  – It’s possible to write it in a way that the kernel will have a limited ability to detect it.
OS-Level Virtualization
• Not full machine virtualization
• Multiple instances of the same operating system
  – Each has its own environment
    • Process list, mount table, file descriptors, virtual network interface
• Advantage: low overhead (no overhead to system calls)
• Examples:
  – Linux VServer, Solaris Containers, FreeBSD Jails
  – Symantec Software Virtualization Solution (originally Altiris Software Virtualization Services)
    • Windows registry & directory tweaking
    • Allows multiple instances of applications to be installed
Defining the Cloud Model for enabling the delivery of computing as a SERVICE.
Clouds
The National Institute of Standards and Technology (NIST) defines five characteristics of a “cloud”:
1. On-demand self-service
2. Broad network access
3. Resource pooling
4. Rapid elasticity
5. Measured service
Service Models (NIST Definition)
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
Cloud computing layers
http://en.wikipedia.org/wiki/Cloud_computing
Deployment Models (NIST Definition)
• Public Cloud
• Private Cloud
• Hybrid Cloud
• Community Cloud
IaaS
IaaS is the delivery of computer hardware as a service
• Servers
• Networks
• Storage
[Figure: Client, Cloud Interface, VM 1 through VM 7]
All of a machine’s key components (CPU, storage disks, networking, and memory) are completely virtualized. This makes it possible to capture the entire state of a virtual machine and move it.
Various techniques:
• Live (hot or real-time) migration: VM is powered on
• Regular (cold) migration: VM is powered off
• Live storage migration
Live Migration
• Select alternate host (B)
  – Mirror block devices (for file systems)
  – Initialize VM on B
• Initialize
  – Copy dirty pages to host B iteratively
• To migrate
  – Suspend VM on A
  – Send ARP message to redirect traffic to B
  – Synchronize remaining VM state to B
  – Release state on A
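The iterative copy of dirty pages can be sketched as a loop that keeps resending whatever the still-running VM modified during the previous round. This is a simplified simulation under stated assumptions: memory is a dictionary of page number to contents, the callback run_guest_during_copy is a hypothetical stand-in for the VM continuing to run on host A, and the round limit and threshold are arbitrary. The ARP redirect and the release of state on A are not modeled.

```python
def live_migrate(source, run_guest_during_copy, max_rounds=5, small_enough=2):
    """Pre-copy migration of `source` (page number -> contents).

    run_guest_during_copy(source) mutates `source` as the running guest
    would and returns the set of page numbers it dirtied.
    Returns (memory image on host B, number of pre-copy rounds).
    """
    target = {}
    to_send = set(source)                 # first round: every page
    rounds = 0
    while len(to_send) > small_enough and rounds < max_rounds:
        # Capture this round's pages, then let the guest keep running
        # while they are "in flight" to host B.
        in_flight = {page: source[page] for page in to_send}
        dirtied = run_guest_during_copy(source)
        target.update(in_flight)
        to_send = dirtied                 # resend what changed meanwhile
        rounds += 1
    # VM is suspended on A here, so nothing more can be dirtied.
    target.update({page: source[page] for page in to_send})
    return target, rounds


def make_guest(writes_per_round):
    """Hypothetical guest that writes a shrinking set of pages each round."""
    remaining = iter(writes_per_round)

    def run(memory):
        pages = next(remaining, [])
        for page in pages:
            memory[page] += 1
        return set(pages)

    return run


source = {page: 0 for page in range(8)}
target, rounds = live_migrate(source, make_guest([[0, 1, 2, 3], [1, 2, 3], [2]]))
```

The VM is paused only for the final, small synchronization step, which is what keeps downtime short. If the guest dirties pages faster than they can be copied, the round limit forces the suspend anyway.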
PaaS
There isn’t one approach to PaaS. The line between IaaS and PaaS is blurred.
Common PaaS characteristics:
• Offers development environment
  – Development lifecycle, language
  – Ability to develop, test and deploy applications
  – Customer uses this to add value
• Supports well-defined interfaces for:
  – Composite applications
  – Portals
  – Mashups (bring together 2 or more business apps)
• Based on multi-tenancy architecture
PaaS
Provides a specialized capability, such as a tool or tool set
Example: Amazon’s SimpleDB and Simple Queue Service
SaaS
30 years ago: time-sharing systems
SaaS model today motivated by
• Faster, ubiquitous networked communications
• Software costs and complexities
• IT costs
SaaS
Focus on a specific process, such as performance reviews or financial management
Moved to the cloud because customers are finding the platforms hard to manage
Characteristics
  – Designed with specific business processes built in
  – Modifiable by customers
Examples: Intuit, SAP, Oracle On Demand
IaaS
VIM: Virtualization Infrastructure Management
[Figure: a host whose VMs run services and applications (Web Server, DB, Email Server, Facebook app, DB, Java, App A, App B, App C) on Windows and Linux guest OSes, above the virtual machine monitor (hypervisor), above the hardware]