Xen 3.0 and the Art of Virtualization
- Slides: 50
Xen 3.0 and the Art of Virtualization
Ian Pratt, XenSource Inc. and University of Cambridge
Keir Fraser, Steve Hand, Christian Limpach and many others…
Outline
- Virtualization Overview
- Xen Architecture
- New Features in Xen 3.0
- VM Relocation
- Xen Roadmap
- Questions
Virtualization Overview
- Single OS image: OpenVZ, Vservers, Zones
  - Group user processes into resource containers
  - Hard to get strong isolation
- Full virtualization: VMware, VirtualPC, QEMU
  - Run multiple unmodified guest OSes
  - Hard to efficiently virtualize x86
- Para-virtualization: Xen
  - Run multiple guest OSes ported to special arch
  - Arch Xen/x86 is very close to normal x86
Virtualization in the Enterprise
- Consolidate under-utilized servers
- Avoid downtime with VM Relocation
- Dynamically re-balance workload to guarantee application SLAs
- Enforce security policy
Xen 2.0 (5 Nov 2005)
- Secure isolation between VMs
- Resource control and QoS
- Only guest kernel needs to be ported
  - User-level apps and libraries run unmodified
  - Linux 2.4/2.6, NetBSD, FreeBSD, Plan 9, Solaris
- Execution performance close to native
- Broad x86 hardware support
- Live Relocation of VMs between Xen nodes
Para-Virtualization in Xen
- Xen extensions to x86 arch
  - Like x86, but Xen invoked for privileged ops
  - Avoids binary rewriting
  - Minimize number of privilege transitions into Xen
  - Modifications relatively simple and self-contained
- Modify kernel to understand virtualised env.
  - Wall-clock time vs. virtual processor time
    - Desire both types of alarm timer
  - Expose real resource availability
    - Enables OS to optimise its own behaviour
Xen 3.0 Architecture
[Figure: architecture diagram. Labels recovered from the slide:]
- Hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE)
- Xen Virtual Machine Monitor: Control IF, Safe HW IF, Event Channel, Virtual CPU, Virtual MMU
- VM0: Device Manager & Control s/w; GuestOS (XenLinux); Back-End; Native Device Drivers
- VM1, VM2: Unmodified User Software; Front-End Device Drivers; SMP
- VM3: Unmodified User Software; Unmodified GuestOS (WinXP); Front-End Device Drivers
- Callouts: x86_32, x86_64, IA64; AGP, ACPI, PCI; VT-x
I/O Architecture
- Xen IO-Spaces delegate guest OSes protected access to specified h/w devices
  - Virtual PCI configuration space
  - Virtual interrupts
  - (Need IOMMU for full DMA protection)
- Devices are virtualised and exported to other VMs via Device Channels
  - Safe asynchronous shared memory transport
  - ‘Backend’ drivers export to ‘frontend’ drivers
  - Net: use normal bridging, routing, iptables
  - Block: export any blk dev, e.g. sda4, loop0, vg3
- (Infiniband / “Smart NICs” for direct guest IO)
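The "safe asynchronous shared memory transport" between a frontend and a backend driver can be pictured as a fixed-size ring with separate producer and consumer counters. The following is a toy model only, not Xen's actual ring layout; the class name `DeviceChannel` and its methods are hypothetical and introduced purely for illustration.

```python
class DeviceChannel:
    """Toy single-producer/single-consumer ring (hypothetical model, not Xen's ABI)."""

    def __init__(self, size=8):
        # A power-of-two size lets free-running counters be masked into slot indices.
        assert size > 0 and size & (size - 1) == 0
        self.size = size
        self.slots = [None] * size
        self.prod = 0  # advanced only by the frontend (producer)
        self.cons = 0  # advanced only by the backend (consumer)

    def put(self, request):
        """Frontend side: queue a request; return False if the ring is full."""
        if self.prod - self.cons == self.size:
            return False
        self.slots[self.prod & (self.size - 1)] = request
        self.prod += 1
        return True

    def get(self):
        """Backend side: take the oldest request, or None if the ring is empty."""
        if self.cons == self.prod:
            return None
        request = self.slots[self.cons & (self.size - 1)]
        self.cons += 1
        return request
```

Because each counter has exactly one writer, neither side needs to lock the other out; in a real system a notification (an event channel, in the slide's terms) would tell the peer that new work is waiting.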
System Performance
[Figure: bar chart, y-axis 0.0 to 1.1, with one group of bars each for SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s) and SPEC WEB99 (score).]
Benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
Scalability
[Figure: bar chart, y-axis 0 to 1000, with L and X bars at 2, 4, 8 and 16 instances.]
Simultaneous SPEC WEB99 Instances on Linux (L) and Xen (X)
x86_32
[Figure: address-space layout with marks at 0 GB, 3 GB and 4 GB. User (U, ring 3) at the bottom, Kernel (S, ring 1) above it, Xen (S, ring 0) at the top.]
- Xen reserves top of VA space
- Segmentation protects Xen from kernel
- System call speed unchanged
- Xen 3 now supports PAE for >4 GB mem
x86_64
[Figure: address-space layout. User (U) from 0 to 2^47; Reserved in the middle; Kernel (U) and Xen (S) from 2^64 - 2^47 upward.]
- Large VA space makes life a lot easier, but:
- No segment limit support
  → Need to use page-level protection to protect hypervisor
x86_64
[Figure: User (U) and Kernel (U) both in ring 3; Xen (S) in ring 0; syscall/sysret transitions between them.]
- Run user-space and kernel in ring 3 using different pagetables
  - Two PGDs (PML4s): one with user entries; one with user plus kernel entries
- System calls require an additional syscall/ret via Xen
- Per-CPU trampoline to avoid needing GS in Xen
Para-Virtualizing the MMU
- Guest OSes allocate and manage own PTs
  - Hypercall to change PT base
- Xen must validate PT updates before use
  - Allows incremental updates, avoids revalidation
- Validation rules applied to each PTE:
  1. Guest may only map pages it owns*
  2. Pagetable pages may only be mapped RO
- Xen traps PTE updates and emulates, or ‘unhooks’ PTE page for bulk updates
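The two validation rules can be sketched as a single predicate over a proposed entry. This is an illustrative model under stated assumptions, not Xen's code: the `PTE` type, the `validate_pte` function and the two frame sets are hypothetical names, and the sketch ignores whatever exception the asterisk on rule 1 refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PTE:
    frame: int      # machine frame the entry would point at
    writable: bool  # True if the mapping would be read-write


def validate_pte(pte, owned_frames, pagetable_frames):
    """Return True if the proposed entry satisfies both slide rules."""
    # Rule 1: a guest may only map pages it owns.
    if pte.frame not in owned_frames:
        return False
    # Rule 2: pages currently used as pagetables may only be mapped read-only.
    if pte.frame in pagetable_frames and pte.writable:
        return False
    return True
```

For example, with `owned_frames = {10, 11, 12}` and `pagetable_frames = {12}`, a writable mapping of frame 10 passes, while a writable mapping of frame 12 or any mapping of frame 99 is rejected.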
Writeable Page Tables: 1 - Write fault
[Figure: layers Guest OS, Xen VMM, MMU, Hardware. Guest reads go to the Virtual → Machine page table; the first guest write raises a page fault into the Xen VMM.]
Writeable Page Tables: 2 - Emulate?
[Figure: on the first guest write, the Xen VMM decides whether to emulate the write; the "yes" branch is shown.]
Writeable Page Tables: 3 - Unhook
[Figure: the page-table page is unhooked (marked X) from the Virtual → Machine table; guest reads and guest writes go to it directly.]
Writeable Page Tables: 4 - First Use
[Figure: with the page still unhooked (X), the first use of it raises a page fault into the Xen VMM.]
Writeable Page Tables: 5 - Re-hook
[Figure: the Xen VMM validates the page and re-hooks it into the Virtual → Machine table.]
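The five steps above can be read as a small state machine for one page-table page. The sketch below is a simplified model, assuming a single ownership check stands in for full validation; `PageTablePage` and its method names are hypothetical and not taken from Xen.

```python
class PageTablePage:
    """Toy model of one page-table page (hypothetical names throughout)."""

    def __init__(self, entries, owned_frames):
        self.entries = list(entries)  # frame numbers; None means not present
        self.owned = set(owned_frames)
        self.hooked = True            # linked into the live table, read-only to the guest

    def _check(self, frame):
        if frame is not None and frame not in self.owned:
            raise PermissionError(f"guest does not own frame {frame}")

    def guest_write(self, index, frame, emulate=False):
        if self.hooked:
            # Steps 1-2: the write faults into the VMM, which may emulate it.
            if emulate:
                self._check(frame)
                self.entries[index] = frame
                return "emulated"
            # Step 3: unhook the page so bulk updates can proceed unchecked.
            self.hooked = False
            self.entries[index] = frame
            return "unhooked"
        self.entries[index] = frame
        return "direct"

    def first_use(self):
        # Steps 4-5: first use faults; validate every entry, then re-hook.
        if self.hooked:
            return
        for frame in self.entries:
            self._check(frame)
        self.hooked = True
```

The design point is that validation cost is paid once per batch at re-hook time rather than once per write, which is why unhooking suits bulk updates and emulation suits isolated ones.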
MMU Micro-Benchmarks
[Figure: bar chart, y-axis 0.0 to 1.1, with one group of bars each for Page fault (µs) and Process fork (µs).]
lmbench results on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
SMP Guest Kernels
- Xen extended to support multiple VCPUs
  - Virtual IPIs sent via Xen event channels
  - Currently up to 32 VCPUs supported
- Simple hotplug/unplug of VCPUs
  - From within VM or via control tools
  - Optimize one active VCPU case by binary patching spinlocks
- NB: Many applications exhibit poor SMP scalability; often better off running multiple instances each in their own OS
SMP Guest Kernels
- Great care is needed to get good SMP performance while remaining secure
  - Requires extra TLB synchronization IPIs
- SMP scheduling is a tricky problem
  - Wish to run all VCPUs at the same time
  - But, strict gang scheduling is not work conserving
  - Opportunity for a hybrid approach
- Paravirtualized approach enables several important benefits
  - Avoids many virtual IPIs
  - Allows ‘bad preemption’ avoidance
  - Auto hot plug/unplug of CPUs
VT-x / Pacifica: hvm
- Enable Guest OSes to be run without modification
  - E.g. legacy Linux, Windows XP/2003
- CPU provides vmexits for certain privileged instructions
- Shadow page tables used to virtualize MMU
- Xen provides simple platform emulation
  - BIOS, APIC, IOAPIC, RTC, Net (pcnet32), IDE emulation
- Install paravirtualized drivers after booting for high-performance IO
- Possibility for CPU and memory paravirtualization
  - Non-invasive hypervisor hints from OS
[Figure: HVM architecture diagram. Labels recovered from the slide:]
- Domain 0: Linux xen64; Control Panel (xm/xend); Native Device Drivers; Backend Virtual driver
- Domain N: Linux xen64; Front end Virtual Drivers; Native Device Drivers
- Guest VM (VMX) (32-bit) and Guest VM (VMX) (64-bit): Unmodified OS; FE Virtual Drivers; Guest BIOS; Virtual Platform; VMExit
- Xen Hypervisor: Control Interface; Processor; Scheduler; Event Channel; Memory; Hypercalls; I/O: PIT, APIC, IOAPIC
- Connections: IO Emulation; Callback / Hypercall; Event channel
- Privilege labels: 0P, 1/3P, 3P, 0D, 3D
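MMU Virtualization: Shadow-Mode
[Figure: layers Guest OS, VMM, MMU, Hardware. Guest reads and guest writes go to the guest's Virtual → Pseudo-physical table; the VMM propagates updates into the Virtual → Machine table used by the hardware, and accessed & dirty bits flow back to the guest's table.]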
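Shadow mode amounts to composing two mappings: the guest's virtual → pseudo-physical table and the VMM's pseudo-physical → machine map. A minimal sketch follows, assuming both are plain dictionaries keyed by page number; the function name `build_shadow` and the parameter name `p2m` are illustrative assumptions.

```python
def build_shadow(guest_pt, p2m):
    """Compose virtual->pseudo-physical with pseudo-physical->machine.

    guest_pt: dict mapping virtual page -> pseudo-physical page (guest-maintained).
    p2m:      dict mapping pseudo-physical page -> machine frame (VMM-maintained).
    Returns the virtual -> machine table the hardware MMU would walk.
    """
    shadow = {}
    for vpage, ppage in guest_pt.items():
        # Leave out guest mappings that have no backing machine frame.
        if ppage in p2m:
            shadow[vpage] = p2m[ppage]
    return shadow
```

For example, `build_shadow({0: 0, 1: 2, 2: 7}, {0: 100, 2: 205})` yields `{0: 100, 1: 205}`. A real VMM would update the shadow incrementally as the guest writes its table rather than rebuilding it wholesale.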
Xen Tools
[Figure: tool stack diagram. Labels recovered from the slide:]
- dom0: CIM, xm, Web svcs; xmlib; xenstore; builder control; save/restore control; libxc; Priv Cmd; xenbus Back
- dom1: xenbus Front
- Xen: dom0_op
VM Relocation: Motivation
- VM relocation enables:
  - High-availability
    - Machine maintenance
  - Load balancing
    - Statistical multiplexing gain
Assumptions
- Networked storage
  - NAS: NFS, CIFS
  - SAN: Fibre Channel
  - iSCSI, network block dev
  - DRBD network RAID
- Good connectivity
  - common L2 network
  - L3 re-routeing
Challenges
- VMs have lots of state in memory
- Some VMs have soft real-time requirements
  - E.g. web servers, databases, game servers
  - May be members of a cluster quorum
  → Minimize down-time
- Performing relocation requires resources
  → Bound and control resources used
Relocation Strategy
- Stage 0: pre-migration
  - VM active on host A
  - Destination host selected
  - (Block devices mirrored)
- Stage 1: reservation
  - Initialize container on target host
- Stage 2: iterative pre-copy
  - Copy dirty pages in successive rounds
- Stage 3: stop-and-copy
  - Suspend VM on host A
  - Redirect network traffic
  - Synch remaining state
- Stage 4: commitment
  - Activate on host B
  - VM state on host A released
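Stages 2 and 3 can be illustrated with a small simulation: every page is sent in the first round, each later round resends only what was dirtied during the previous one, and the loop stops once the remainder is small enough to send while the VM is suspended. This is a sketch under stated assumptions; the function name and the `max_rounds` and `stop_threshold` parameters are hypothetical and not taken from the slides.

```python
def precopy_migrate(all_pages, dirtied_per_round, max_rounds=5, stop_threshold=2):
    """Simulate iterative pre-copy followed by stop-and-copy.

    all_pages:         set of page numbers belonging to the VM.
    dirtied_per_round: list of sets; element i holds the pages the guest
                       dirties while round i is being transferred.
    Returns (pages sent in each pre-copy round, pages sent while suspended).
    """
    to_send = set(all_pages)
    sent_per_round = []
    for rnd in range(max_rounds):
        sent_per_round.append(len(to_send))
        # Pages dirtied during this round must be resent in the next one.
        dirtied = dirtied_per_round[rnd] if rnd < len(dirtied_per_round) else set()
        to_send = set(dirtied) & set(all_pages)
        if len(to_send) <= stop_threshold:
            break  # small enough: suspend the VM and copy the rest
    return sent_per_round, len(to_send)
```

With 100 pages and dirty sets of 20, 5 and 1 pages in successive rounds, the call returns `([100, 20, 5], 1)`: three live rounds, then a single page copied during the brief suspension. The round cap bounds the resources spent on a VM whose dirty set never shrinks.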
Pre-Copy Migration: Round 1
Pre-Copy Migration: Round 2
Pre-Copy Migration: Final
Web Server Relocation
Iterative Progress: SPECWeb 52 s
Quake 3 Server relocation
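Current Status
[Table: feature support by architecture; the cell values did not survive extraction.]
- Columns: x86_32, x86_32p, x86_64, IA64, Power
- Rows: Privileged Domains, Guest Domains, SMP Guests, Save/Restore/Migrate, >4 GB memory, VT, Driver Domains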
3.1 Roadmap
- Improved full-virtualization support
  - Pacifica / VT-x abstraction
  - Enhanced IO emulation
- Enhanced control tools
- Performance tuning and optimization
  - Less reliance on manual configuration
- NUMA optimizations
- Virtual bitmap framebuffer and OpenGL
- Infiniband / “Smart NIC” support
IO Virtualization
- IO virtualization in s/w incurs overhead
  - Latency vs. overhead tradeoff
    - More of an issue for network than storage
  - Can burn 10-30% more CPU
- Solution is well understood
  - Direct h/w access from VMs
    - Multiplexing and protection implemented in h/w
  - Smart NICs / HCAs
    - Infiniband, Level-5, Aarohi etc.
    - Will become commodity before too long
Research Roadmap
- Whole-system debugging
  - Lightweight checkpointing and replay
  - Cluster/distributed system debugging
- Software implemented h/w fault tolerance
  - Exploit deterministic replay
- Multi-level secure systems with Xen
- VM forking
  - Lightweight service replication, isolation
Conclusions
- Xen is a complete and robust hypervisor
- Outstanding performance and scalability
- Excellent resource control and protection
- Vibrant development community
- Strong vendor support
- Try the demo CD to find out more! (or Fedora 4/5, Suse 10.x)
- http://xensource.com/community
Thanks!
- If you’re interested in working full-time on Xen, XenSource is looking for great hackers to work in the Cambridge UK office. If you’re interested, please send me email!
- ian@xensource.com