Virtual Machine Disk Images Introspection and a bit more...
Vasily Tarasov (SBU), Dean Hildebrand (IBM Almaden), Renu Tewari (IBM Almaden), Erez Zadok (SBU)
File system and Storage Lab (FSL)
Outline
• How all that started
• The idea of introspection
• A couple of results from a 1st prototype
• Future work
• Benchmarking, Filebench
Two important technologies

Virtual Machines (VMs)
- Computational resources consolidation
- Flexible, efficient and scalable
- Hardware support
- Multiple solutions: VMware, KVM, Xen, ...
- Cloud way of delivering services

Network Attached Storage (NAS)
- Storage consolidation
- Scalable, manageable and efficient
- NFS/CIFS available on the majority of operating systems
- NAS sales jumped from $540 M in 1998 to $5.1 B in 2003
- IBM SONAS
Two technologies…
[Figure labels: Dean, VM, NAS]
…and they grow
[Figure labels: Dean, VM, NAS]
How do VM & NAS work together? Can we make them work better?
[Figure labels: VM, IBM SONAS]
Typical Setup
[Figure: three virtual machine hosts (VMware, KVM, Xen, ...), Host 1 to Host 3, each running two VMs (VM n-1, VM n-2) and an NFS client; an NFS server backed by GPFS Nodes 1 to 4, each with two storage units (Storage n-1, Storage n-2)]
Datapath Decomposed
Legend:
• CA: Caching
• RA: Read-Ahead
• RM: Request Mangling and Scheduling
VM Guest: Applications, Virtual File System, On-Disk File System, Block Layer, Controller Driver
Host: Controller Emulator, NFS Client
NETWORK
NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver
[Figure: the layers of the datapath are annotated with CA, RA, and RM markers]
Collecting traces: setup
User-space workloads in the VM guest:
• Rand/Seq Read
• Rand/Seq Write
• Various I/O sizes
• Multi-file workloads
• Multi-process workloads
• Meta-data intensive
Testbed: VMware ESX 4, NFS server, 1 Gbps network
Trace points:
• Within VM trace
• VSCSI layer trace
• Network trace
• Block layer trace
[Figure: the datapath (VM Guest: Applications, Virtual File System, On-Disk File System, Block Layer, Controller Driver; Host: Controller Emulator, NFS Client; NETWORK; NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver) with the trace points marked]
Some interesting results: I/O sizes change
VM Guest: Applications → 4 MB → Virtual File System, On-Disk File System → 4 KB → Block Layer → 1 MB → Controller Driver → 128 KB
Host: Controller Emulator, NFS Client → 32 KB → NETWORK
NAS: NFS Server, Virtual File System, On-Disk File System, Block Layer, Controller Driver → 256 KB
Reference: WIOV '11, Revisiting the Storage Stack in Virtualized NAS Environments
Meta-data Ops and Data Ops
Meta-data ops:
• Update attributes
• List directories
• Creation/deletion
• Lookup
• Access permissions
• Link/Symlink operations
Non-VM case:
    # stat /foo/bar
    sys_stat(/foo/bar)
    NFS_GETATTR(foobar_fh)
VM case:
    # stat /foo/bar
    sys_stat(/foo/bar)
    NFS_READ(dskimg_fh)
    NFS_WRITE(dskimg_fh)
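The loss of meta-data semantics in the VM case can be illustrated with a small sketch. It is only a toy model: the two traces are hypothetical, and `op_mix` is a helper name introduced here, not part of any tool from the talk. The operation names follow the NFSv3 procedure names.

```python
from collections import Counter

# Treat READ and WRITE as data operations; everything else
# (GETATTR, LOOKUP, ACCESS, ...) counts as meta-data.
DATA_OPS = {"READ", "WRITE"}

def op_mix(trace):
    """Return (data_ops, metadata_ops) for a list of NFS procedure names."""
    counts = Counter("data" if op in DATA_OPS else "meta" for op in trace)
    return counts["data"], counts["meta"]

# Hypothetical server-side traces for `stat /foo/bar`.
non_vm_trace = ["GETATTR"]        # the server sees a meta-data request
vm_trace = ["READ", "WRITE"]      # the server sees only disk-image I/O

print(op_mix(non_vm_trace))
print(op_mix(vm_trace))
```

In the VM case every request lands in the data bucket, which is why the NAS can no longer tell a `stat` from an ordinary file read.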
Come up with an idea
Disk image file (Ext, NTFS, UFS, ...) on the NFS server
READ(dskfh, offset, len): what is located in this region?
READ from:
• Inode
• Directory entry
• Data of specific file
• ...
Do intelligent things!
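A minimal sketch of the lookup behind this idea, assuming the server already holds a map from byte ranges of the disk image to guest file system objects. The `LAYOUT` table, the `Region` type, and the `classify` function are hypothetical names for illustration; a real map would have to be built by parsing the guest file system's on-disk structures, which this sketch does not attempt.

```python
from bisect import bisect_right
from typing import NamedTuple

class Region(NamedTuple):
    start: int   # first byte offset of the region inside the disk image
    end: int     # one past the last byte of the region
    kind: str    # "inode", "dirent", "data", ...
    owner: str   # guest path the region belongs to, "" if not applicable

# Hypothetical, hand-written layout map, sorted by start offset.
LAYOUT = [
    Region(0, 4096, "superblock", ""),
    Region(4096, 8192, "inode", ""),
    Region(8192, 12288, "dirent", "/foo"),
    Region(12288, 20480, "data", "/foo/bar"),
]
STARTS = [r.start for r in LAYOUT]

def classify(offset, length):
    """Return the regions touched by READ(dskfh, offset, length)."""
    end = offset + length
    # Start from the last region that begins at or before `offset`.
    i = max(bisect_right(STARTS, offset) - 1, 0)
    hits = []
    while i < len(LAYOUT) and LAYOUT[i].start < end:
        if LAYOUT[i].end > offset:
            hits.append(LAYOUT[i])
        i += 1
    return hits

for region in classify(8000, 8192):
    print(region.kind, region.owner)
```

Once a request is classified this way, the server could, for example, apply a different caching or read-ahead policy to inode reads than to file data reads.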
Prototype Results: Find
80% improvement (find)
Prototype Results: Startup
2.6 times faster: 130 sec vs. 50 sec
Future work
• Solid implementation
• More efficient cache policies
• Optimizations on the write path
• Analysis of more complex workloads
Virtual Machine Disk Images Introspection and a bit more...
A Recent Study Concluded that…
1. Much of what researchers conclude in their studies is misleading, exaggerated, or flat-out wrong
2. A new claim about a research finding is more likely to be false than true
3. Researchers tend to publish positive results more often than negative findings
4. The chances of being accepted to a conference are higher if the results are "more exciting"
2005-2008 study by J. Ioannidis
[Figure labels A to E: Medicine, Biology, Sociology, Computer Science, Physics]
HotOS '11: Benchmarking FS Benchmarking: It is Rocket Science
5/4/2011
Filebench
• Originally created by Sun Microsystems (RIP)
• Maintained by FSL
• Used in many papers
• Flexible: Workload Model Language (WML)
• Portable: Linux, FreeBSD, Solaris, MacOS, Windows *
Filebench WML
    define fileset name=myfileset, size=16kb, entries=1000
    define process name=reader, instances=1 {
        thread name=readerthread, memsize=10m, instances=10 {
            flowop read name=myread, filesetname=myfileset, iosize=2kb
        }
    }
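As a rough illustration of what this WML describes, the following Python sketch creates a set of files and starts several threads that each read a fixed number of bytes from randomly chosen files. It is not how Filebench is implemented, and its semantics are simplified: every file gets exactly the given size, each thread runs a fixed number of iterations, and the function names (`make_fileset`, `reader_thread`, `run`) are introduced here for illustration only.

```python
import os
import random
import tempfile
import threading

def make_fileset(root, entries, size):
    """Create `entries` files of `size` bytes each under `root`."""
    paths = []
    for i in range(entries):
        path = os.path.join(root, f"f{i:05d}")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths

def reader_thread(paths, iosize, iterations, totals, idx, seed):
    """Pick a random file and read `iosize` bytes from its start, repeatedly."""
    rng = random.Random(seed)
    nbytes = 0
    for _ in range(iterations):
        with open(rng.choice(paths), "rb") as f:
            nbytes += len(f.read(iosize))
    totals[idx] = nbytes  # each thread writes only its own slot

def run(entries=1000, size=16 * 1024, threads=10, iosize=2 * 1024, iterations=100):
    """Approximate the fileset and reader process above; return total bytes read."""
    with tempfile.TemporaryDirectory() as root:
        paths = make_fileset(root, entries, size)
        totals = [0] * threads
        workers = [
            threading.Thread(target=reader_thread,
                             args=(paths, iosize, iterations, totals, i, i))
            for i in range(threads)
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return sum(totals)

if __name__ == "__main__":
    print(run(entries=50, iterations=10), "bytes read")
```

The point of WML is that the six lines on the slide replace a hand-written program of this kind, so a workload can be changed by editing parameters rather than code.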
Filebench for Cloud Services
flowops:
• Reads
• Writes
• Creates
• Deletes
• +20 more sophisticated
[Figure labels: POSIX, NFS RPC, AFS RPC, Cloud]
Filebench for Virtualized Environments
    define hypervisor name=hpv, type=esx3.1, instances=1 {
        define vm name=hpv, type=windows, instances=5 {
            define process name=reader, instances=1 {
                thread name=readerthread, memsize=10m, instances=10 {
                    flowop read name=myread1, filesetname=myfileset, …
                }
            }
        }
    }
Virtual Machine Disk Images Introspection and a bit more...
Thank you!
Vasily Tarasov (SBU), Dean Hildebrand (IBM Almaden), Renu Tewari (IBM Almaden), Erez Zadok (SBU)
File system and Storage Lab (FSL)