vCAT: Dynamic Cache Management Using CAT Virtualization
- Slides: 37
vCAT: Dynamic Cache Management Using CAT Virtualization
Meng Xu, Linh Thi Xuan Phan, Hyon-Young Choi, Insup Lee
Department of Computer and Information Science, University of Pennsylvania
Trend: Multicore & Virtualization
• Cyber-physical systems are becoming increasingly complex
– Require high performance and strong isolation
• Virtualization on multicore helps handle such complexity
– Increases performance and reduces cost
• Challenge: Harder to achieve timing isolation
[Figure: VMs for collision avoidance, adaptive cruise control, pedestrian detection, and infotainment running on one hypervisor]
Problem: Shared-cache interference
• A task uses the cache to reduce its execution time
• Concurrent tasks may access the same cache area → extra cache misses → increased WCET
• Both intra-VM and inter-VM cache interference can occur
[Figure: tasks of VM 1 and VM 2 colliding in the shared cache beneath the hypervisor]
Existing approach: Static management
• Statically assign non-overlapping cache areas to tasks (VMs)
• Pros: Simple to implement
• Cons: Low cache resource utilization
– Unused cache area of one task (VM) cannot be reused by another
• Cons: Not always feasible
– e.g., when the whole task set does not fit into the cache
[Figure: statically partitioned cache divided among the tasks of VM 1 and VM 2]
Our approach: Dynamic management
• Dynamically assign disjoint cache areas to tasks (VMs)
• Pros: Enables cache reuse → better utilization of the cache
– Running tasks (VMs) can have larger cache areas, and thus smaller WCETs
[Figure: cache partitions reassigned among the tasks of VM 1 and VM 2 at run time]
Our approach: Dynamic management
• Challenge: How to achieve efficient dynamic cache management while guaranteeing isolation?
– Efficiency: the dynamic management should incur small overhead
• Solution: Hardware-based
– Increasingly many CPUs support cache partitioning
– Benefit: cache reconfiguration can be done very efficiently
• Example: Intel processors that support cache partitioning

  Processor family                  | Number of COTS processors
  Intel(R) Xeon(R) processor E5 v3  | 6 out of 48
  Intel(R) Xeon(R) processor D      | 15 out of 15
  Intel(R) Xeon(R) processor E3 v4  | 5 out of 5
  Intel(R) Xeon(R) processor E5 v4  | 117 out of 117

  Source: https://github.com/01org/intel-cmt-cat and http://www.intel.com/
Contribution: vCAT
• vCAT: Dynamic cache management by virtualizing CAT
– First work that achieves dynamic cache management for tasks in virtualization systems on commodity multicore hardware
• Achieves strong shared-cache isolation for tasks and VMs
• Supports dynamic cache management for tasks and VMs
– The OS in a VM can dynamically allocate cache partitions for its tasks
– The hypervisor can dynamically reconfigure cache partitions for VMs
• Supports cache sharing among best-effort VMs and tasks
Outline
• Introduction
• Background: Intel CAT
• Design & Implementation
• Evaluation
Intel Cache Allocation Technology (CAT)
• Divides the shared cache into α partitions (α = 20)
– Similar to way-based cache partitioning
• Provides two types of model-specific registers
– Each core has a PQR register
– K Class-of-Service (COS) registers shared by all cores (K = 4)
[Figure: a COS register holds a 20-bit cache bit mask (bits 19..0); a core's PQR register holds the ID of the COS it uses, selecting a region of the shared cache]
Intel Cache Allocation Technology (CAT)
• Divides the shared cache into α partitions (α = 20)
– Similar to way-based cache partitioning
• Provides two types of model-specific registers
– Each core has a PQR register
– K Class-of-Service (COS) registers shared by all cores (K = 4)
• Configure cache partitions for a core:
– Step 1: Set the cache bit mask of a COS register
– Step 2: Link the core with that COS by setting its PQR register
[Figure: COS register 1 is given bit mask 0x0000F; the core's PQR register points to COS 1, granting it the corresponding region of the shared cache]
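The two configuration steps above can be sketched as a small simulation. This models only the register semantics (COS bit masks plus a per-core PQR pointer); it is not real MSR access, and the names `make_mask`, `configure_core`, `cos_masks`, and `core_pqr` are illustrative. A real implementation writes the corresponding model-specific registers from ring 0.

```python
# Toy model of Intel CAT's register interface (simulation, not real MSR access).
NUM_PARTITIONS = 20   # alpha: number of cache partitions (ways) on the machine
NUM_COS = 4           # K: Class-of-Service registers shared by all cores

def make_mask(start, length):
    """Build a contiguous cache bit mask covering ways [start, start+length)."""
    assert length > 0 and start + length <= NUM_PARTITIONS
    return ((1 << length) - 1) << start

cos_masks = [0] * NUM_COS   # each COS register holds one cache bit mask
core_pqr = {}               # each core's PQR register names the COS it uses

def configure_core(core, cos_id, start, length):
    # Step 1: set the cache bit mask of the chosen COS register.
    cos_masks[cos_id] = make_mask(start, length)
    # Step 2: link the core to that COS via its PQR register.
    core_pqr[core] = cos_id

# Give core 0 the lowest four cache ways through COS 1 (mask 0x0000F).
configure_core(core=0, cos_id=1, start=0, length=4)
print(hex(cos_masks[core_pqr[0]]))  # -> 0xf
```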
Intel CAT: Software support
• The Xen hypervisor supports Intel CAT
– System operators can allocate cache partitions for VMs only
• Pros: Mitigates the interference among VMs
• Cons: Does not provide strong isolation among VMs
• Cons: Does not allow a VM to manage partitions for its tasks
– Tasks in the same VM can still interfere with each other
• Cons: Supports only a limited number of VMs with different cache-partition settings
– e.g., at most 4 on our machine (Intel Xeon 2618L v3 processor)
Outline
• Introduction
• Background: Intel CAT
• Design & Implementation
• Evaluation
Goals
• Dynamically control cache allocations for tasks and VMs
– Each VM should control the cache allocation for its own tasks
– The hypervisor should control the cache allocation for the VMs
• Preserve the virtualization abstraction layer
– Physical resources should not be exposed to VMs
• Guarantee cache isolation among tasks and VMs
– Tasks should not interfere with each other after reconfiguration
Dynamic cache allocation for tasks
• To modify the cache configuration of a task, the VM needs to modify the cache control registers
– BUT cache control registers are available only to the hypervisor
• One possible approach: Expose the registers to VMs
[Figure: a VM modifies a COS register (bit mask 0xF) directly, bypassing the hypervisor]
Dynamic cache allocation for tasks
• To modify the cache configuration of a task, the VM needs to modify the cache control registers
– BUT cache control registers are available only to the hypervisor
• One possible approach: Expose the registers to VMs
• Problem: Potential cache interference among VMs
– e.g., a VM may overwrite the hypervisor's allocation decision
[Figure: the VM overwrites the COS bit mask with 0xF00, claiming partitions assigned elsewhere]
Dynamic cache allocation for tasks
• To modify the cache configuration of a task, the VM needs to modify the cache control registers
– BUT cache control registers are available only to the hypervisor
• One possible approach: Expose the registers to VMs
• Problem: Potential cache interference among VMs
– e.g., a VM may overwrite the hypervisor's allocation decision
[Figure: the hypervisor validates the VM's register operation before it takes effect]
Dynamic cache allocation for tasks
• To modify the cache configuration of a task, the VM needs to modify the cache control registers
– BUT cache control registers are available only to the hypervisor
• One possible approach: Expose the registers to VMs
• Problem: Potential cache interference among VMs
– e.g., a VM may overwrite the hypervisor's allocation decision
• Problem: The hypervisor needs to notify VMs of any changes
[Figure: the hypervisor validates the VM's operation on the COS register]
vCAT: Key insight
• Virtualize cache partitions and expose virtual caches to VMs
– The hypervisor assigns virtual and physical cache partitions to VMs
– A VM controls the allocation of its assigned virtual partitions to its tasks
– The hypervisor translates the VM's operations on virtual partitions into operations on the physical partitions
[Figure: the VM operates on its virtual cache; the hypervisor translates the operation into a physical COS bit mask (0xF0)]
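The translation step can be illustrated with a minimal sketch. The mapping table `vp_to_pp` is hypothetical bookkeeping (vCAT's actual data structures are described in the paper); the idea is simply that the hypervisor remaps each bit of the VM's virtual-cache mask onto the physical partition backing it.

```python
def translate(virtual_mask, vp_to_pp):
    """Translate a VM's virtual-cache bit mask into a physical-cache bit mask.

    vp_to_pp[i] is the physical partition backing the VM's virtual partition i
    (hypothetical per-VM table kept by the hypervisor).
    """
    physical_mask = 0
    for vp, pp in enumerate(vp_to_pp):
        if virtual_mask & (1 << vp):
            physical_mask |= 1 << pp
    return physical_mask

# A VM that owns physical partitions 4..7, exposed as virtual partitions 0..3:
vp_to_pp = [4, 5, 6, 7]
# The guest OS requests its virtual partitions 0 and 1 (mask 0b0011)...
print(bin(translate(0b0011, vp_to_pp)))  # -> 0b110000 (physical partitions 4 and 5)
```

Because the guest only ever names virtual partitions, it cannot reach a physical partition the hypervisor did not assign to it, which is what preserves isolation.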
Challenge 1: No control over cache-hit requests
• A task's contents stay in the cache until they are evicted
• Problem: A task can access its content in its previous partitions via cache hits → interferes with another task
• Not explicitly documented in Intel's SDM
– We confirmed this limitation with experiments (available in the paper)
[Figure: a cache hit lets a task reach content left behind in partitions it no longer owns]
Solution: Cache flushing
• A task's content in its previous partitions is no longer valid
• Approach 1: Flush each memory address of the task
– Pros: Does not affect the other tasks' cache content
– Cons: Slow when a task's working-set size is large (> 8.46 MB)
• Approach 2: Flush the entire cache
– Pros: Efficient when a task's working-set size is large (> 8.46 MB)
– Cons: Flushes the other tasks' cache content as well
• vCAT provides both approaches to system operators
– Discussion of the tradeoffs and flushing heuristics is in the paper
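A minimal sketch of choosing between the two approaches, using the 8.46 MB crossover point stated on this slide. The function and the returned strategy names are placeholders; on x86 the two options correspond roughly to flushing the task's lines one address at a time versus a whole-cache flush.

```python
WSS_CROSSOVER_MB = 8.46  # from the slide: per-address flushing loses above this

def choose_flush(wss_mb):
    """Pick a flushing approach based on the task's working-set size (WSS)."""
    if wss_mb > WSS_CROSSOVER_MB:
        return "flush-entire-cache"   # Approach 2: fast for large WSS, but evicts everyone
    return "flush-per-address"        # Approach 1: leaves other tasks' content intact

print(choose_flush(2.0))    # -> flush-per-address
print(choose_flush(16.0))   # -> flush-entire-cache
```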
Challenge 2: Contiguous allocation constraint
• CAT requires a cache bit mask to cover contiguous partitions, but unallocated partitions may NOT be contiguous
→ Fragmentation of cache partitions in dynamic allocation
→ Low cache resource utilization
[Figure: VM 3's request is invalid because the free physical partitions left by VM 1 and VM 2 are not contiguous]
Solution: Partition defragmentation
• Rearrange the partitions to form contiguous regions
– The hypervisor rearranges physical cache partitions for VMs
– A VM rearranges virtual cache partitions for its tasks
[Figure: after defragmentation, each VM's partitions form a contiguous run in the physical cache]
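The defragmentation idea can be sketched as a simple compaction (illustrative only; the function name and data layout are assumptions, and a real rearrangement must also rewrite the COS masks and flush the contents of moved partitions):

```python
def defragment(alloc, total):
    """Compact allocations so each owner gets one contiguous run and all
    free partitions accumulate at the top of the cache.

    alloc: {owner: number_of_partitions}; returns {owner: (start, length)}.
    """
    layout, next_free = {}, 0
    for owner, length in alloc.items():
        layout[owner] = (next_free, length)
        next_free += length
    assert next_free <= total, "allocations exceed the cache"
    return layout

# VMs 1 and 2 own 4 partitions each out of 14; after compaction the
# remaining 6 partitions form one contiguous free region (8..13).
print(defragment({"VM1": 4, "VM2": 4}, total=14))
# -> {'VM1': (0, 4), 'VM2': (4, 4)}
```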
vCAT: Design summary
• Introduce virtual cache partitions
– Enables a VM to control the cache allocation for its tasks without breaking the virtualization abstraction
• Flush the cache when the cache partitions of tasks (VMs) change
– Guarantees cache isolation among tasks and VMs under dynamic cache management
• Defragment non-contiguous cache partitions
– Enables better cache utilization
• Refer to the paper for technical details and other design considerations
– e.g., how to allocate and de-allocate partitions for tasks and VMs
– e.g., how to support an arbitrary number of tasks and VMs with different cache-partition settings
Implementation
• Hardware: Intel Xeon 2618L v3 processor
– The design works for any processor that supports both virtualization and hardware-based cache partitioning
• Implementation based on Xen 4.8 and LITMUS^RT 2015.1
– LITMUS^RT: Linux Testbed for Multiprocessor Scheduling in Real-Time Systems
• About 5K lines of code (LoC) in total
– Hypervisor (Xen): 3264 LoC
– VM (LITMUS^RT): 2086 LoC
• Flexible to add new cache management policies
Outline
• Introduction
• Background: Intel CAT
• Design
• Evaluation
vCAT Evaluation: Goals
• How much overhead is introduced by vCAT?
• How much WCET reduction is achieved through cache isolation?
• How much real-time performance improvement does vCAT enable?
– Static management vs. no management
– Dynamic management vs. static management
vCAT Evaluation: Goals
• How much overhead is introduced by vCAT?
• How much WCET reduction is achieved through cache isolation?
• How much real-time performance improvement does vCAT enable?
– Static management vs. no management
– Dynamic management vs. static management
• The rest of the evaluation is available in the paper
vCAT Evaluation: Goals
• How much overhead is introduced by vCAT?
• How much WCET reduction is achieved through cache isolation?
• How much real-time performance improvement does vCAT enable?
– Static management vs. no management
– Dynamic management vs. static management
vCAT run-time overhead
• Static cache management
– Overhead occurs only when a task/VM is created
– Negligible overhead: ≤ 1.12 µs
• Dynamic cache management
– Overhead occurs whenever the partitions of a task/VM are changed
– Reasonably small overhead: ≤ 27.1 ms
– The value depends on the workload's working-set size (WSS):
  Overhead = min{3.23 ms/MB × WSS, 27.1 ms}
• More details can be found in the paper
– Computation of the overhead value based on the WSS
– Experiments showing the factors that contribute to the overhead
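The overhead formula on this slide can be evaluated directly; a small helper for plugging in a workload's WSS (the function name is illustrative, the constants are the slide's):

```python
def dynamic_overhead_ms(wss_mb):
    """Per-reconfiguration overhead of dynamic management, per the slide:
    min{3.23 ms/MB x WSS, 27.1 ms}. The 27.1 ms term is the whole-cache
    flush cost, which caps the per-address flushing cost for large WSS."""
    return min(3.23 * wss_mb, 27.1)

print(dynamic_overhead_ms(2))    # -> 6.46 (per-address flushing dominates)
print(dynamic_overhead_ms(16))   # -> 27.1 (whole-cache flush caps the cost)
```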
vCAT Evaluation: Goals
• How much overhead is introduced by vCAT?
• How much WCET reduction is achieved through cache isolation?
• How much real-time performance improvement does vCAT enable?
– Static management vs. no management
– Dynamic management vs. static management
Static management: Evaluation setup
• PARSEC benchmarks
– Converted into LITMUS^RT-compatible real-time tasks
• Randomly generate real-time parameters for the benchmarks to create real-time task sets
[Figure: benchmark VMs running PARSEC tasks and a pollute VM running a cache-intensive task; VCPUs pinned to cores under the hypervisor]
Static management vs. No management
• Static management improves system utilization significantly
– Improves schedulable utilization by 1.0 / 0.3 ≈ 3.3x
[Figure: fraction of schedulable task sets vs. VCPU utilization for the streamcluster benchmark, with and without static management]
Static management vs. No management
• The more cache-sensitive the workload, the greater the performance benefit
vCAT Evaluation: Goals
• How much overhead is introduced by vCAT?
• How much WCET reduction is achieved through cache isolation?
• How much real-time performance improvement does vCAT enable?
– Static management vs. no management
– Dynamic management vs. static management
Dynamic management: Evaluation setup
• Create workloads that have dynamic cache demand
• Dual-mode tasks: switch from mode 1 to mode 2 after 1 minute
– Type 1: the task increases its utilization by decreasing its period
– Type 2: the task decreases its utilization by increasing its period
[Figure: benchmark VMs with type 1 and type 2 dual-mode tasks, plus a pollute VM with a cache-intensive task; VCPUs pinned to cores]
Dynamic management vs. Static management
• Dynamic management significantly outperforms static management
– Improves schedulable utilization by 0.6 / 0.2 = 3x
[Figure: fraction of schedulable task sets vs. VCPU utilization under static and dynamic management]
Conclusion
• vCAT: A dynamic cache management framework for virtualization systems using CAT virtualization
– Provides strong isolation among tasks and VMs
– Supports both static and dynamic cache allocation for both real-time tasks and best-effort tasks
– Evaluation shows that dynamic management substantially improves schedulability compared to static management
• Future work
– Develop more sophisticated cache resource allocation policies for tasks and VMs in virtualization systems
– Apply vCAT to real systems, e.g., automotive systems and cloud computing