Dynamic Resource Allocation for Shared Data Centers Using Online Measurements § By Abhishek Chandra, Weibo Gong and Prashant Shenoy

Outline § Motivation § System Model § Dynamic Allocation Techniques § Experimental Results § Conclusions

Motivation § Data Centers § Server farms § Rent computing and storage resources to applications § Revenue for meeting QoS guarantees § Goals § Satisfy application QoS guarantees § Maximize resource utilization of platform § Robustness against “Slashdot” effects § Cluster of servers: dedicated or shared § Static allocation is problematic

Dynamic Resource Allocation § Periodically re-allocate resources among applications § Estimate resource requirements for near future § Challenges § Reallocation at short time-scales § No prior workload profiling/knowledge § Low overhead § Approach: Online Measurement-based Allocation

Research Contribution § Generalized processor sharing (GPS) § Time domain queuing model & Non-linear optimization technique § Prediction algorithm § Synthetic Workloads & Real Web Traces

Problem Formulation § Resource Model § Queues are assumed to be served in FIFO order, and the resource capacity C is shared among the queues using GPS § Each queue is assigned a weight § Each queue is allocated a resource share in proportion to its weight § GPS Scheduler
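The proportional-share rule in the resource model can be illustrated with a minimal sketch. The function name `gps_shares` and the example weights are hypothetical, and the sketch covers only the share computation, not a full GPS scheduler.

```python
def gps_shares(weights, capacity):
    """Split `capacity` among queues in proportion to their weights.

    Assumption: every queue is backlogged, so no share is redistributed.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [capacity * w / total for w in weights]


# Three queues; the second has twice the weight of the other two.
print(gps_shares([1.0, 2.0, 1.0], capacity=100.0))
```

Under real GPS, the share of an idle queue is redistributed among the backlogged queues, which this sketch does not model.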

§ Problem Definition § If d_i denotes the target response time of application i and T¯_i is its observed mean response time, then the application should be allocated a share such that T¯_i ≤ d_i. § The discontent of an application grows as its response time deviates from the target d_i; this is captured by a per-application discontent function. § The system goal is then to assign a share to each application such that the total system-wide discontent is minimized.
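The exact discontent function is not given in this text. As a minimal sketch, assume a ramp: discontent is zero while the mean response time meets the target and grows linearly with the overshoot. The ramp shape, the unweighted sum, and the names `discontent` and `total_discontent` are all assumptions made for illustration.

```python
def discontent(mean_response_time, target):
    """Assumed ramp: 0 while the target is met, then linear in the overshoot."""
    return max(0.0, mean_response_time - target)


def total_discontent(mean_response_times, targets):
    """System-wide discontent, taken here as an unweighted sum (assumption)."""
    return sum(discontent(t, d) for t, d in zip(mean_response_times, targets))


# Application 0 meets its 2 s target; application 1 misses it by 1 s.
print(total_discontent([1.0, 3.0], [2.0, 2.0]))
```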

Dynamic Resource Allocation

Monitoring § Measure system and application metrics § Queue lengths § Request response times § Monitoring windows (figure: measurement interval, history, and adaptation window along a time axis)

Allocating § Invoked periodically to dynamically partition the resource capacity among the various applications running on the shared server. § Resource Model Types § Time-domain Queuing Model § Online optimization-based Model

Time Domain Queuing Model § Models transient queuing behavior over the adaptation window § The request service rate depends on the application's share of the resource § Relates the mean response time T¯ expected in the near future to the application share § The relation is parameterized by the measured workload: arrival rate λ and mean service time s¯
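A transient model of this kind can be sketched as follows. The equations themselves are not given in this text, so every formula here is an assumption: a fluid queue q(t) = max(0, q0 + (λ − μ)t), a service rate μ = share × C / s¯, and a mean response time of (mean queue length + 1) / μ. The function names are hypothetical.

```python
def mean_queue_length(q0, arrival_rate, service_rate, window):
    """Time-average of q(t) = max(0, q0 + (arrival_rate - service_rate) * t)
    over the interval [0, window]."""
    drift = arrival_rate - service_rate
    if drift >= 0 or q0 + drift * window >= 0:
        # The queue never empties inside the window: average of a straight line.
        return q0 + drift * window / 2.0
    # The queue drains to zero at t0 < window and stays empty afterwards.
    t0 = q0 / -drift
    return (q0 * t0 / 2.0) / window


def mean_response_time(share, capacity, mean_service_time,
                       q0, arrival_rate, window):
    """Assumed relation: a request waits behind the mean queue, then is served."""
    service_rate = share * capacity / mean_service_time
    q_bar = mean_queue_length(q0, arrival_rate, service_rate, window)
    return (q_bar + 1.0) / service_rate


# Half of a unit-capacity server, 0.05 s per request, 5 requests queued,
# 8 req/s arriving, 10 s adaptation window.
print(mean_response_time(0.5, 1.0, 0.05, q0=5.0, arrival_rate=8.0, window=10.0))
```

Starting from the measured queue length q0 rather than a steady-state formula is what lets the estimate reflect a backlog left over from the previous window.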

Optimization-based Resource Allocation § Discontent function § Non-linear optimization problem § Solved using the Lagrange multiplier method
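To make the optimization concrete, here is a minimal sketch for two applications. It swaps in a plain grid search for the Lagrange multiplier method, and it assumes a ramp-shaped discontent and caller-supplied response-time functions. The name `best_split` and the example response-time models are hypothetical.

```python
def best_split(response_time_fns, targets, steps=100):
    """Grid-search the share of application 0; application 1 gets the rest.

    response_time_fns[i](share) must return the estimated mean response
    time of application i when it holds that fraction of the resource.
    Returns (total_discontent, (share_0, share_1)).
    """
    best = None
    for k in range(1, steps):
        phi0 = k / steps
        shares = (phi0, 1.0 - phi0)
        cost = sum(max(0.0, fn(phi) - d)
                   for fn, phi, d in zip(response_time_fns, shares, targets))
        if best is None or cost < best[0]:
            best = (cost, shares)
    return best


# Stand-in model: response time = 0.1 s of work divided by the share held.
fns = [lambda phi: 0.1 / phi, lambda phi: 0.1 / phi]
cost, shares = best_split(fns, targets=[0.5, 0.2])
print(cost, shares)
```

A grid search scales poorly beyond a few applications, which is one reason to prefer an analytical method such as Lagrange multipliers.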

Prediction § Short-term prediction of workload characteristics § Request arrival process § Service demand distribution § Use history of measured system metrics

Prediction Techniques § Estimating the Arrival Rate § An accurate estimate of the arrival rate allows the time domain queuing model to estimate the average queue length for the next adaptation window. § The arrivals A_i at any time are represented by the sequence of values from the measurement history. § To predict the next value, this sequence is modeled as an AR(1) process, from which a sample value of A_i is estimated. § Estimating the Service Demand § Computes the probability distribution of the per-request service demands § The mean of the distribution is used to represent the service demand of application requests § Measuring the Queue Length § The monitoring module records the number of outstanding requests at the beginning of each adaptation window.
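A one-step AR(1) forecast over the measurement history can be sketched as below. The prediction formula is not given in this text, so the sketch assumes the standard form: history mean plus the lag-1 autocorrelation times the last deviation from the mean, with the zero-mean noise term dropped. The name `ar1_predict` and the sample-autocorrelation estimator are assumptions.

```python
def ar1_predict(history):
    """One-step-ahead AR(1) forecast: mean + rho * (last - mean),
    where rho is the sample lag-1 autocorrelation of `history`."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(history) / n
    var = sum((x - mean) ** 2 for x in history)
    if var == 0:
        return mean  # constant history: predict the same value again
    cov = sum((history[j] - mean) * (history[j + 1] - mean)
              for j in range(n - 1))
    rho = cov / var
    return mean + rho * (history[-1] - mean)


# Arrivals counted in the last five measurement intervals (made-up values).
print(ar1_predict([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because the forecast pulls the last observation back toward the history mean, it lags a steadily rising workload. A short history window limits that lag.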

Experiments § Soccer World Cup ’98 Traces § Results based on a 24-hour portion of the trace § 755,000 requests § Mean request rate: 8.7 req/sec § Mean request size: 8.47 KB

Experimental Evaluation § Synthetic Web Workload (figure: comparison of static and dynamic resource allocations for a synthetic web workload)

§ Trace-driven Web Workloads (figure: comparison of static and dynamic resource allocations in the presence of heavy-tailed request sizes and varying arrival rates)

Adaptation to Transient Overloads (figure: the workload and the resulting allocations in the presence of varying arrival rates and varying request sizes)

Conclusions § Dynamic Resource Allocation needed for data centers § Measurement-based allocation: § Monitoring and Prediction gather online state § Use this state for application modeling and allocation § Results showed that these techniques can judiciously allocate system resources, especially under transient overload conditions

Thank You