Budget-based Control for Interactive Services with Partial Execution
Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety
Microsoft Research
Motivation
• Interactive services specify stringent SLAs on response time
• Long response times cause user dissatisfaction and revenue loss
• It is important to bound response time (e.g., mean, 95th percentile)
• Two challenges to address
  – Adapt to a dynamic, changing environment
  – Achieve high response quality
GOAL: Develop a self-managed scheduling system that meets the response time target while achieving high quality.
Existing Techniques (1)
• Static admission control approach
  – Define a fixed queue length limit; drop requests when the queue is full.
• Issues
  – Only works under a static system.
  – Determining an appropriate queue length for every setting and load is challenging.
    • Small queue length => underutilized resources
    • Large queue length => long response time
  – Cannot adapt to a dynamic, changing environment.
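The static approach described above can be sketched as a bounded FIFO queue. This is a minimal illustration only: the class name `StaticAdmissionQueue`, its methods, and the limit of 2 are hypothetical and not taken from the slides.

```python
from collections import deque


class StaticAdmissionQueue:
    """Bounded FIFO queue: admit while below a fixed limit, otherwise drop."""

    def __init__(self, max_len):
        self.max_len = max_len  # fixed limit chosen offline (assumption)
        self.queue = deque()
        self.dropped = 0

    def offer(self, request):
        """Return True if the request was admitted, False if it was dropped."""
        if len(self.queue) >= self.max_len:
            self.dropped += 1
            return False
        self.queue.append(request)
        return True

    def take(self):
        """Remove and return the oldest pending request, or None if empty."""
        return self.queue.popleft() if self.queue else None


q = StaticAdmissionQueue(max_len=2)
results = [q.offer(r) for r in ("a", "b", "c")]  # third request finds the queue full
```

Because `max_len` never changes, the same limit applies at every load level, which is the weakness the slide points out.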
Existing Techniques (2)
• Classic feedback control approach
  – Feedback control on queue length
    • Decrease queue length when response time is above target
  – Issues
    • Dropping requests results in degraded quality
    • Does not consider partial execution of requests
Partial Execution & Response Quality
• Incomplete execution of a request may still return meaningful partial results
• Many interactive services support partial execution
  – Web search, web server, video streaming, finance server
• Quality profile
  – A function that maps request execution time to response quality
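To make the idea of a quality profile concrete, here is a small sketch of one possible profile. The exponential shape, the function name `quality`, and the curvature parameter `c` are assumptions chosen for illustration; the slides only say that the profile maps execution time to response quality.

```python
import math


def quality(processing_time, full_demand, c=3.0):
    """Hypothetical concave quality profile, normalized to the range [0, 1].

    processing_time: time actually spent on the request
    full_demand:     time needed to run the request to completion
    c:               curvature (assumed); larger values front-load the gain
    """
    if full_demand <= 0:
        raise ValueError("full_demand must be positive")
    # Fraction of the request executed, clamped to [0, 1].
    x = min(max(processing_time / full_demand, 0.0), 1.0)
    return (1.0 - math.exp(-c * x)) / (1.0 - math.exp(-c))


half = quality(5.0, 10.0)   # quality after executing half of the request
full = quality(10.0, 10.0)  # quality after complete execution
```

With this shape, stopping a request halfway still returns most of the achievable quality, which is what makes partial execution attractive.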
Our Contributions
• Propose a budget-based control model for interactive services with partial execution
  – Use feedback control to meet the response time target
  – Apply an optimization procedure to improve response quality
    • Exploit partial execution and the request quality profile
• Evaluation
  – Implementation at Bing search server
  – Simulation on finance server
Budget-based Control Model
• Control variable
  – Budget: amount of computation time for all pending requests
• Control mechanism
  – Determine the budget based on response time feedback
  – Control the budget to meet the response time target
• Optimization procedure
  – Given a budget, assign processing time to requests
  – Exploit partial results of a query
  – Schedule requests to improve quality
Control Mechanism
• Basic idea
  – If response time is above target, decrease the budget
  – If response time is below target, increase the budget
• Criteria
  – Meet the response time target accurately and quickly
  – Incur little runtime overhead
Control Mechanism: Background
• Integral control
  – Adjust the budget based on the difference between the observed and target response time
  – Advantage: eliminates steady-state error
  – Limitation: response is slow (long settling time)
• Adaptive control
  – Model estimator + linear quadratic optimal controller
  – Advantage: quick adaptation, fast response
  – Limitation: computationally expensive, steady-state error
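The integral step can be sketched as a discrete update, budget ← budget + gain × (target − observed). The class name, the gain value, and the clamping bounds below are illustrative assumptions; the slides do not give the actual update rule or parameters.

```python
class IntegralBudgetController:
    """Minimal integral controller on the budget (a sketch, not the authors' code)."""

    def __init__(self, target_ms, gain, initial_budget_ms,
                 min_budget_ms=0.0, max_budget_ms=float("inf")):
        self.target_ms = target_ms
        self.gain = gain                    # integral gain (assumed value)
        self.budget_ms = initial_budget_ms
        self.min_budget_ms = min_budget_ms
        self.max_budget_ms = max_budget_ms

    def update(self, observed_ms):
        """Accumulate the error into the budget and return the new budget."""
        error = self.target_ms - observed_ms  # negative when the response is too slow
        self.budget_ms += self.gain * error
        # Keep the budget within its allowed range.
        self.budget_ms = min(max(self.budget_ms, self.min_budget_ms),
                             self.max_budget_ms)
        return self.budget_ms


ctrl = IntegralBudgetController(target_ms=35.0, gain=0.5, initial_budget_ms=100.0)
after_slow = ctrl.update(45.0)  # 10 ms above target: budget shrinks
after_fast = ctrl.update(25.0)  # 10 ms below target: budget grows back
```

A small gain keeps each update cheap and stable but takes many requests to converge, which matches the long settling time listed as the limitation.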
Control Mechanism: Hybrid Control
• Combine integral and adaptive control
• Run adaptive control periodically at a coarse-grained time interval
• Use integral control on the execution of each request for fine-grained adjustment
• Meets our goal
  – Quick and accurate adaptation
  – Little runtime overhead
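The two-timescale structure can be sketched as follows. Note the substitution: the coarse step here is a simple least-squares fit of response time ≈ slope × budget, used as a stand-in for the model estimator and linear quadratic controller named in the slides. The class name, the period, the gain, and the toy linear plant are all assumptions for illustration.

```python
class HybridBudgetController:
    """Two-timescale sketch: per-request integral step, periodic coarse reset."""

    def __init__(self, target_ms, gain, initial_budget_ms, period):
        self.target_ms = target_ms
        self.gain = gain
        self.budget_ms = initial_budget_ms
        self.period = period  # requests between coarse steps (assumed)
        self.window = []      # (budget, observed response time) pairs

    def _coarse_step(self):
        # Stand-in estimator: fit response_time ~= slope * budget through the origin.
        num = sum(b * rt for b, rt in self.window)
        den = sum(b * b for b, _ in self.window)
        if num > 0 and den > 0:
            slope = num / den
            # Jump to the budget the fitted model predicts will hit the target.
            self.budget_ms = self.target_ms / slope
        self.window.clear()

    def on_request_done(self, observed_ms):
        """Record one observation and return the updated budget."""
        self.window.append((self.budget_ms, observed_ms))
        if len(self.window) >= self.period:
            self._coarse_step()
        else:
            # Fine-grained integral adjustment between coarse steps.
            error = self.target_ms - observed_ms
            self.budget_ms = max(0.0, self.budget_ms + self.gain * error)
        return self.budget_ms


ctrl = HybridBudgetController(target_ms=35.0, gain=0.1,
                              initial_budget_ms=200.0, period=4)
for _ in range(4):
    observed = 0.5 * ctrl.budget_ms  # toy plant: response time is half the budget
    ctrl.on_request_done(observed)
```

In this toy run the integral steps move the budget only a few milliseconds per request, while the fourth observation triggers the coarse step and moves it directly to the value the fitted model predicts.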
Optimization Procedure
• Objective: maximize total response quality
• Input: budget, pending requests
• Output: processing time assigned to each request
• The optimization procedure depends on the application
Bing index server
• Core part of Bing search
  – For a user query, match and rank documents, return top results
• Concave quality profile
  – The first half of a request's execution yields a higher quality gain than the second half.
[Figure: quality (0 to 1) versus normalized processing time (0 to 1)]
Optimization Procedure for Index Server
• Run the higher-gain portion of requests
• Prevent long requests from starving short ones
• Combine two techniques
  – Reservation at light load
    • Reserve time for later requests in the queue based on mean service demand
  – Equal sharing at heavy load
    • Allocate resources equally among requests
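One way to read the combination of the two techniques is sketched below for the request at the head of the queue. The switching rule (light load when the budget covers the mean demand of every pending request) and the function name `head_allocation` are assumptions made for illustration; the slides do not state how the two modes are selected.

```python
def head_allocation(budget_ms, num_pending, mean_demand_ms):
    """Processing time granted to the request at the head of the queue.

    budget_ms:      total computation time available for all pending requests
    num_pending:    number of pending requests, including the head request
    mean_demand_ms: mean service demand per request
    """
    if num_pending <= 0:
        raise ValueError("need at least one pending request")
    if budget_ms >= num_pending * mean_demand_ms:
        # Light load (assumed test): reserve mean demand for each later request
        # and let the head request use whatever remains.
        return budget_ms - (num_pending - 1) * mean_demand_ms
    # Heavy load: split the budget equally among all pending requests.
    return budget_ms / num_pending


light = head_allocation(100.0, 4, 10.0)  # plenty of budget: reservation applies
heavy = head_allocation(20.0, 4, 10.0)   # scarce budget: equal sharing applies
```

Under heavy load every request is cut to the same short slice, so with a concave quality profile each still completes its most valuable portion and a long request cannot starve the short ones behind it.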
Evaluation
• Implemented and evaluated at Bing index server
  – Meets the response time target and achieves high quality
• Simulation study on finance server
  – Doubles system throughput at the desired quality
Bing Index Server
• Implementations
  – Budget.IS
    • Feedback control on budget
    • Hybrid control + optimization procedure
  – Queue.IS
    • Feedback control on queue length
• Evaluation
  – Production trace
Compare Queue vs. Budget Approach
[Figure: mean response time (ms) versus QPS (200 to 500) for Budget.IS and Queue.IS; marked level: mean response time = 35 ms]
[Figure: average quality versus QPS (200 to 500) for Budget.IS and Queue.IS]
Budget approach
• Meets the response time target accurately
• Achieves high quality
Conclusion
• Propose a budget-based control and optimization model for interactive services with partial execution
  – Hybrid control mechanism to meet the response time target
  – Optimization procedure to improve response quality
• Evaluation
  – Implemented and evaluated at Bing index server
    • Meets the response time target and achieves high quality
  – Simulation study on finance server
    • Doubles system throughput at the desired quality
Thank you & Questions?
- Slides: 18