Bulk Scheduling with DIANA Scheduler Ashiq Anjum 9/25/2020 Ashiq Anjum DIANA Scheduling CHEP 06 1
Outline
• What is DIANA (Data Intensive and Network Aware) Scheduling?
• DIANA Scheduling algorithm
• Bulk Scheduling Algorithm
• Data Location Service
• Network issues in scheduling data intensive jobs
• Priority driven multi-queue feedback based approach
• Results
• Conclusions
Scheduling Problem (1)
• Currently, Grid scheduling decisions are either data intensive or computation intensive
– In data intensive situations, jobs are pushed to the data
– In computation intensive situations, data (input as well as output) may be pulled to the jobs
• This kind of scheduling can lead to
– Performance degradation in a Grid environment
– Large processing queues
– Job execution delays due to site overloads
Scheduling Problem (2)
• Meta-scheduler and data mover can make conflicting decisions
– Scheduling technologies were developed for the LAN
– Local resource management systems have limitations over longer distances
– Applications are becoming network dependent due to demanding requirements
– Compute-only scheduling algorithms are inefficient
– Large queues and processing delays arise from job movement towards the data
Scheduling Problem (3)
• Network characteristics and scheduling decisions
– Need to use the network characteristics in aligning data and computations
– Need to optimize the task queues of the Meta-Scheduler by this correlation
• A better scheduling algorithm should consider
– The job execution, the data transfer and their relation with various network parameters on multiple sites
• To maximize the Grid throughput and for performance gains
– Align and co-schedule the computation and the data
– Send both the data and executables to a location depending on the computing, network and storage resources
Scheduling Problem (4)
• In scientific analysis environments such as High Energy Physics, hundreds of end-users may submit, individually or collectively, thousands of jobs accessing some subset of the physics data distributed over the world; this is known as Bulk Scheduling.
– The job cluster is submitted to the scheduler as a single entity
– Jobs compete for scarce resources, and the load is distributed disproportionately among the Grid nodes
– Previous approaches are based on ‘greedy’ algorithms in which a job is submitted to the best resource without assessing the global cost of this action
• This may lead to a skewed distribution of resources
• Large queues, and performance and throughput degradation for the remaining jobs
DIANA Scheduling?
• We present a DIANA scheduling system which
– Allocates the best available resources to a job
– Checks the global state of jobs and resources so that the strategic output of the Grid is maximized
– Ensures that no single job undergoes starvation
– Takes data and network aware scheduling decisions
– Supports Bulk Scheduling
Multiple sites Grid Meta-Scheduling
Scheduling Parameters
DIANA scheduling decisions take into account
• Bandwidths and latencies (RTT) of the network
• Packet losses and jitter
• Network anomalies
• Computing cycles available
• Site loads and respective job queues
• Size of the application executables
• Size of data files (input, output and executables)
Cost Estimators
• There are three major cost estimates which need to be calculated for the DIANA scheduling algorithm:
– Network Cost
– Computation Cost
– Data Transfer Cost
Network Cost
• TCP throughput is bounded by
$$\text{TCP Throughput} < \frac{MSS}{RTT \cdot \sqrt{loss}}$$
where
– MSS is the maximum segment size (fixed for each Internet path, typically 1460 bytes)
– RTT is the round trip time (as measured by TCP)
– loss is the packet loss rate
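The throughput bound on this slide can be explored numerically. The following is a minimal sketch, assuming the bound takes the commonly cited form MSS / (RTT · sqrt(loss)); the function name and the sample values are illustrative and do not come from the DIANA implementation.

```python
import math


def tcp_throughput_bound(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Upper bound on TCP throughput in bytes per second.

    Assumed form: MSS / (RTT * sqrt(loss)).
    """
    if rtt_s <= 0:
        raise ValueError("RTT must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss rate must be in (0, 1]")
    return mss_bytes / (rtt_s * math.sqrt(loss_rate))


# Hypothetical path: 1460-byte MSS, 100 ms RTT, 0.01% packet loss.
bound = tcp_throughput_bound(1460, 0.1, 0.0001)
print(f"throughput bound: {bound:.0f} bytes/s")
```

Under this form, halving the RTT doubles the bound, while the loss rate has to fall by a factor of four to achieve the same effect.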
Computation Cost
• Qi is the length of the waiting queue, Pi is the computing capability of site i, and SiteLoad is the current load on that site.
• W5, W6 and W7 are the weights which can be assigned depending upon the importance of the queue and the processing capability.
Data Transfer Cost
• Data Transfer Cost (DTC) = Input Data Transfer Cost + Output Data Transfer Cost + Executables Transfer Cost
Where:
• ID = Input Data
• NC = Network Cost
• AD = Application Data
• OD = Output Data
• i and j indicate a certain site
Total Cost
• The total scheduling cost of a site is:
Total Cost C = Network Cost + Computation Cost + Data Transfer Cost
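As a rough illustration of how the total cost could drive site selection, the sketch below sums the three cost components per site and picks the cheapest one. The `SiteCost` class, the `cheapest_site` helper, the site names and all numbers are hypothetical; the individual cost values are taken as given inputs rather than computed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SiteCost:
    """Cost components for one candidate site (arbitrary, comparable units)."""
    name: str
    network: float
    computation: float
    data_transfer: float

    @property
    def total(self) -> float:
        # Total Cost C = Network Cost + Computation Cost + Data Transfer Cost
        return self.network + self.computation + self.data_transfer


def cheapest_site(sites: Iterable[SiteCost]) -> SiteCost:
    """Return the site with the lowest total scheduling cost."""
    return min(sites, key=lambda site: site.total)


# Hypothetical sites and costs, for illustration only.
candidates = [
    SiteCost("site_a", network=2.0, computation=5.0, data_transfer=4.0),
    SiteCost("site_b", network=3.0, computation=1.0, data_transfer=2.5),
    SiteCost("site_c", network=1.0, computation=6.0, data_transfer=1.0),
]
print(cheapest_site(candidates).name)
```

Here the site holding the data (lowest data transfer cost) is not the one selected, because its computation cost dominates the sum.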
Elements of the Scheduling Optimization
We want to optimize the following elements within the DIANA scheduling process:
– Queue time and site load
– Processing time
– Data transfer time
– Executable transfer time
– Results transfer time
The total time to execute a job in a Grid environment will be the sum of all of these times.
Cost Matrix for the Meta-Scheduler
Example Matrix
DIANA Scheduler
Data Location Service
What is it supposed to do?
• Select the best replica, taking into account the location of the computing resources and the network and storage access latencies.
• Act as a light-weight web service that gathers information from the network monitoring service and performs access optimization calculations based on this information.
• Provide optimal replica information on the basis of both faster access and better performance.
• Offer decentralized and fault tolerant characteristics (when one instance goes offline, other instances of the service can still be discovered).
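The replica selection described on this slide can be sketched as a simple minimization. This is only an illustration under stated assumptions: the `Replica` class, the latency fields and the additive scoring are hypothetical and are not taken from the Data Location Service itself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Replica:
    """One copy of a dataset, with latencies measured from the compute site."""
    site: str
    network_latency_ms: float   # hypothetical value from network monitoring
    storage_latency_ms: float   # hypothetical storage access latency


def best_replica(replicas: Iterable[Replica]) -> Replica:
    """Pick the replica with the lowest combined access latency.

    Assumption: network and storage latencies are simply added.
    """
    return min(replicas, key=lambda r: r.network_latency_ms + r.storage_latency_ms)


# Hypothetical replicas of the same dataset.
replicas = [
    Replica("site_a", network_latency_ms=120.0, storage_latency_ms=10.0),
    Replica("site_b", network_latency_ms=40.0, storage_latency_ms=60.0),
    Replica("site_c", network_latency_ms=30.0, storage_latency_ms=35.0),
]
print(best_replica(replicas).site)
```

A real service would presumably also weigh bandwidth and site load; the additive score is kept deliberately simple.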
Interaction of the DIANA Scheduler with the Data Location Service
Multi level queue and Bulk scheduling
Multilevel feedback queues and Bulk Scheduling
Bulk Scheduling Algorithm (1)
• A priority driven multi-queue feedback based approach
• Job priority is important while scheduling the bulk jobs
• An aging technique is used to overcome starvation
• Bulk jobs in a single burst will be submitted at a single site
• Not a pre-emptive scheduling approach
• No ‘Round Robin’ approach inside queues
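The combination of priority queues, aging and non-pre-emptive selection can be sketched as follows. This is a minimal illustration, not the DIANA algorithm: the linear aging rule, the `aging_rate` value, the queue thresholds and all names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Job:
    job_id: str
    base_priority: float
    submitted_at: float  # seconds


def effective_priority(job: Job, now: float, aging_rate: float) -> float:
    """Assumed aging rule: priority grows linearly with waiting time."""
    return job.base_priority + aging_rate * (now - job.submitted_at)


def queue_level(priority: float, thresholds: Sequence[float] = (5.0, 2.0)) -> int:
    """Map a priority to a queue: 0 is the highest queue, len(thresholds) the lowest."""
    for level, threshold in enumerate(thresholds):
        if priority >= threshold:
            return level
    return len(thresholds)


def next_job(jobs: Sequence[Job], now: float, aging_rate: float = 0.1) -> Job:
    """Choose the job to start when a slot becomes free (running jobs are never pre-empted).

    Jobs are ordered by queue level, then first-come-first-served inside a
    queue (no round robin).
    """
    return min(
        jobs,
        key=lambda j: (queue_level(effective_priority(j, now, aging_rate)), j.submitted_at),
    )


# Hypothetical jobs: an old low-priority job and a recent higher-priority one.
waiting = [Job("old_low", 1.0, submitted_at=0.0), Job("new_high", 3.0, submitted_at=50.0)]
print(next_job(waiting, now=60.0).job_id)
```

With aging switched off (`aging_rate=0.0`), the same call would always prefer the higher base priority, which is how a low-priority job could starve.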
Bulk Scheduling Algorithm (2)
• Job migration between priority queues
• Three types of priorities: user, quota and system centric
• Little’s Formula for job migration (N = R*W)
• While migrating the jobs, the scheduler queries all the sites for their average load, and the one with the minimum cost is selected
• No migration if the data to be transferred is too large or the speed of the network connections is too low
• Time threshold (with an increasing number of jobs, the priority of jobs from a user starts to decrease)
• Job threshold (the more time a job has to wait, the more its priority continues to increase)
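Little's Formula relates the average number of jobs in a system (N) to the average arrival rate (R) and the average time a job spends in the system (W). The sketch below shows one way it could feed a migration decision; the capacity check, the data-size and bandwidth limits, and all names and numbers are hypothetical assumptions, since the slides do not give the exact rule.

```python
from typing import Dict, Optional


def expected_jobs_in_system(arrival_rate: float, mean_time_in_system: float) -> float:
    """Little's Formula: N = R * W."""
    return arrival_rate * mean_time_in_system


def pick_migration_target(
    site_costs: Dict[str, float],
    data_size_gb: float,
    bandwidth_gbps: float,
    max_data_gb: float = 100.0,        # assumed limit, not from the slides
    min_bandwidth_gbps: float = 1.0,   # assumed limit, not from the slides
) -> Optional[str]:
    """Return the minimum-cost site, or None when migration is not worthwhile."""
    if data_size_gb > max_data_gb or bandwidth_gbps < min_bandwidth_gbps:
        return None  # data too large or network too slow: do not migrate
    return min(site_costs, key=site_costs.get)


# Hypothetical site: 2 jobs arrive per minute and each spends 30 minutes there.
n_jobs = expected_jobs_in_system(arrival_rate=2.0, mean_time_in_system=30.0)
site_capacity = 40  # assumed number of jobs the site handles comfortably

if n_jobs > site_capacity:
    target = pick_migration_target(
        {"site_b": 6.5, "site_c": 8.0}, data_size_gb=20.0, bandwidth_gbps=10.0
    )
    print(f"expected jobs: {n_jobs:.0f}, migrate to: {target}")
```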
Priority with Time and Job Frequency
Queue time versus number of jobs
Execution time versus number of jobs
Conclusions
• Network characteristics are directly linked with scheduling decisions for an optimized Grid
• Data location can significantly influence the scheduling decisions
• It is not always justified to send jobs towards the data
• Queue length should also be considered while placing the Grid jobs
• Priority oriented approaches for bulk scheduling can improve the resource usage
• Greedy schedules can create starvation, so the global cost of a job placement should always be considered
• A multi-criteria based strategy for job scheduling is key to better Grid throughput
• Job migration to under-utilized resources can reduce the average execution time of all the jobs submitted to the Grid