A Survey of Distributed Task Schedulers
Kei Takahashi

A Survey of Distributed Task Schedulers, Kei Takahashi (M1)

What do you want to do on a grid?
• Vast computing resources
  • Calculation power
  • Memory
  • Data storage
• Large-scale computation
  • Numerical simulations
  • Statistical analyses
  • Data mining
  … for everyone

Grid Applications
• For some applications, developing parallel algorithms is unavoidable
  • Dedicated to parallel environments
  • E.g. matrix computations
• However, many applications are efficiently sped up by simply running multiple serial programs in parallel
  • E.g. many data-intensive applications

Grid Schedulers
• A system which distributes many serial tasks onto the grid environment
  • Task assignments
  • File transfers
• A user need not rewrite serial programs to execute them in parallel
• Some constraints need to be considered
  • Machine availability
  • Machine spec (CPU/memory/HDD) and load
  • Data location
  • Task priority

An Example of Scheduling
• Each task is assigned to a machine
(Figure: a scheduler assigns task t0 (heavy) and tasks t1, t2 (light) to machines A (fast) and B (slow); of the two candidate assignments, one yields the shorter processing time.)

Efficient Scheduling
• Task scheduling in a heterogeneous environment is not a new problem; some heuristics have already been proposed
• However, existing algorithms cannot appropriately handle some situations
  • Data-intensive applications
  • Workflows

Data-Intensive Applications
• A computation using large data: some gigabytes to petabytes
• A scheduler needs to consider the following:
  • File transfers need to be reduced
  • Data replicas should be placed effectively
  • Unused intermediate files should be cleared

An Example of Scheduling
• Each task is assigned to a machine
(Figure: task t0 (heavy) requires the large file f0; tasks t1 and t2 (light) require the small file f1; on machines A (fast) and B (slow), the assignment that avoids transferring the large file yields the shorter processing time.)

Workflow
• A set of tasks with dependencies
  • Data dependencies between some tasks
  • Expressed by a DAG
(Figure: an example DAG: Corpus → Parsed Corpus → Phrases (by words) → Co-occurrence analysis)
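The DAG above can be sketched in code. A minimal sketch, assuming each task is keyed by a (hypothetical) name and maps to the set of tasks it depends on; a scheduler may only start a task once everything it depends on has finished:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# The corpus-processing workflow from this slide, as a dependency map:
# task -> set of tasks it depends on.
workflow = {
    "corpus":          set(),
    "parse_corpus":    {"corpus"},
    "extract_phrases": {"parse_corpus"},
    "cooccurrence":    {"extract_phrases"},
}

# One valid execution order that respects every dependency.
order = list(TopologicalSorter(workflow).static_order())
print(order)  # ['corpus', 'parse_corpus', 'extract_phrases', 'cooccurrence']
```

Because only the necessary dependencies are recorded, independent branches of a real workflow can run on different nodes in parallel.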

Workflow (cont.)
• Workflows are suitable for expressing some grid applications
  • Only the necessary dependencies are described by a workflow
  • A scheduler can adaptively map tasks onto the real node environment
• More factors to consider
  • Some tasks matter more than others for shortening the overall makespan

Agenda
• Introduction
• Basic Scheduling Algorithms
  • Some heuristics
• Data-Intensive/Workflow Schedulers
• Conclusion

Basic Scheduling Heuristics
• Given information:
  • ETC (expected completion time) for each pair of a node and a task, including data transfer cost
  • No congestion is assumed
• Aim: minimizing the makespan (total processing time)
[1] Tracy et al., "A Comparison Study of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems" (TR-ECE 00-04)

An Example of ETC
• ETC of (task, node) = (node available time) + (data transfer time) + (task process time)

             Available after   Transfer   Process   ETC
    Node A   200 sec           10 sec     100 sec   310 sec
    Node B   0 sec             0 sec      100 sec   100 sec
    Node C   0 sec             20 sec     100 sec   120 sec

(Figure: the 1 GB data file resides at node B; node A is connected at 1 Gbps and node C at 100 Mbps, so B itself pays no transfer cost.)
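The ETC formula above is just a sum per candidate node; the scheduler then prefers the node with the smallest value. A minimal sketch using the (reconstructed) numbers from this slide:

```python
def etc(available, transfer, process):
    """Expected completion time of a (task, node) pair: the sum defined above."""
    return available + transfer + process

# Values in seconds; node B is assumed to already hold the data,
# so its transfer cost is zero.
etcs = {
    "A": etc(available=200, transfer=10, process=100),  # 310 sec
    "B": etc(available=0,   transfer=0,  process=100),  # 100 sec
    "C": etc(available=0,   transfer=20, process=100),  # 120 sec
}
best_node = min(etcs, key=etcs.get)
print(best_node)  # 'B': the smallest expected completion time wins
```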

Scheduling Algorithms
• An ETC matrix is given
• When a task is assigned to a node, the ETC matrix is updated
• An ETC matrix is consistent if, whenever node M0 can process one task faster than node M1, M0 can process every other task faster than M1 as well
• The makespans of inconsistent ETC matrices vary more than those of consistent ones

Example: after assigning Task 0 to node A, node A's remaining entries shift by Task 0's time:

              Node A          Node B   Node C
    Task 0    8 (assigned)    1        5
    Task 1    6 → 14          9        8
    Task 2    2 → 10          3        4

Greedy Approaches
• Principles
  • Assign one task to the best node at a time
  • Need to decide the order of tasks
• Scheduling priority
  • Min-min: light tasks first
  • Max-min: heavy tasks first
  • Sufferage: the task whose completion time differs most depending on the node

Max-min / Min-min
• Calculate completion times for each pair of task and node
• For each task, take the minimum completion time
• Take one task from the unscheduled tasks
  • Min-min: choose the task whose minimum completion time is smallest ("min" value)
  • Max-min: choose the task whose minimum completion time is largest ("max" value)
• Schedule the task to the best node

              Task 0   Task 1   Task 2
    Node A    8        6        2
    Node B    1        9        3
    Node C    5        8        4

In this example, Min-min schedules Task 0 first (best completion time 1, on node B); Max-min schedules Task 1 first (best completion time 6, on node A).
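The loop above can be sketched as one function covering both heuristics. This is a hypothetical sketch, assuming the ETC matrix arrives as a nested dict `etc[task][node]` and that data transfer cost is already folded into each entry, as the slides state:

```python
def greedy_schedule(etc, tasks, nodes, pick=min):
    """Min-min (pick=min) or Max-min (pick=max) over an ETC matrix.

    Each round finds every unscheduled task's best completion time,
    picks one task by that value, assigns it to its best node, and
    pushes that node's availability forward.
    """
    available = {n: 0 for n in nodes}   # when each node becomes free
    schedule = {}
    unscheduled = list(tasks)
    while unscheduled:
        # (best completion time, best node) per unscheduled task
        best = {t: min((available[n] + etc[t][n], n) for n in nodes)
                for t in unscheduled}
        t = pick(unscheduled, key=lambda t: best[t][0])
        ct, n = best[t]
        schedule[t] = n
        available[n] = ct
        unscheduled.remove(t)
    return schedule

# The 3x3 example matrix from this slide:
etc = {"t0": {"A": 8, "B": 1, "C": 5},
       "t1": {"A": 6, "B": 9, "C": 8},
       "t2": {"A": 2, "B": 3, "C": 4}}
min_min = greedy_schedule(etc, ["t0", "t1", "t2"], ["A", "B", "C"])
max_min = greedy_schedule(etc, ["t0", "t1", "t2"], ["A", "B", "C"], pick=max)
```

On this matrix, Min-min assigns t0 to node B first, while Max-min starts with the heavy t1 on node A.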

Sufferage
• For each task, calculate the sufferage: the difference between the minimum and second-minimum completion times
• Take the task which has the maximum sufferage
• Schedule the task to the best node

                Task 0       Task 1       Task 2
    Node A      8            6            2
    Node B      1            9            3
    Node C      5            8            4
    Sufferage   4 (5 - 1)    2 (8 - 6)    1 (3 - 2)
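The sufferage values in the table reduce to a one-line calculation per task. A minimal sketch over the slide's matrix (task and node names as in the example):

```python
def sufferage(times):
    """Gap between the minimum and second-minimum completion times."""
    lo, second = sorted(times)[:2]
    return second - lo

# Completion times over nodes A, B, C, from the slide's matrix:
tasks = {"t0": [8, 1, 5], "t1": [6, 9, 8], "t2": [2, 3, 4]}
gaps = {t: sufferage(v) for t, v in tasks.items()}  # t0: 4, t1: 2, t2: 1
first = max(gaps, key=gaps.get)
print(first)  # 't0': it suffers most if it misses its best node
```

The intuition: t0 completes in 1 on node B but only in 5 anywhere else, so giving its best node away is the costliest mistake; scheduling it first avoids that.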

Comparing Scheduling Heuristics
• A simulation was done to compare several scheduling tactics [1]
  • Greedy (Max-min / Min-min)
  • GA, simulated annealing, A*, etc.
• ETC matrices were randomly generated
  • 512 tasks, 8 nodes
  • Consistent and inconsistent
• GA produced the shortest makespan in most cases; however, its calculation cost was not negligible
• The Min-min heuristic performed well (at most 10% worse than the best)

(Agenda)
• Introduction
• Scheduling Algorithms
• Data-Intensive/Workflow Schedulers
  • GrADS
  • Phan's approach
• Conclusion

Scheduling Workflows
• Additional conditions to be considered
• Task dependencies
  • Every required file needs to be transferred to the node before the task starts
  • "Non-executable" schedules exist
• Data are generated dynamically
  • The file locations are not known in advance
  • Intermediate files are not needed in the end

GrADS [1]
• Execution time estimation
  • Profile the application behavior
    • CPU/memory usage
    • Data transfer cost
• Greedy scheduling heuristics
  • Create an ETC matrix for the assignable tasks
  • After assigning a task, some tasks become "assignable"
  • Choose the best schedule among Max-min, Min-min, and Sufferage
[1] Mandal et al., "Scheduling Strategies for Mapping Application Workflows onto the Grid", in IEEE International Symposium on High Performance Distributed Computing (HPDC 2005)
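The "assignable tasks" loop can be sketched as follows. This is a hypothetical sketch in the spirit of the GrADS approach, not its actual implementation: it fixes Min-min as the inner heuristic (the real system compares several) and assumes the DAG maps each task to the tasks it depends on:

```python
def schedule_workflow(dag, etc, nodes):
    """Greedy workflow scheduling: only tasks whose dependencies are all
    scheduled are 'assignable'; each round assigns one of them Min-min
    style, which may release new assignable tasks."""
    pending = {t: set(deps) for t, deps in dag.items()}
    available = {n: 0 for n in nodes}
    assignable = {t for t, deps in pending.items() if not deps}
    schedule = {}
    while assignable:
        # Min-min over the assignable set (Max-min or Sufferage also fit here)
        t, n = min(((t, n) for t in assignable for n in nodes),
                   key=lambda tn: available[tn[1]] + etc[tn[0]][tn[1]])
        schedule[t] = n
        available[n] += etc[t][n]
        assignable.discard(t)
        # releasing t may make its successors assignable
        for succ, deps in pending.items():
            deps.discard(t)
            if not deps and succ not in schedule:
                assignable.add(succ)
    return schedule

# A small diamond workflow: b and c depend on a, d depends on both.
dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
etc = {t: {"n0": 4, "n1": 5} for t in dag}
plan = schedule_workflow(dag, etc, ["n0", "n1"])
```

Note that "every required file is transferred before the task starts" is implicit here: because a task becomes assignable only after its predecessors are scheduled, the ETC entries can already account for fetching their outputs.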

GrADS (cont.)
• An experiment was done on real tasks
• The original data (2 GB) was replicated to every cluster in advance
  • File transfers occur within clusters
• Compared to a random scheduler, it achieved a 1.5 to 2.2 times better makespan

Scheduling Data-Intensive Applications [1]
• Co-scheduling task assignments and data replication
• Using a GA
  • A gene contains the following:
    • Task order in the global schedule
    • Assignment of tasks to nodes
    • Assignment of replicas to nodes
• Only part of the tasks are scheduled at a time
  • Otherwise the GA takes too long
[1] Phan et al., "Evolving toward the perfect schedule: Co-scheduling task assignments and data replication in wide-area systems using a genetic algorithm", in Proceedings of the 11th Workshop on Job Scheduling Strategies for Parallel Processing, Cambridge, MA. Springer-Verlag, Berlin, Germany.

(cont.)
• An example of the gene: one schedule is expressed by the gene
  • Task order: t0, t1, t4, t3, t2
  • Task assignment: t0→n0, t1→n1, t2→n0, t3→n1, t4→n0
  • Replicas: d0→n0, d1→n1, d2→n0
(Figure: the resulting schedules of the five tasks on nodes n0 and n1)
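The three-part gene above can be sketched as a plain data structure. A hypothetical encoding, not the paper's actual one, just mirroring the slide's example:

```python
import random

def random_gene(tasks, data, nodes, rng):
    """One GA individual: a task order plus task-to-node and
    replica-to-node assignments (hypothetical encoding)."""
    return {
        "order":    rng.sample(tasks, len(tasks)),          # global task order
        "tasks":    {t: rng.choice(nodes) for t in tasks},  # task placement
        "replicas": {d: rng.choice(nodes) for d in data},   # replica placement
    }

rng = random.Random(0)  # seeded for reproducibility
gene = random_gene(["t0", "t1", "t2", "t3", "t4"],
                   ["d0", "d1", "d2"], ["n0", "n1"], rng)
# The GA evaluates each gene's makespan as its fitness, then applies
# crossover and mutation to all three parts jointly, so task placement
# and replica placement evolve together rather than being decided separately.
```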

(cont.)
• A simulation was performed
  • Compared to the Min-min heuristic with randomly distributed replicas
  • The number of GA generations was fixed (100)
    • The GA had not reached the best solution
• When 40 tasks are scheduled at a time, the GA achieves a makespan twice as good
• However, the difference decreases when more tasks are scheduled at a time
(Figure: makespan when 40, 80, or 160 tasks are scheduled at a time)

Conclusion
• Several scheduling heuristics were introduced
  • Greedy (Min-min, Max-min, Sufferage)
• GrADS can schedule workflows by predicting node performance and using greedy heuristics
• One study used a GA to co-schedule task assignments and data replication

Future Work
• Most of the research is still based on simulation
  • It is hard to predict program/network behavior
• A scheduler will be implemented
  • Using network topology information
  • Managing intermediate files
  • Easy to install and execute