Presentation of Licentiate Thesis: Scheduling and Optimization of Fault-Tolerant Embedded Systems
- Slides: 56
Presentation of Licentiate Thesis: Scheduling and Optimization of Fault-Tolerant Embedded Systems
Viacheslav Izosimov
Embedded Systems Lab (ESLAB), Linköping University, Sweden
Motivation
§ Hard real-time applications
  § Time-constrained
  § Cost-constrained
  § Fault-tolerant
  § etc.
§ Focus on transient faults and intermittent faults
Motivation: Transient faults
§ Happen for a short time
§ Corruption of data, miscalculation in logic
§ Do not cause permanent damage to circuits
§ Causes are outside system boundaries: electromagnetic interference (EMI), radiation, lightning storms
Motivation: Intermittent faults
§ Manifest similarly to transient faults
§ Happen repeatedly
§ Causes are inside system boundaries: internal EMI, crosstalk, Init (Data), power supply fluctuations, software errors (Heisenbugs)
Motivation
§ Transient faults are more likely to occur as transistor sizes shrink and operating frequencies grow
§ Errors caused by transient faults have to be tolerated before they crash the system
§ However, fault tolerance against transient faults leads to significant performance overhead
Motivation
§ Hard real-time applications
  § Time-constrained
  § Cost-constrained
  § Fault-tolerant
  § etc.
→ The Need for Design Optimization of Embedded Systems with Fault Tolerance
Outline
§ Motivation
→ Background and limitations of previous work
§ Thesis contributions:
  § Scheduling with fault tolerance requirements
  § Fault tolerance policy assignment
  § Checkpoint optimization
  § Trading-off transparency for performance
  § Mapping optimization with transparency
§ Conclusions and future work
General Design Flow
(Figure: System Specification → Architecture Selection → Mapping & Hardware/Software Partitioning → Fault Tolerance Techniques → Scheduling → Back-end Synthesis, with feedback loops between the stages.)
Fault Tolerance Techniques
§ Re-execution (error-detection overhead, recovery overhead)
§ Rollback recovery with checkpointing (checkpointing overhead)
§ Active replication
(Figure: timelines for process P1 on node N1 under each technique; with active replication, replicas P1(1) and P1(2) run on N1 and N2.)
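The three techniques can be compared with a simple worst-case timing model. The sketch below is an illustration, not the model used in the thesis: it assumes a process with worst-case execution time `C`, error-detection overhead `alpha`, recovery overhead `mu`, checkpointing overhead `chi`, at most `k` faults, and `n` equally sized checkpointed segments. All function names and numeric values are hypothetical.

```python
def reexecution_wcet(C, alpha, mu, k):
    """Worst case with re-execution: one normal run plus k full re-runs.

    Assumption: every run pays the error-detection overhead alpha and
    every recovery pays mu before the process is restarted.
    """
    return (C + alpha) + k * (mu + C + alpha)


def checkpointing_wcet(C, alpha, mu, chi, k, n):
    """Worst case with rollback recovery and n equal execution segments.

    Assumption: each segment pays alpha + chi in the fault-free run, and
    each fault repeats only one segment (C / n) plus alpha and mu.
    """
    fault_free = C + n * (alpha + chi)
    per_fault = C / n + alpha + mu
    return fault_free + k * per_fault


def replication_wcet(C, alpha):
    """Worst case with active replication on k + 1 separate nodes.

    Assumption: replicas run in parallel, so time does not grow with k;
    the cost is paid in extra nodes instead.
    """
    return C + alpha


# Hypothetical values in milliseconds.
C, alpha, mu, chi, k = 60, 10, 10, 5, 2
print(reexecution_wcet(C, alpha, mu, k))           # time-redundant, one node
print(checkpointing_wcet(C, alpha, mu, chi, k, 2)) # shorter recovery per fault
print(replication_wcet(C, alpha))                  # space-redundant, k + 1 nodes
```

Under these assumptions, re-execution and checkpointing trade time for fault tolerance on a single node, while replication trades hardware for time.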
Limitations of Previous Work
§ Design optimization with fault tolerance is limited
§ Process mapping is not considered together with fault tolerance issues
§ Multiple faults are not addressed in the framework of static cyclic scheduling
§ Transparency, if addressed at all, is restricted to a whole computation node
Outline
§ Motivation
§ Background and limitations of previous work
→ Thesis contributions:
  § Scheduling with fault tolerance requirements
  § Fault tolerance policy assignment
  § Checkpoint optimization
  § Trading-off transparency for performance
  § Mapping optimization with transparency
§ Conclusions and future work
Fault-Tolerant Time-Triggered Systems
§ Transient faults
§ Processes: re-execution, active replication, rollback recovery with checkpointing
§ Messages: fault-tolerant predictable protocol
§ Maximum k transient faults within each application run (system period)
(Figure: application graph with processes P1 to P5 and messages m1, m2.)
Scheduling with Fault Tolerance Requirements
§ Conditional scheduling
§ Shifting-based scheduling
Conditional Scheduling
(Figure, built up over six animation steps: conditional schedule for P1 → m1 → P2 with k = 2 on a 0 to 200 time axis, showing the re-executions P1/1 to P1/3 and P2/1 to P2/3. The accompanying schedule table, with rows P1 and P2 and a first column "true", is filled in step by step with the start times 0, 40, 45, 90, 130, 85, 140, 95 and 150.)
Fault-Tolerance Conditional Process Graph
(Figure: FT-CPG used by conditional scheduling for P1 → m1 → P2 with k = 2. It contains the copies P1^1 to P1^3, m1^1 to m1^3 and P2^1 to P2^6.)
Conditional Schedule Table
(Table for k = 2, P1 on N1 and P2 on N2; columns are fault conditions, starting with "true". Recoverable start times: P1: 0, 45, 90; m1: 40, 85, 130; P2: 50, 95, 105, 140, 150, 160.)
Conditional Scheduling
§ Conditional scheduling:
  + Generates short schedules
  + Allows trading off transparency against performance (to be discussed later...)
  – Requires a lot of memory to store schedule tables
  – Scheduling algorithm is very slow
§ Alternative: shifting-based scheduling
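One way to see why conditional schedule tables can need a lot of memory is to count fault scenarios. The sketch below is a rough illustration under a simplifying assumption that is not taken from the slides: a scenario is identified only by how many of the at most `k` faults hit each process. A real schedule table may be smaller, because many scenarios share entries; the function name is hypothetical.

```python
from math import comb


def fault_scenarios(n_processes, k):
    """Count the ways to distribute at most k faults over n processes.

    Distributing exactly f indistinguishable faults over n processes is a
    "stars and bars" problem with comb(n + f - 1, f) solutions; summing
    over f = 0..k covers every scenario, including the fault-free one.
    """
    return sum(comb(n_processes + f - 1, f) for f in range(k + 1))


# The count grows quickly with both the application size and k.
for n in (2, 20, 40):
    print(n, [fault_scenarios(n, k) for k in (1, 2, 3, 4)])
```

A scheduler that stores a separate activation time per scenario inherits this growth, which is the motivation for a cheaper alternative.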
Shifting-based Scheduling
§ Messages sent over the bus should be scheduled at one time
§ Faults on one computation node must not affect other computation nodes
  + Requires less memory
  + Schedule generation is very fast
  – Schedules are longer
  – Does not allow trading off transparency against performance (to be discussed later...)
Ordered FT-CPG
(Figure: ordered FT-CPG for k = 2 with processes P1 to P4, messages m1 to m3 and synchronization nodes S, under the ordering constraints "P2 after P1" and "P3 after P4".)
Root Schedules
(Figure: root schedule on N1, N2 and the bus for processes P1 to P4 and messages m1 to m3, showing the recovery slack for P1 and P2 and the worst-case scenario for P1.)
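The idea of a recovery slack shared by several processes can be sketched as follows. This is a simplified single-node model and an assumption for illustration, not the scheduling algorithm of the thesis: each process with worst-case execution time `C` needs `k * (C + mu)` of slack to recover from `k` faults, and processes scheduled back to back on one node share a single slack sized for the most demanding of them. Function names and values are hypothetical.

```python
def recovery_slack(C, mu, k):
    """Slack one process needs to absorb k faults (k re-executions)."""
    return k * (C + mu)


def root_schedule_length(wcets, mu, k):
    """Length of a one-node root schedule with a shared recovery slack.

    Since at most k faults occur in total, the worst case is that all of
    them hit the process with the largest slack, so one shared slack of
    that size is enough for the whole node.
    """
    return sum(wcets) + max(recovery_slack(C, mu, k) for C in wcets)


def unshared_schedule_length(wcets, mu, k):
    """Same node, but with a private slack reserved after every process."""
    return sum(C + recovery_slack(C, mu, k) for C in wcets)


wcets = [30, 20]  # hypothetical WCETs in ms
print(root_schedule_length(wcets, mu=5, k=2))
print(unshared_schedule_length(wcets, mu=5, k=2))
```

Sharing the slack shortens the root schedule, and at run time later processes are simply shifted by the recovery time actually consumed.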
Extracting Execution Scenarios
(Figure: execution scenario extracted from the root schedule on N1, N2 and the bus, in which P4 is executed three times: P4/1, P4/2, P4/3.)
Memory Required to Store Schedule Tables
Frozen   20 proc.            40 proc.             60 proc.              80 proc.
         k=1   k=2   k=3     k=1   k=2   k=3      k=1   k=2    k=3      k=1   k=2    k=3
100%     0.13  0.28  0.54    0.36  0.89  1.73     0.71  2.09   4.35     1.18  4.21   8.75
75%      0.22  0.57  1.37    0.62  2.06  4.96     1.20  4.64   11.55    2.01  8.40   21.11
50%      0.28  0.82  1.94    0.82  3.11  8.09     1.53  7.09   18.28    2.59  12.21  34.46
25%      0.34  1.17  2.95    1.03  4.34  12.56    1.92  10.00  28.31    3.05  17.30  51.30
0%       0.39  1.42  3.74    1.17  5.61  16.72    2.16  11.72  34.62    3.41  19.28  61.85
→ Applications with more frozen nodes require less memory
Memory Required to Store Root Schedules
(Table: columns for 20, 40, 60 and 80 processes with k = 1, 2, 3. Entries recoverable for the 100% row: 0.016, 0.034, 0.03, 0.054, 0.070, 1.73.)
→ Shifting-based scheduling requires very little memory
Schedule Generation Time and Quality
§ Shifting-based scheduling requires 0.2 seconds to generate a root schedule for an application of 120 processes and 10 faults
§ Conditional scheduling already takes 319 seconds to generate a schedule table for an application of 40 processes and 4 faults
→ Shifting-based scheduling is much faster than conditional scheduling
→ Shifting-based scheduling is ~15% worse than conditional scheduling with 100% of inter-processor messages set to frozen (in terms of fault tolerance overhead)
Fault Tolerance Policy Assignment
Checkpoint Optimization
Fault Tolerance Policy Assignment
(Figure: three policies for process P1: re-execution (P1/1, P1/2, P1/3 on N1), replication (P1(1), P1(2), P1(3) on N1, N2, N3), and re-executed replicas (P1(1)/1 and P1(1)/2 on N1, P1(2) on N2).)
Re-execution vs. Replication
(Figure: two applications, A1 and A2, each with processes P1, P2, P3, scheduled on N1, N2 and the bus. In one case re-execution meets the deadline while replication misses it ("Re-execution is better"); in the other, replication meets the deadline while re-execution misses it ("Replication is better").)
Worst-case execution times:
      N1   N2
P1    40   50
P2    40   50
P3    60   70
Fault Tolerance Policy Assignment
(Figure: application with processes P1 to P4 and messages m1 to m3 scheduled on N1, N2 and the bus under different policy assignments; some assignments miss the deadline, and the optimized fault tolerance policy assignment meets it.)
Worst-case execution times:
      P1   P2   P3   P4
N1    40   60   60   40
N2    50   80   80   50
Optimization Strategy
§ Design optimization (tabu search):
  § Fault tolerance policy assignment
  § Mapping of processes and messages
§ Root schedules: shifting-based scheduling
§ Three tabu-search optimization algorithms:
  1. Mapping and Fault Tolerance Policy assignment (MRX): re-execution, replication or both
  2. Mapping and only Re-Execution (MX)
  3. Mapping and only Replication (MR)
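A generic tabu search over mapping and policy decisions might look like the sketch below. It is a textbook-style illustration and not the MRX, MX or MR algorithm from the thesis: the cost function is a toy schedule-length estimate for two nodes and `K = 1`, and every name and value in it is hypothetical.

```python
from itertools import product

WCET = [40, 40, 60]  # hypothetical worst-case execution times (ms)
MU = 10              # hypothetical recovery overhead (ms)
K = 1                # tolerated faults; replication then uses K + 1 = 2 replicas
# A decision per process: (node, policy), "X" = re-execution, "R" = replication.
OPTIONS = [(node, policy) for node in (0, 1) for policy in ("X", "R")]


def cost(solution):
    """Toy estimate: longest node, counting load plus shared recovery slack."""
    load, slack = [0, 0], [0, 0]
    for c, (node, policy) in zip(WCET, solution):
        if policy == "R":
            # One replica on each node; the node field is ignored here.
            load[0] += c
            load[1] += c
        else:
            load[node] += c
            slack[node] = max(slack[node], K * (c + MU))
    return max(l + s for l, s in zip(load, slack))


def neighbours(solution):
    """All solutions that differ in the decision for exactly one process."""
    for i, current in enumerate(solution):
        for option in OPTIONS:
            if option != current:
                yield solution[:i] + (option,) + solution[i + 1:]


def tabu_search(initial, iterations=30, tenure=5):
    current = best = initial
    tabu = [initial]  # recently visited solutions are forbidden
    for _ in range(iterations):
        candidates = [s for s in neighbours(current) if s not in tabu]
        if not candidates:
            break
        # Take the best non-tabu move even if it is worse than the current
        # solution; this is what lets the search leave local optima.
        current = min(candidates, key=cost)
        tabu.append(current)
        if len(tabu) > tenure:
            tabu.pop(0)
        if cost(current) < cost(best):
            best = current
    return best


start = ((0, "X"),) * len(WCET)  # everything re-executed on node 0
best = tabu_search(start)
optimum = min(cost(s) for s in product(OPTIONS, repeat=len(WCET)))
print(cost(start), cost(best), optimum)
```

Restricting `OPTIONS` to only `"X"` or only `"R"` would give toy counterparts of the re-execution-only and replication-only variants.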
Experimental Results
Schedulability improvement under resource constraints
(Plot: average % deviation from MRX, on a 0 to 100 scale, versus number of processes (20 to 100); curves for mapping and replication (MR), mapping and re-execution (MX), and mapping and policy assignment (MRX).)
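Checkpoint Optimization
(Figure: process P1 on node N1 divided into execution segments by checkpoints, with the re-executions P1/1 and P1/2 of a segment after a fault.)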
Locally Optimal Number of Checkpoints
(Figure: process P1 with C1 = 50 ms and k = 2, shown with 1 to 5 checkpoints; the overheads given on the slide are 5 ms, 10 ms and 15 ms.)
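The locally optimal number of checkpoints for a single process can be found by a direct sweep. The sketch below assumes a simple worst-case model that is not confirmed by the slides, and it assumes that the 5, 10 and 15 ms values are the checkpointing overhead `chi`, the error-detection overhead `alpha` and the recovery overhead `mu` respectively; the symbols did not survive on the slide, so this assignment is a guess. Function names are hypothetical.

```python
from math import sqrt


def worst_case_time(C, alpha, mu, chi, k, n):
    """Worst-case time of one process with n checkpoints and k faults.

    Assumed model: the fault-free run pays alpha + chi per segment, and
    each fault re-executes one segment of length C / n plus alpha and mu.
    """
    return C + n * (alpha + chi) + k * (C / n + alpha + mu)


def locally_optimal_checkpoints(C, alpha, mu, chi, k, max_n=20):
    """Number of checkpoints in 1..max_n that minimizes the worst case."""
    return min(range(1, max_n + 1),
               key=lambda n: worst_case_time(C, alpha, mu, chi, k, n))


# C1 = 50 ms and k = 2 are from the slide; the overhead assignment is assumed.
C, alpha, mu, chi, k = 50, 10, 15, 5, 2
for n in range(1, 6):
    print(n, round(worst_case_time(C, alpha, mu, chi, k, n), 2))
print("best n:", locally_optimal_checkpoints(C, alpha, mu, chi, k))
# Under this model the continuous minimum lies at sqrt(k * C / (alpha + chi)).
print("continuous optimum:", round(sqrt(k * C / (alpha + chi)), 2))
```

More checkpoints shorten each recovery but add overhead to the fault-free run, so the worst-case time first falls and then rises with `n`.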
Globally Optimal Number of Checkpoints
(Figure: application P1 → m1 → P2 with k = 2, C1 = 50 ms and C2 = 60 ms, and overhead values 10, 5 and 10 for P1 and P2. Two checkpoint distributions, a) and b), are compared; the resulting schedule lengths are 265 and 255.)
Global Optimization vs. Local Optimization
§ Does the optimization reduce the fault tolerance overheads on the schedule length?
(Plot for 4 nodes and 3 faults: % deviation from MC0, i.e. how much smaller the fault tolerance overhead is, on a 0% to 40% scale, versus application size (number of tasks, 40 to 100); curves for global optimization of checkpoint distribution (MC) and local optimization of checkpoint distribution (MC0).)
Trading-off Transparency for Performance
Mapping Optimization with Transparency
FT Implementations with Transparency
§ Transparency is achieved with frozen processes and messages
§ Good for debugging and testing
(Figure: application graph with processes P1 to P5 and messages m1, m2; the legend distinguishes regular processes/messages from frozen processes/messages.)
No Transparency
§ Processes start at different times
§ Messages are sent at different times
(Figure: no-fault scenario and worst-case fault scenario for processes P1 to P4 and messages m1 to m3 on N1, N2 and the bus, against the deadline; k = 2, 5 ms overhead.)
Worst-case execution times (X = cannot be mapped on that node):
      P1   P2   P3   P4
N1    30   20   X    X
N2    X    X    20   30
Customized Transparency
(Figure: no-fault-scenario schedules for processes P1 to P4 and messages m1 to m3 against the deadline under no transparency, full transparency and customized transparency.)
Trading-Off Transparency for Performance
§ How much longer is the schedule length with fault tolerance?
§ Four (4) computation nodes, recovery time 5 ms
Increasing transparency →
Proc.   0%              25%             50%             75%             100%
        k=1  k=2  k=3   k=1  k=2  k=3   k=1  k=2  k=3   k=1  k=2  k=3   k=1  k=2  k=3
20      24   44   63    32   60   92    39   74   115   48   83   133   48   86   139
40      17   29   43    20   40   58    28   49   72    34   60   90    39   66   97
60      12   24   34    13   30   43    19   39   58    28   54   79    32   58   86
80      8    16   22    10   18   29    14   27   39    24   41   66    27   43   73
→ Trading transparency for performance is essential
Mapping with Transparency
(Figure: the optimal mapping without transparency on N1, N2 and the bus, and the worst-case fault scenario for that mapping, in which P4 is executed three times (P4/1 to P4/3), both against the deadline. Application: processes P1 to P6 with messages m1 to m4; k = 2, 10 ms overhead.)
Worst-case execution times:
      P1   P2   P3   P4   P5   P6
N1    30   40   50   60   40   50
N2    30   40   50   60   40   50
Mapping with Transparency
(Figure: the worst-case fault scenario with transparency for the "optimal" mapping, in which P2 is executed three times (P2/1 to P2/3), and the worst-case fault scenario with transparency and optimized mapping, in which P4 is executed three times (P4/1 to P4/3), both against the deadline. Same application, k = 2, 10 ms overhead.)
Worst-case execution times:
      P1   P2   P3   P4   P5   P6
N1    30   40   50   60   40   50
N2    30   40   50   60   40   50
Design Optimization
§ Hill-climbing mapping optimization heuristic
§ Schedule length obtained by:
  1. Conditional Scheduling (CS): slow
  2. Schedule Length Estimation (SE): fast
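A hill-climbing mapping heuristic driven by a fast estimate can be sketched as follows. The estimate used here is a deliberately crude stand-in, not the SE method of the thesis: it ignores messages and transparency and charges each node its load plus a recovery slack of `k * (max WCET + mu)`. The six execution times and `k = 2` echo the mapping example slides, while treating 10 ms as the recovery overhead `mu` is an assumption; all function names are hypothetical.

```python
def estimate_length(mapping, wcets, n_nodes, mu, k):
    """Crude schedule-length estimate: the longest node, load plus slack."""
    lengths = []
    for node in range(n_nodes):
        on_node = [c for c, m in zip(wcets, mapping) if m == node]
        slack = k * (max(on_node) + mu) if on_node else 0
        lengths.append(sum(on_node) + slack)
    return max(lengths)


def hill_climb(mapping, wcets, n_nodes, mu, k):
    """Move one process at a time while the estimate strictly improves."""
    mapping = list(mapping)
    best = estimate_length(mapping, wcets, n_nodes, mu, k)
    improved = True
    while improved:
        improved = False
        for i in range(len(mapping)):
            for node in range(n_nodes):
                if node == mapping[i]:
                    continue
                trial = mapping[:i] + [node] + mapping[i + 1:]
                length = estimate_length(trial, wcets, n_nodes, mu, k)
                if length < best:
                    mapping, best, improved = trial, length, True
    return mapping, best


wcets = [30, 40, 50, 60, 40, 50]
start = [0] * len(wcets)  # everything on node 0
initial = estimate_length(start, wcets, n_nodes=2, mu=10, k=2)
mapping, best = hill_climb(start, wcets, n_nodes=2, mu=10, k=2)
print(initial, best, mapping)
```

Because the estimate is evaluated once per candidate move, its speed dominates the running time of the heuristic, which is why a fast estimator matters more here than an exact scheduler. Hill climbing stops at the first local optimum, so the result is not guaranteed to be the best mapping.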
Experimental Results
§ How much faster is schedule length estimation (SE) compared to conditional scheduling (CS)?
§ 4 nodes, 15 applications, recovery overhead = 5 ms; 25% of processes and 50% of messages are frozen
Execution time (s):
                k = 2 faults     k = 3 faults     k = 4 faults
                SE     CS        SE     CS        SE     CS
20 processes    0.01   0.07      0.02   0.28      0.04   1.37
30 processes    0.13   0.39      0.19   2.93      0.26   31.50
40 processes    0.32   1.34      0.50   17.02     0.69   318.88
→ Schedule length estimation (SE) is more than 400 times faster than conditional scheduling (CS)
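The "more than 400 times" figure can be checked directly from the table. The short sketch below uses the SE and CS times as read from the slide (the cell assignment is a reconstruction of the flattened table) and computes the CS/SE ratio per cell; variable names are hypothetical.

```python
# (processes, k) -> (SE seconds, CS seconds), as read from the slide.
times = {
    (20, 2): (0.01, 0.07), (20, 3): (0.02, 0.28), (20, 4): (0.04, 1.37),
    (30, 2): (0.13, 0.39), (30, 3): (0.19, 2.93), (30, 4): (0.26, 31.50),
    (40, 2): (0.32, 1.34), (40, 3): (0.50, 17.02), (40, 4): (0.69, 318.88),
}

# Speedup of estimation over conditional scheduling for every configuration.
speedup = {cfg: cs / se for cfg, (se, cs) in times.items()}
largest = max(speedup, key=speedup.get)

for cfg in sorted(speedup):
    print(cfg, round(speedup[cfg], 1))
print("largest speedup at", largest, "=", round(speedup[largest]))
```

The ratio grows with both application size and the number of faults, and the headline figure corresponds to the largest configuration, 40 processes with k = 4.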
Experimental Results
§ How much is the improvement when transparency is taken into account?
§ 4 computation nodes, 15 applications, recovery overhead = 5 ms; 25% of processes and 50% of messages are frozen
                k = 2 faults   k = 3 faults   k = 4 faults
20 processes    32.89%         32.20%         30.56%
30 processes    35.62%         31.68%         30.58%
40 processes    28.88%         28.11%         28.03%
→ Schedule length of fault-tolerant applications is 31.68% shorter on average if transparency was considered during mapping
Outline
§ Motivation
§ Background and limitations of previous work
§ Thesis contributions:
  § Scheduling with fault tolerance requirements
  § Fault tolerance policy assignment
  § Checkpoint optimization
  § Trading-off transparency for performance
  § Mapping optimization with transparency
→ Conclusions and future work
Conclusions
§ Scheduling with fault tolerance requirements
  § Two novel scheduling techniques
  § Handling customized transparency requirements, trading off transparency for performance
  § Fast scheduling alternative with low memory requirements for schedules
Conclusions
§ Design optimization with fault tolerance
  § Policy assignment optimization strategy
  § Estimation-driven mapping optimization that can handle customized transparency requirements
  § Optimization of the number of checkpoints
→ Approaches and algorithms have been evaluated on a large number of synthetic applications and a real-life example: a vehicle cruise controller
Design Optimization of Embedded Systems with Fault Tolerance is Essential
Future Work
§ Fault-tree analysis
§ Probabilistic fault model
§ Soft real-time
§ Some more…