Timing Analysis of Embedded Software for Speculative Processors
Tulika Mitra, Abhik Roychoudhury, Xianfeng Li
School of Computing, National University of Singapore
ISSS'02
Why Timing Analysis?
- Timing guarantees for real-time embedded systems
- Real-time scheduling: a worst-case bound on execution time guarantees that tasks are schedulable irrespective of inputs
- A tight bound avoids idle processor cycles
- Extremely important for safety-critical systems
Worst Case Execution Time (WCET)
- Given a program and a micro-architecture, estimate the maximum execution time of the program on the micro-architecture over all possible inputs
- Program path analysis [Shaw'89, Healy'98, ...]
  - Not all paths in the control flow graph are feasible
- Micro-architectural modeling
  - Instruction execution time varies dynamically due to
    - Cache, pipeline [Li'99, Theiling'00, Schneider'99, ...]
    - Speculative execution (branch prediction)
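Speculative Execution
- No speculative execution
- Misprediction: incurs the misprediction penalty
- Correct prediction
[Figure: execution timelines for a branch b in the three cases above; the diagram labels N, T and S are not recoverable in detail.]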
Impact of Speculative Execution
- The branch misprediction penalty can alter the worst-case execution path
- Example: insertion sort of 100 elements
  - Worst-case path without speculation: input <100, 99, ..., 2, 1>
  - Worst-case path with speculation: input <99, 100, ..., 2, 1>
Branch Prediction Schemes
Scheme          | Feature                                                  | WCET Work
----------------|----------------------------------------------------------|----------------------------
Static          | Static assignment of prediction per branch               | Chen et al. 2001
Dynamic: Local  | Predict based on outcome history of this branch only     | Colin & Puaut 2000
Dynamic: Global | Predict based on outcome history of neighboring branches | [PowerPC, MIPS, AMD, Alpha]
Global Branch Prediction
- Stores the outcomes of the last n branches in a shift register, called the Branch History Register (BHR)
- Indexes into the prediction table using the BHR
- The prediction table stores the last outcome corresponding to that history
[Figure: code fragment with three branches B1, B2 and B3 testing the variables a and b (b is initialized to 0), annotated "outcome(B3) = 1 if outcome(B1 B2) = {01, 10, 11}", next to a diagram of the BHR and the prediction table.]
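The mechanism on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the class name `GAgPredictor`, the all-zero initial history, and the initial "not taken" table contents are assumptions made for the example.

```python
class GAgPredictor:
    """Global predictor: an n-bit BHR indexes a table of last outcomes."""

    def __init__(self, history_bits, initial_prediction=0):
        self.mask = (1 << history_bits) - 1
        self.bhr = 0  # assumed initial history: all branches not taken
        # One entry per history pattern; the initial contents are an assumption.
        self.table = [initial_prediction] * (1 << history_bits)

    def predict_and_update(self, outcome):
        """Return True on a misprediction, then record the actual outcome."""
        mispredicted = self.table[self.bhr] != outcome
        # Remember the last outcome seen under the current history.
        self.table[self.bhr] = outcome
        # Shift the outcome into the history register, keeping n bits.
        self.bhr = ((self.bhr << 1) | outcome) & self.mask
        return mispredicted


def count_mispredictions(outcomes, history_bits=2):
    """Count mispredictions over a sequence of 0/1 branch outcomes."""
    predictor = GAgPredictor(history_bits)
    return sum(predictor.predict_and_update(o) for o in outcomes)


# Four taken branches in a row: histories 00, 01 and 11 each miss once.
print(count_mispredictions([1, 1, 1, 1]))  # 3
```

Because the table is indexed by history alone, every branch in the program shares the same entries, which is the source of the modeling difficulties discussed later in the deck.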
Framework for Branch Prediction
- Schemes differ in terms of the index into the prediction table
Prediction Scheme | Index
------------------|--------------------------------------
Local             | Branch address
Global: GAg       | BHR
Global: gshare    | BHR XOR branch address
Global: gselect   | {Branch address, BHR} (concatenation)
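The index functions in the table can be written out as one small Python function. This is a sketch under simplifying assumptions: the function name `table_index` and its parameters are hypothetical, the branch address is used directly without dropping alignment bits, and `history_bits` is assumed not to exceed `table_bits`.

```python
def table_index(scheme, branch_address, bhr, table_bits, history_bits):
    """Return the prediction table index for one branch execution."""
    mask = (1 << table_bits) - 1
    if scheme == "local":
        return branch_address & mask
    if scheme == "GAg":
        return bhr & mask
    if scheme == "gshare":
        # gshare XORs the history with the branch address.
        return (bhr ^ branch_address) & mask
    if scheme == "gselect":
        # gselect concatenates low address bits with the history bits.
        addr_bits = table_bits - history_bits
        addr_part = branch_address & ((1 << addr_bits) - 1)
        hist_part = bhr & ((1 << history_bits) - 1)
        return (addr_part << history_bits) | hist_part
    raise ValueError(f"unknown scheme: {scheme}")


# Two different branches with different histories can share a gshare entry.
a = table_index("gshare", 0b0100, 0b10, table_bits=4, history_bits=2)
b = table_index("gshare", 0b0110, 0b00, table_bits=4, history_bits=2)
print(a == b)  # True: both map to entry 0b0110
```

Keeping the index computation separate from the rest of the predictor mirrors the point of the slide: the schemes differ only in this one function.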
Modeling Difficulty
- Dynamic mapping: a branch can map to different entries in the prediction table
- Aliasing: different branches can map to the same prediction table entry
- Constructive/destructive conflict
  - Conflicting branches with same/different outcomes
  - A single branch with same/different outcomes
Our Technique: ILP Formulation
- Obtain linear constraints on the total misprediction count over all possible inputs
- Input: control flow graph of the program
- Objective function:
  $WCET = \sum_{B} \left( cost_B \cdot count_B + penalty \cdot misprediction_B \right)$
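To make the objective concrete, the sketch below evaluates it for one candidate assignment of block counts and misprediction counts. The function name `wcet_objective` and all numbers are hypothetical; an ILP solver would search for the assignment that maximizes this value subject to the constraints.

```python
def wcet_objective(cost, count, mispred, penalty):
    """Sum cost_B * count_B + penalty * misprediction_B over all blocks B."""
    return sum(
        cost[b] * count[b] + penalty * mispred.get(b, 0)
        for b in cost
    )


# Hypothetical two-block program with a 3-cycle misprediction penalty.
cost = {"blk1": 5, "blk2": 8}        # cycles per execution (made up)
count = {"blk1": 101, "blk2": 100}   # execution counts
mispred = {"blk1": 2, "blk2": 3}     # misprediction counts
print(wcet_objective(cost, count, mispred, penalty=3))  # 1320
```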
Flow Constraints: Easy!
- Inflow = basic block execution count = outflow
- Bound on maximum loop iterations
[Figure: CFG with nodes start, blk1, blk2 and end; the two outgoing edges of blk1 and of blk2 are labeled 1 and 0.]
- Constraints for the example CFG:
  $c_s = c_e = 1$
  $e_{s,1} = 1$
  $e_{2,e} + e_{1,e} = 1$
  $e_{s,1} + e_{2,1} = e_{1,2} + e_{1,e} = c_1$
  $e_{1,2} = e_{2,e} + e_{2,1} = c_2$
  Loop bound: $e_{2,1} \le 100$
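The constraints for the example CFG can be checked mechanically for any candidate set of edge counts. The following Python sketch does only that check; the function names and the dictionary encoding of edges are assumptions for illustration, and no ILP solver is involved.

```python
def satisfies_flow_constraints(e, loop_bound=100):
    """e maps (source, target) to an edge execution count.

    Nodes are "s" (start), "1" (blk1), "2" (blk2) and "e" (end).
    """
    c1_in = e["s", "1"] + e["2", "1"]
    c1_out = e["1", "2"] + e["1", "e"]
    c2_in = e["1", "2"]
    c2_out = e["2", "e"] + e["2", "1"]
    return (
        all(v >= 0 for v in e.values())
        and e["s", "1"] == 1                    # program entered once
        and e["2", "e"] + e["1", "e"] == 1      # program exited once
        and c1_in == c1_out                     # inflow = outflow at blk1
        and c2_in == c2_out                     # inflow = outflow at blk2
        and e["2", "1"] <= loop_bound           # back edge bounded
    )


def block_counts(e):
    """Derive the block execution counts c_1 and c_2 from the inflow."""
    return {"1": e["s", "1"] + e["2", "1"], "2": e["1", "2"]}


# 100 loop iterations, then exit from blk1.
edges = {("s", "1"): 1, ("1", "2"): 100, ("2", "1"): 100,
         ("1", "e"): 1, ("2", "e"): 0}
print(satisfies_flow_constraints(edges))  # True
print(block_counts(edges))                # {'1': 101, '2': 100}
```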
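Modeling Difficulty 1: Dynamic Mapping
- Identify the possible patterns for each branch
  - Static analysis of the CFG for all possible patterns $\pi$ in the branch history register (BHR) at node i
- Variables
  - $c_i^{\pi}$, $e_{i,j}^{\pi}$: number of executions of node i and of edge $e_{i,j}$ with BHR = $\pi$
  - $m_i^{\pi}$: number of mispredictions at node i with BHR = $\pi$
- Constraints
  $c_i = \sum_{\pi} c_i^{\pi}$
  $e_{i,j} = \sum_{\pi} e_{i,j}^{\pi}$
  $m_i^{\pi} \le c_i^{\pi}$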
Modeling Difficulty 1: Dynamic Mapping
- Model the flow of each pattern among nodes and edges
  - Inflow: $c_1^{01} = e_{2,1}^{00} + e_{2,1}^{10}$
  - Outflow: $c_1^{01} = e_{1,2}^{01} + e_{1,e}^{01}$
[Figure: the example CFG with nodes start, blk1, blk2 and end; branch edges are labeled 1 and 0.]
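One way to find which history patterns can reach each node is a worklist propagation over the CFG, sketched below in Python. This is an illustrative over-approximation (it ignores loop bounds and path feasibility), not necessarily the analysis used by the authors. The edge outcome labels are an assumption chosen to agree with the inflow constraint on the slide: blk1 to blk2 is outcome 0, blk2 to blk1 is outcome 1. The initial history 00 is also assumed.

```python
from collections import deque


def reachable_patterns(edges, start, history_bits=2, initial_bhr=0):
    """Return {node: set of BHR values possible on entry to that node}.

    edges: list of (src, dst, outcome); outcome is 0 or 1 for a branch
    edge and None for an edge that does not update the BHR.
    """
    mask = (1 << history_bits) - 1
    patterns = {start: {initial_bhr}}
    worklist = deque([(start, initial_bhr)])
    while worklist:
        node, bhr = worklist.popleft()
        for src, dst, outcome in edges:
            if src != node:
                continue
            # Shift the branch outcome into the history, if there is one.
            new_bhr = bhr if outcome is None else ((bhr << 1) | outcome) & mask
            if new_bhr not in patterns.setdefault(dst, set()):
                patterns[dst].add(new_bhr)
                worklist.append((dst, new_bhr))
    return patterns


# Example CFG; the 0/1 labels on the edges are assumed (see lead-in).
cfg = [
    ("start", "blk1", None),
    ("blk1", "blk2", 0),
    ("blk1", "end", 1),
    ("blk2", "blk1", 1),
    ("blk2", "end", 0),
]
result = reachable_patterns(cfg, "start")
print(sorted(result["blk1"]))  # [0, 1]: patterns 00 and 01
print(sorted(result["blk2"]))  # [0, 2]: patterns 00 and 10
```

Under these assumptions, pattern 01 reaches blk1 only over the back edge from blk2, entered with history 00 or 10, which is the inflow constraint shown on the slide.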
Modeling Difficulty 2: Aliasing
- Variable $p_{i \to j}^{\pi}$: number of times that
  - $\pi$ occurs at node i, followed by another occurrence at node j
  - $\pi$ does not appear at the intermediate nodes
[Figure: two example paths between nodes i and j.]
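Modeling Difficulty 2: Aliasing
- $\sum_j p_{j \to i}^{\pi} = \sum_j p_{i \to j}^{\pi} = c_i^{\pi}$
[Figure: example graph over nodes s, 3, 6, 8 and e with edges $p_{s \to 3}$, $p_{3 \to 6}$, $p_{3 \to 8}$, $p_{8 \to 3}$, $p_{6 \to 8}$ and $p_{6 \to e}$.]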
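The meaning of the p variables is easiest to see on a concrete execution trace. The Python sketch below counts them for one history pattern and checks the inflow/outflow equality. The trace, the function names, and the sentinel node names "s" and "e" for program start and end are assumptions for illustration; the ILP itself never enumerates traces.

```python
from collections import Counter


def count_p(trace, pi):
    """Count p[i, j]: pattern pi seen at i, next seen at j, not in between.

    trace: list of (node, bhr) pairs in execution order.
    """
    nodes = ["s"] + [node for node, bhr in trace if bhr == pi] + ["e"]
    return Counter(zip(nodes, nodes[1:]))


def check_aliasing_constraint(trace, pi):
    """Check sum_j p[j, i] == sum_j p[i, j] == c_i for every node i."""
    p = count_p(trace, pi)
    c = Counter(node for node, bhr in trace if bhr == pi)
    for i in c:
        inflow = sum(n for (src, dst), n in p.items() if dst == i)
        outflow = sum(n for (src, dst), n in p.items() if src == i)
        if not (inflow == outflow == c[i]):
            return False
    return True


# Hypothetical trace; node 5 runs with a different history and is skipped.
trace = [(3, 0b01), (6, 0b01), (8, 0b01), (3, 0b01), (5, 0b10), (6, 0b01)]
p = count_p(trace, 0b01)
print(p[3, 6], p[8, 3], p[6, "e"])             # 2 1 1
print(check_aliasing_constraint(trace, 0b01))  # True
```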
Modeling Difficulty 3: Conflict
- Case 1: the branch of block i with history $\pi$ is taken
  - The misprediction count is bounded by the total outflow of i under history $\pi$ with the outcome of branch i taken: $\sum_j p_{i \to j}^{\pi,1}$
  - The misprediction count is bounded by the total inflow of i under history $\pi$ with the last outcome not taken: $\sum_j p_{j \to i}^{\pi,0}$
- Case 2: the branch of block i with history $\pi$ is not taken
[Figure: block i with incoming and outgoing edges labeled by outcome 0 and 1.]
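The two bounds of Case 1 can be illustrated on a trace, as in the Python sketch below. It assumes a table indexed by the history alone, so the prediction at a node is the outcome of the previous branch executed with the same history, and it assumes the table initially predicts "not taken". The function name and the trace are hypothetical.

```python
from collections import Counter


def conflict_bounds(trace, pi):
    """Return {node: (mispredictions, outflow bound, inflow bound)}.

    trace: list of (node, bhr, outcome) for executed branches, in order.
    Only executions with history pi and the "taken" case are considered.
    """
    hits = [(node, outcome) for node, bhr, outcome in trace if bhr == pi]
    mispred, outflow, inflow = Counter(), Counter(), Counter()
    previous = 0  # assumed initial table entry: predict not taken
    for node, outcome in hits:
        if outcome == 1:
            outflow[node] += 1      # node taken under history pi
        if previous == 0:
            inflow[node] += 1       # last outcome under pi was not taken
        if outcome == 1 and previous == 0:
            mispred[node] += 1      # predicted not taken, actually taken
        previous = outcome
    return {
        node: (mispred[node], outflow[node], inflow[node])
        for node in {n for n, _ in hits}
    }


# Hypothetical trace; the entry for node 8 has another history.
trace = [(3, 1, 1), (6, 1, 0), (8, 2, 1), (3, 1, 1),
         (3, 1, 1), (6, 1, 1), (3, 1, 0)]
bounds = conflict_bounds(trace, pi=1)
print(bounds[3])  # (2, 3, 2)
print(bounds[6])  # (0, 1, 0)
```

In both rows the misprediction count stays at or below each of the two bounds, which is what the linear constraints express without knowing the trace.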
Benchmarks
Program | Description
--------|---------------------------------------------
check   | Negative number search of 100-element array
matsum  | Summation of two 100 x 100 matrices
matmul  | Multiplication of two 10 x 10 matrices
fft     | 1024-point Fast Fourier Transform
fdct    | Fast Discrete Cosine Transform
isort   | Insertion sort of 100-element array
bsearch | Binary search of 100-element array
eqntott | Drawn from SPEC'92 integer benchmarks
dhry    | Dhrystone benchmark
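Modeling Accuracy
Program | WCET Obs. | WCET Est. | Ratio | Mispred. Obs. | Mispred. Est.
--------|-----------|-----------|-------|---------------|--------------
check   | 611       |           | 1.00  | 3             | 3
matsum  | 101,417   |           | 1.00  | 204           |
matmul  | 14,732    |           | 1.00  | 223           |
fft     | 213,052   | 223,640   | 1.05  | 3,110         | 6,865
fdct    | 2,493     |           | 1.00  | 7             | 7
isort   | 74,225    | 74,742    | 1.01  | 9,687         | 9,954
bsearch | 104       |           | 1.00  | 9             | 9
eqntott | 2,311     | 2,314     | 1.00  | 203           | 204
dhry    | 122,026   | 124,297   | 1.02  | 2,207         | 2,812
An empty Est. cell means the slide shows a single number for that Obs./Est. pair.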
Summary
- Modeling dynamic control speculation for timing analysis of embedded code
- Unified parameterized framework that can be instantiated for various prediction schemes
- Tight execution time bounds for benchmark programs under various prediction schemes and prediction table sizes