CS 201 Compiler Construction
Instruction Scheduling: Trace Scheduler

Instruction Scheduling
Modern processors can exploit Instruction Level Parallelism (ILP) by executing multiple instructions simultaneously. Instruction scheduling influences how effectively ILP is exploited.
• Pipelined processors (e.g., ARM): reordering instructions avoids delays due to hazards.
• EPIC/VLIW processors (e.g., Itanium): a single long instruction is packed with multiple operations (conventional instructions) that can be executed simultaneously.

Compiler Support
Analyze dependences and rearrange the order of instructions, i.e., perform instruction scheduling.
• Pipelined: a limited amount of ILP is required; it can be uncovered by reordering instructions within each basic block.
• EPIC/VLIW: much more ILP is required; it can be uncovered by examining code from multiple basic blocks.

Compiler Support
Two techniques go beyond basic block boundaries to uncover ILP:
• Trace Scheduling (acyclic schedulers): examines a trace – a sequence of basic blocks along an acyclic program path; instruction scheduling can move instructions across basic block boundaries.
• Software Pipelining (cyclic schedulers): examines basic blocks corresponding to consecutive loop iterations; instruction scheduling can move instructions across loop iterations.

Trace Scheduling
A trace is a sequence of basic blocks that does not extend across loop boundaries.
• Select a trace.
• Determine the instruction schedule for the trace.
• Introduce compensation code to preserve program semantics.
• Repeat the above steps while some part of the program is yet to be scheduled.

Trace Selection
Selection of traces is extremely important for overall performance: traces should represent paths that are executed frequently. A fast instruction schedule for one path is obtained at the expense of a slower schedule for the other path, due to speculative code motion.

Picking Traces
• o – an operation/instruction.
• Count(o) – number of times o is expected to be executed during an entire program run.
• Prob(e) – probability that an edge e will be taken; important for conditional branches.
• Count(e) = Count(branch) × Prob(e)
Counts are estimated using profiling: measure counts by running the program on a representative input.
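As a small illustration of the Count(e) formula above, the following sketch derives edge counts from a branch's execution count and its edge probabilities. The function name, the edge labels, and the numbers are made up for the example; they are not taken from the slides.

```python
def edge_counts(branch_count, edge_probs):
    """Return Count(e) = Count(branch) * Prob(e) for every edge leaving a branch.

    branch_count: how often the branch executed (e.g. from a profile run).
    edge_probs:   hypothetical mapping from edge label to Prob(e).
    """
    return {edge: branch_count * prob for edge, prob in edge_probs.items()}


# Illustrative numbers only: a branch executed 1000 times, taken 90% of the time.
profile = edge_counts(1000, {"taken": 0.9, "fall_through": 0.1})
hottest = max(profile, key=profile.get)  # the edge a trace would prefer to follow
```

The edge with the larger count is the one trace selection would follow out of this branch.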

Algorithm for Trace Construction
1. Pick an operation with the largest execution count as the seed of the trace.
2. Grow the trace backward from the seed.
3. Grow the trace forward from the seed.
Growing forward: given that p is in the trace and e is the edge from p to s, include s in the trace iff:
1. Of all the edges leaving p, e has the largest execution count.
2. Of all the edges entering s, e has the largest execution count.
The same approach is taken to grow the trace backward.

Algorithm Contd.
The trace stops growing forward when Count(e1) < Count(e2), where e1 is the edge from the end of the trace into s and e2 is another edge entering s. Premature termination of the trace can occur in the above algorithm. To prevent this, a slight modification is required.

Algorithm Contd.
Suppose A-B-C-D has been included in the current trace.
• Count(D-E) > Count(D-F) => add E
• Count(C-E) > Count(D-E) => do not add E
(Figure: control flow graph with edge counts 15, 10, 6, and 9.)
Premature termination occurs because a trace that includes C-E can no longer be formed: C is already in the current trace.
Modification: consider only edges P-E such that P is not already part of the current trace.

Algorithm Contd.
A trace cannot cross loop boundaries. Stop growing the trace:
• if the edge encountered is a loop back edge; or
• if the edge enters or leaves a loop.
In the slide's figure, blocks 1 and 2 cannot be placed in the same trace because the edge directly connecting them is a loop back edge and the edges indirectly connecting them cross loop boundaries.
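The trace-growing rules can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual implementation: it works on basic blocks rather than individual operations, takes the seed as a parameter, and assumes the caller supplies the set of edges that are loop back edges or that enter or leave a loop. The names `grow_trace`, `edge_count`, `loop_edges`, and `skip_trace_edges`, and all edge counts, are hypothetical. Applying the "already in the trace" modification symmetrically when growing backward is also an assumption.

```python
def grow_trace(seed, edge_count, loop_edges=frozenset(), skip_trace_edges=True):
    """Grow a trace backward, then forward, from `seed`.

    edge_count: dict mapping (src, dst) -> estimated execution count.
    loop_edges: edges the trace must not follow (assumed precomputed).
    skip_trace_edges: apply the modification that ignores competing edges
                      whose other endpoint is already in the trace.
    """
    trace = [seed]

    def extend(forward):
        while True:
            end = trace[-1] if forward else trace[0]
            # Edges leaving the end (forward) or entering it (backward).
            here = [e for e in edge_count if e[0 if forward else 1] == end]
            if not here:
                return
            e = max(here, key=edge_count.get)
            new = e[1] if forward else e[0]
            if e in loop_edges or new in trace:
                return
            # Competing edges at the far end of e.
            rivals = [r for r in edge_count
                      if r[1 if forward else 0] == new and r != e]
            if skip_trace_edges:
                rivals = [r for r in rivals
                          if r[0 if forward else 1] not in trace]
            if any(edge_count[r] > edge_count[e] for r in rivals):
                return  # Count(e) is not the largest: stop growing
            if forward:
                trace.append(new)
            else:
                trace.insert(0, new)

    extend(forward=False)
    extend(forward=True)
    return trace


# Illustrative counts chosen so that Count(C-E) > Count(D-E) > Count(D-F).
counts = {("A", "B"): 20, ("B", "C"): 20, ("C", "D"): 12,
          ("C", "E"): 11, ("D", "E"): 10, ("D", "F"): 2}
with_fix = grow_trace("A", counts)
without_fix = grow_trace("A", counts, skip_trace_edges=False)
```

With these counts, the unmodified rule stops at D because C-E outweighs D-E, while the modified rule ignores C-E (C is already in the trace) and adds E.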

Instruction Scheduling
Construct a DAG for the selected trace. Generate an instruction schedule using a scheduling heuristic: list scheduling with critical path first. After the instruction schedule is generated, compensation code may have to be introduced to preserve program semantics.

DAG and List Scheduling
• DAG – nodes are statements, edges are dependences.
• List Scheduling – critical (longest) path first.
(Figure: a trace, its DAG, and the resulting schedule for the statements m = x+4, i = j/2, If i<3, k = i+4, and n = m+1, with the annotation "k live".)
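A minimal sketch of list scheduling with a critical-path-first priority is given below. It assumes a simple machine that issues `width` operations per cycle and that an operation's result is available `latency` cycles after issue. The latencies, the dependence sets, and the rule that k = i+4 stays below the branch are assumptions made for the example, so the resulting order need not match the schedule in the slide's figure.

```python
from functools import lru_cache


def list_schedule(latency, deps, width=1):
    """Return [(cycle, op), ...] for a dependence DAG.

    latency: dict op -> cycles until its result is available (assumed).
    deps:    dict op -> set of ops that must finish before op can issue.
    """
    succs = {op: [] for op in latency}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    @lru_cache(maxsize=None)
    def height(op):
        # Length of the longest (critical) path from op to the end of the DAG.
        return latency[op] + max((height(s) for s in succs[op]), default=0)

    finish, schedule, cycle = {}, [], 0
    remaining = set(latency)
    while remaining:
        ready = [op for op in remaining
                 if all(p in finish and finish[p] <= cycle
                        for p in deps.get(op, ()))]
        # Critical path first; ties broken by name for determinism.
        ready.sort(key=lambda op: (-height(op), op))
        for op in ready[:width]:
            schedule.append((cycle, op))
            finish[op] = cycle + latency[op]
            remaining.remove(op)
        cycle += 1
    return schedule


# Statements from the slide, with assumed latencies (the divide takes 2 cycles).
latency = {"i": 2, "m": 1, "if": 1, "k": 1, "n": 1}
deps = {"if": {"i"}, "k": {"i", "if"}, "n": {"m"}}
order = [op for _, op in list_schedule(latency, deps)]
```

Here `i` heads the longest dependence chain, so it is issued first even though `m` is also ready in cycle 0.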

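Compensation Code
Consider movement of instructions across basic block boundaries, i.e., past splits and merges in the control flow graph.
1. Movement of a statement past/below a Split:
(Figure: before the move, i=n+1 precedes If e, whose successors are k=j+4 and k=i+1. After the move, i=n+1 appears below If e on the trace, and a copy of i=n+1 is inserted on the off-trace path as compensation code.)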
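The need for compensation code in this case can be checked with a small simulation. The sketch below writes the code fragment as three Python functions: the original, the version with i=n+1 moved below the split but no compensation, and the version with a compensation copy. It assumes that k=j+4 is the on-trace successor and k=i+1 the off-trace one; the flattened figure does not make this certain, so treat that choice as an assumption.

```python
def original(n, j, e, i=0):
    i = n + 1
    if e:
        k = j + 4      # assumed on-trace successor
    else:
        k = i + 1      # assumed off-trace successor, reads i
    return i, k


def moved_without_compensation(n, j, e, i=0):
    if e:
        k = j + 4
        i = n + 1      # statement moved below the split, on the trace
    else:
        k = i + 1      # reads a stale value of i
    return i, k


def moved_with_compensation(n, j, e, i=0):
    if e:
        k = j + 4
        i = n + 1      # statement moved below the split, on the trace
    else:
        i = n + 1      # compensation copy on the off-trace path
        k = i + 1
    return i, k


# Compare the final (i, k) for both branch outcomes on sample inputs.
same = all(original(5, 2, e) == moved_with_compensation(5, 2, e)
           for e in (True, False))
broken = original(5, 2, False) != moved_without_compensation(5, 2, False)
```

Only the off-trace outcome differs when the copy is missing, which is why the compensation code goes on the off-trace path.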

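Compensation Code Contd.
2. Movement of a statement above a Join:
(Figure: c=a+b is on the trace above the join and c=a/2 is on the off-trace path into the join; d=c-2 and i=i+1 are below the join. After i=i+1 is moved above the join, a copy of i=i+1 is added on the off-trace path as compensation code.)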

Compensation Code Contd.
3. Movement of a statement above a Split:
(Figure: before the move, i=j+1 is followed by If e, with i=i+2 and k=j+1 on the two paths below the split. After the move, i=i+2 sits above If e.)
No compensation code is introduced: this is speculation. Note that i=i+2 can be moved above the split if i is dead along the off-trace path.

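Compensation Code Contd.
4. Movement of a statement below a Join:
(Figure: i=i+1 and c=a+b are on the trace above the join, c=a/2 is on the off-trace path into the join, and d=c-2 is below the join.)
It is illegal to move i=i+1 below the join unless i=i+1 is dead code. This case will not arise, assuming dead code has been removed.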

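Compensation Code Contd.
5. Movement of a branch below a split.
(Figure: before and after views of a trace containing If e1, i=i+1, and If e2, with off-trace blocks C and D; in the after view, If e2 comes before If e1.)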

Compensation Code Contd.
6. Movement of a branch above a join.
(Figure: control flow graph containing i=j+1, If e, x=y+z, and blocks C and D.)

Negatives: Redundant Code
(Figure: the statement A=B+C appears twice.)

Negatives: Code Explosion
(Figure: control flow graph over blocks A1 … An, B1 … Bn, and C1 … Cn, together with the order of instructions along the trace after scheduling.)

Code Explosion Contd.
(Figure: the code after one scheduling step, of size O(n^2), and after processing the off-trace paths, of size O(n^n); blocks are labeled A1 … An, B1 … Bn, and C1 … Cn.)
• 1 trace of length n.
• n more traces of length (n-1) are created.
• Each of these traces gives rise to (n-1) traces of size (n-2).
Total: n + n(n-1) + n(n-1)(n-2) + … = O(n^n)
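The growth can be tabulated with a few lines of code. The sketch below reads the slide's series as n + n(n-1) + n(n-1)(n-2) + …, continued until the last factor reaches 1; that reading of the series is an interpretation of the counting argument, and the function name is made up for the example.

```python
def explosion_size(n):
    """Sum n + n(n-1) + n(n-1)(n-2) + ... down to the term n!."""
    total, term = 0, 1
    for k in range(n):
        term *= n - k      # extend the product by one more factor
        total += term
    return total


# Total code size next to the n**n bound for small n.
table = [(n, explosion_size(n), n ** n) for n in range(1, 7)]
```

For n = 3 the terms are 3, 6, and 6, which sum to 15, within the bound 3^3 = 27.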

Building a DAG for Scheduling
The DAG contains the following edges:
1. Write-After-Read data dependence
2. Write-After-Write data dependence
3. Read-After-Write data dependence
4. Write-after-conditional-read: an edge between IF e and x=c+d, to prevent movement of x=c+d above IF e.
(Figure: x = a-b; If e; x = c+d; z = x+1.)

Building a DAG Contd.
5. Conditional jumps: introduce an off-live edge between x=a-b and IF e. This edge does not constrain movement past IF e; it indicates that if x=a-b is moved past IF e, then it can be eliminated from the trace, but a copy must be placed along the off-trace path.
(Figure: x = a-b; If e; z = x+1; y = c+d.)
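The five edge kinds listed above can be illustrated with a small, conservative edge builder. This is a sketch under stated assumptions: each statement carries explicit read and write sets, a branch additionally carries the set of variables assumed live on its off-trace path (`off_live`), and every earlier/later pair is compared without pruning redundant edges. The `Stmt` type, its field names, and the combined example trace are hypothetical.

```python
from typing import NamedTuple


class Stmt(NamedTuple):
    name: str
    writes: frozenset
    reads: frozenset
    off_live: frozenset = frozenset()  # branches only: live on off-trace path


def dependence_edges(trace):
    """Return (earlier, later, kind) edges for statements in trace order."""
    edges = []
    for idx, a in enumerate(trace):
        for b in trace[idx + 1:]:
            if a.writes & b.reads:
                edges.append((a.name, b.name, "RAW"))
            if a.reads & b.writes:
                edges.append((a.name, b.name, "WAR"))
            if a.writes & b.writes:
                edges.append((a.name, b.name, "WAW"))
            if a.off_live & b.writes:
                # b must not move above branch a: off-trace code reads the value.
                edges.append((a.name, b.name, "write-after-conditional-read"))
            if a.writes & b.off_live:
                # Non-constraining: moving a below branch b needs an off-trace copy.
                edges.append((a.name, b.name, "off-live"))
    return edges


# Example trace combining the statements from the two slides; x is assumed
# to be read on the off-trace path of If e.
trace = [
    Stmt("x = a-b", frozenset({"x"}), frozenset({"a", "b"})),
    Stmt("If e", frozenset(), frozenset({"e"}), off_live=frozenset({"x"})),
    Stmt("x = c+d", frozenset({"x"}), frozenset({"c", "d"})),
]
edges = dependence_edges(trace)
```

A scheduler would treat the first four kinds as ordering constraints and use the off-live edges only to decide where compensation copies are needed.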

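Sample Problem: Introduce Compensation Code
(Figure: an original control flow graph and its scheduled version, built from the statements A=B+C, D=C+1, X=Y+1, A=A+1, Z=D+1, and P=Q+1, two If () branches, and a block S. The positions where compensation code must be filled in are marked "???".)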