Code Generation

• The target machine
• Runtime environment
• Basic blocks and flow graphs
• Instruction selection
• Instruction selector generator
• Register allocation
• Peephole optimization

The Target Machine
• A byte-addressable machine with four bytes to a word and n general-purpose registers
• Two-address instructions: op source, destination
• Six addressing modes:
    mode                 form     address                        added cost
    absolute             M        M                              1
    register             R        R                              0
    indexed              c(R)     c + contents(R)                1
    indirect register    *R       contents(R)                    0
    indirect indexed     *c(R)    contents(c + contents(R))      1
    literal              #c       the constant c                 1

Examples
    MOV R0, M
    MOV 4(R0), M
    MOV *R0, M
    MOV *4(R0), M
    MOV #1, R0

Instruction Costs
• Cost of an instruction = 1 + costs of the source and destination addressing modes
• This cost corresponds to the length (in words) of the instruction
• Minimizing instruction length also tends to minimize instruction execution time
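
A minimal Python sketch (not part of the original slides) of this cost model, using the addressing-mode costs from the table above; the mode names are illustrative.

    # cost = 1 + cost(source mode) + cost(destination mode), measured in words
    MODE_COST = {
        "register": 0,       # R
        "indirect_reg": 0,   # *R
        "absolute": 1,       # M
        "indexed": 1,        # c(R)
        "indirect_idx": 1,   # *c(R)
        "literal": 1,        # #c
    }

    def instruction_cost(src_mode, dst_mode):
        """Length of a two-address instruction in words."""
        return 1 + MODE_COST[src_mode] + MODE_COST[dst_mode]

    # instruction_cost("register", "register") -> 1    (MOV R0, R1)
    # instruction_cost("indexed", "indirect_idx") -> 3 (SUB 4(R0), *12(R1))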

Examples
    MOV R0, R1             cost 1
    MOV R0, M              cost 2
    ADD #1, R0             cost 2
    SUB 4(R0), *12(R1)     cost 3

An Example
Consider a := b + c
1.  MOV b, R0
    ADD c, R0
    MOV R0, a
2.  MOV b, a
    ADD c, a
3.  Assuming R0, R1, R2 contain the addresses of a, b, c:
    MOV *R1, *R0
    ADD *R2, *R0
4.  Assuming R1, R2 contain the values of b, c:
    ADD R2, R1
    MOV R1, a

Instruction Selection
• Code skeleton for x := y + z:
    MOV y, R0
    ADD z, R0
    MOV R0, x
  so a := b + c becomes               and d := a + e becomes
    MOV b, R0                           MOV a, R0
    ADD c, R0                           ADD e, R0
    MOV R0, a                           MOV R0, d
• Multiple choices for a := a + 1:
    MOV a, R0           or simply       INC a
    ADD #1, R0
    MOV R0, a

Register Allocation
• Register allocation: select the set of variables that will reside in registers
• Register assignment: pick the specific register that a variable will reside in
• The problem is NP-complete

An Example
    t := a + b                 t := a + b
    t := t * c                 t := t + c
    t := t / d                 t := t / d

    MOV a, R1                  MOV a, R0
    ADD b, R1                  ADD b, R0
    MUL c, R0                  ADD c, R0
    DIV d, R0                  SRDA R0, 32
    MOV R1, t                  DIV d, R0
                               MOV R1, t
(MUL and DIV operate on the register pair R0, R1; SRDA R0, 32 sign-extends the dividend into the pair)

Runtime Environments
• A translation needs to relate the static source text of a program to the dynamic actions that must occur at runtime to implement the program
• Essentially, the relationship between names and data objects
• The runtime support system consists of routines that manage the allocation and deallocation of data objects

Activations
• A procedure definition associates an identifier (name) with a statement (body)
• Each execution of a procedure body is an activation of the procedure
• An activation tree depicts the way control enters and leaves activations

An Example
    program sort(input, output);
      var a: array [0..10] of integer;
      procedure readarray;
        var i: integer;
        begin for i := 1 to 9 do read(a[i]) end;
      function partition(y, z: integer): integer;
        var i, j, x, v: integer;
        begin … end;
      procedure quicksort(m, n: integer);
        var i: integer;
        begin
          if n > m then begin
            i := partition(m, n); quicksort(m, i - 1); quicksort(i + 1, n)
          end
        end;
      begin
        a[0] := -9999; a[10] := 9999;
        readarray; quicksort(1, 9)
      end.

An Example
(activation tree for the sort program: the root s calls r (readarray) and q(1,9) (quicksort);
 q(1,9) calls p(1,9) (partition), q(1,3) and q(5,9); q(1,3) calls p(1,3), q(1,0) and q(2,3);
 q(2,3) calls p(2,3), q(2,1) and q(3,3); q(5,9) calls p(5,9), q(5,5) and q(7,9);
 q(7,9) calls p(7,9), q(7,7) and q(9,9))

Scope
• A declaration associates information with a name
• Scope rules determine which declaration of a name applies
• The portion of the program to which a declaration applies is called the scope of that declaration

Bindings of Names
• The same name may denote different data objects (storage locations) at runtime
• An environment is a function that maps a name to a storage location
• A state is a function that maps a storage location to the value held there

    name --environment--> storage location --state--> value

Static and Dynamic Notions
(table contrasting static notions with their dynamic counterparts: the definition of a procedure vs. its activations, the declaration of a name vs. the bindings of that name, the scope of a declaration vs. the lifetime of a binding)

Storage Organization
• Target code: static
• Static data objects: static
• Dynamic data objects: heap
• Automatic data objects: stack

Typical layout of runtime memory: code, static data, stack (growing toward the heap), heap

Activation Records
Fields of an activation record on the stack:
• returned value
• actual parameters
• optional control link
• optional access link
• machine status
• local data
• temporary data

Activation Records
(figure: two consecutive activation records on the stack, each containing the returned value and parameters, the links and machine status, and the local and temporary data; the frame pointer marks the start of the current record and the stack pointer marks its top)

Declarations
    P → {offset := 0} D
    D → D ";" D
    D → id ":" T            {enter(id.name, T.type, offset);
                             offset := offset + T.width}
    T → integer             {T.type := integer; T.width := 4}
    T → float               {T.type := float; T.width := 8}
    T → array "[" num "]" of T1
                            {T.type := array(num.val, T1.type);
                             T.width := num.val * T1.width}
    T → "*" T1              {T.type := pointer(T1.type); T.width := 4}

Nested Procedures
    P → D
    D → D ";" D  |  id ":" T  |  proc id ";" D ";" S
(figure: linked symbol tables for the sort program: the outermost table holds a, x, readarray, exchange, quicksort; readarray's table holds i; quicksort's table holds k, v, partition; partition's table holds i, j; each table's header points to the enclosing table, and the outermost link is nil)

Symbol Table Handling
• Operations
  – mktable(previous): creates a new table and returns a pointer to it
  – enter(table, name, type, offset): creates a new entry for name in the table
  – addwidth(table, width): records the cumulative width of the table's entries in its header
  – enterproc(table, name, newtable): creates a new entry for procedure name in the table
• Stacks
  – tblptr: pointers to symbol tables
  – offset: the next available relative address
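
An illustrative Python sketch (not part of the original slides) of these operations and the two stacks; the class layout and field names are assumptions.

    class SymTable:
        def __init__(self, previous=None):
            self.previous = previous        # pointer to the enclosing table
            self.entries = {}               # name -> (type, offset) or ("proc", table)
            self.width = 0                  # filled in by addwidth

    def mktable(previous):             return SymTable(previous)
    def enter(table, name, typ, off):  table.entries[name] = (typ, off)
    def addwidth(table, width):        table.width = width
    def enterproc(table, name, newt):  table.entries[name] = ("proc", newt)

    tblptr, offset = [], []            # the two stacks used by the translation scheme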

Declarations
    P → M D       {addwidth(top(tblptr), top(offset));
                   pop(tblptr); pop(offset)}
    M → ε         {t := mktable(nil); push(t, tblptr); push(0, offset)}
    D → D ";" D
    D → proc id ";" N D ";" S
                  {t := top(tblptr); addwidth(t, top(offset));
                   pop(tblptr); pop(offset);
                   enterproc(top(tblptr), id.name, t)}
    D → id ":" T  {enter(top(tblptr), id.name, T.type, top(offset));
                   top(offset) := top(offset) + T.width}
    N → ε         {t := mktable(top(tblptr)); push(t, tblptr); push(0, offset)}

Records
    T → record D end    becomes, with a marker nonterminal that creates the field table,
    T → record L D end  {T.type := record(top(tblptr)); T.width := top(offset);
                         pop(tblptr); pop(offset)}
    L → ε               {t := mktable(nil); push(t, tblptr); push(0, offset)}

Basic Blocks
• A basic block is a sequence of consecutive statements in which control enters at the beginning and leaves at the end, without halting or branching except at the end

An Example
    (1)  prod := 0
    (2)  i := 1
    (3)  t1 := 4 * i
    (4)  t2 := a[t1]
    (5)  t3 := 4 * i
    (6)  t4 := b[t3]
    (7)  t5 := t2 * t4
    (8)  t6 := prod + t5
    (9)  prod := t6
    (10) t7 := i + 1
    (11) i := t7
    (12) if i <= 20 goto (3)

Control Flow Graphs
• A (control) flow graph is a directed graph
• The nodes of the graph are basic blocks
• There is an edge from B1 to B2 iff B2 can immediately follow B1 in some execution sequence:
  – there is a jump from B1 to B2, or
  – B2 immediately follows B1 in the program text and B1 does not end in an unconditional jump
• B1 is a predecessor of B2; B2 is a successor of B1

An Example
    B0: (1)  prod := 0
        (2)  i := 1
    B1: (3)  t1 := 4 * i
        (4)  t2 := a[t1]
        (5)  t3 := 4 * i
        (6)  t4 := b[t3]
        (7)  t5 := t2 * t4
        (8)  t6 := prod + t5
        (9)  prod := t6
        (10) t7 := i + 1
        (11) i := t7
        (12) if i <= 20 goto (3)
(edges: B0 → B1 and B1 → B1)

Construction of Basic Blocks
• Determine the set of leaders
  – the first statement is a leader
  – the target of a jump is a leader
  – any statement immediately following a jump is a leader
• For each leader, its basic block consists of the leader and all statements up to, but not including, the next leader or the end of the program
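
A small Python sketch (not from the slides) of this leader-based partitioning; it assumes three-address statements are tuples whose last element, for the two jump forms, is the index of the target statement.

    def find_leaders(code):
        leaders = {0}                            # the first statement is a leader
        for i, instr in enumerate(code):
            if instr[0] in ("goto", "if_goto"):
                leaders.add(instr[-1])           # the target of a jump is a leader
                if i + 1 < len(code):
                    leaders.add(i + 1)           # the statement after a jump is a leader
        return sorted(leaders)

    def basic_blocks(code):
        bounds = find_leaders(code) + [len(code)]
        return [code[bounds[k]:bounds[k + 1]] for k in range(len(bounds) - 1)]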

Representation of Basic Blocks
• Each basic block is represented by a record consisting of
  – a count of the number of statements
  – a pointer to the leader
  – a list of predecessors
  – a list of successors
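
A direct Python rendering of that record (the field names are illustrative):

    from dataclasses import dataclass, field

    @dataclass
    class BasicBlock:
        leader: int                                     # index of the leader statement
        count: int                                      # number of statements in the block
        predecessors: list = field(default_factory=list)
        successors: list = field(default_factory=list)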

DAG Representation of Blocks
• Easy to determine:
  – common subexpressions
  – names used in the block but evaluated outside the block
  – names whose values could be used outside the block

DAG Representation of Blocks
• Leaves are labeled by unique identifiers
• Interior nodes are labeled by operator symbols
• Interior nodes are optionally given a sequence of identifiers that hold the value the node represents

An Example
    (1)  t1 := 4 * i
    (2)  t2 := a[t1]
    (3)  t3 := 4 * i
    (4)  t4 := b[t3]
    (5)  t5 := t2 * t4
    (6)  t6 := prod + t5
    (7)  prod := t6
    (8)  t7 := i + 1
    (9)  i := t7
    (10) if i <= 20 goto (1)
(figure: the DAG of the block: leaves prod0, i0, a, b, 4, 1, 20; the node 4 * i0 is shared and labeled t1, t3; the indexing nodes [] give t2 and t4; t5 = t2 * t4; the node labeled t6, prod adds prod0 and t5; the node labeled t7, i adds i0 and 1; a <= node compares it with 20)

Constructing a DAG
• Consider x := y op z; other statements can be handled similarly
• If node(y) is undefined, create a leaf labeled y and let node(y) be this leaf; if node(z) is undefined, create a leaf labeled z and let node(z) be that leaf

Constructing a DAG
• Determine whether there is a node labeled op whose left child is node(y) and whose right child is node(z); if not, create such a node. Let n be the node found or created
• Delete x from the list of attached identifiers for node(x), append x to the list of attached identifiers for n, and set node(x) to n
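
A minimal Python sketch (not from the slides) of these two steps, with value numbering used to share common subexpressions; the data layout is illustrative.

    class DAG:
        def __init__(self):
            self.node = {}      # current node index for each name
            self.ops = {}       # (op, left, right) -> shared interior node index
            self.nodes = []     # ["leaf", name, ids] or ["op", op, left, right, ids]

        def node_for(self, name):
            if name not in self.node:            # create a leaf on first use
                self.node[name] = len(self.nodes)
                self.nodes.append(["leaf", name, [name]])
            return self.node[name]

        def assign(self, x, y, op, z):           # x := y op z
            l, r = self.node_for(y), self.node_for(z)
            if (op, l, r) not in self.ops:       # reuse an existing op node if possible
                self.ops[(op, l, r)] = len(self.nodes)
                self.nodes.append(["op", op, l, r, []])
            n = self.ops[(op, l, r)]
            if x in self.node:                   # detach x from its previous node
                ids = self.nodes[self.node[x]][-1]
                if x in ids:
                    ids.remove(x)
            self.nodes[n][-1].append(x)
            self.node[x] = n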

Reconstructing Quadruples
• Evaluate the interior nodes in topological order
• Assign the evaluated value to one of the node's attached identifiers x, preferring one whose value is needed outside the block
• If there is no attached identifier, create a new temporary to hold the value
• If there are additional attached identifiers y1, y2, …, yk whose values are also needed outside the block, add y1 := x, y2 := x, …, yk := x
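
Continuing the DAG sketch above, a rough Python rendering of this reconstruction; live_out, the set of names needed after the block, is an assumed input.

    def reconstruct(dag, live_out):
        stmts, name_of, temp = [], {}, 0
        for idx, node in enumerate(dag.nodes):   # creation order is already topological
            if node[0] == "leaf":
                name_of[idx] = node[1]
                continue
            _, op, l, r, ids = node
            live = [x for x in ids if x in live_out]
            if ids:
                target = live[0] if live else ids[0]
            else:                                # no attached identifier: new temporary
                target = "t%d" % temp
                temp += 1
            stmts.append("%s := %s %s %s" % (target, name_of[l], op, name_of[r]))
            for y in live:                       # copy to the other live identifiers
                if y != target:
                    stmts.append("%s := %s" % (y, target))
            name_of[idx] = target
        return stmts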

An Example
(figure: the DAG of the previous block, with identifiers reassigned to the nodes)
Reconstructed code:
    (1) t1 := 4 * i
    (2) t2 := a[t1]
    (3) t3 := b[t1]
    (4) t4 := t2 * t3
    (5) prod := prod + t4
    (6) i := i + 1
    (7) if i <= 20 goto (1)

Generating Code From DAGs
    t1 := a + b
    t2 := c + d
    t3 := e - t2
    t4 := t1 - t3
(figure: the expression tree for t4 = (a + b) - (e - (c + d)))
With only R0 and R1 available:
    (1)  MOV a, R0
    (2)  ADD b, R0
    (3)  MOV c, R1
    (4)  ADD d, R1
    (5)  MOV R0, t1
    (6)  MOV e, R0
    (7)  SUB R1, R0
    (8)  MOV t1, R1
    (9)  SUB R0, R1
    (10) MOV R1, t4

Rearranging the Order
    t2 := c + d
    t3 := e - t2
    t1 := a + b
    t4 := t1 - t3
(figure: the same expression tree)
    (1) MOV c, R0
    (2) ADD d, R0
    (3) MOV e, R1
    (4) SUB R0, R1
    (5) MOV a, R0
    (6) ADD b, R0
    (7) SUB R1, R0
    (8) MOV R0, t4

A Heuristic Ordering for DAGs
• Attempt, as far as possible, to make the evaluation of a node immediately follow the evaluation of its leftmost argument

Node Listing Algorithm
    while unlisted interior nodes remain do begin
        select an unlisted node n, all of whose parents have been listed;
        list n;
        while the leftmost child m of n has no unlisted parents
              and is not a leaf do begin
            list m;
            n := m
        end
    end
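
A Python sketch (not from the slides) of this listing; nodes are assumed to carry parents, leftmost_child and is_leaf attributes, and code is later generated for the listed nodes in reverse order.

    def node_listing(interior_nodes):
        unlisted, listed, order = set(interior_nodes), set(), []
        while unlisted:
            n = next(x for x in unlisted
                     if all(p in listed for p in x.parents))
            listed.add(n); unlisted.discard(n); order.append(n)
            m = n.leftmost_child
            while (m is not None and not m.is_leaf
                   and all(p in listed for p in m.parents)):
                listed.add(m); unlisted.discard(m); order.append(m)
                m = m.leftmost_child
        return order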

An Example
(figure: a DAG with shared nodes whose interior nodes are numbered 1 to 7 in the order produced by the node-listing algorithm)
Evaluating the nodes in the reverse of that listing gives:
    t7 := d + e
    t6 := a + b
    t5 := t6 - c
    t4 := t5 * t7
    t3 := t4 - e
    t2 := t6 + t4
    t1 := t2 * t3

Generating Code From Trees
• There exists an algorithm that determines the optimal order in which to evaluate the statements of a block when the dag representation of the block is a tree
• Optimal order here means the order that yields the shortest instruction sequence

Optimal Ordering for Trees
• Label each node of the tree bottom-up with an integer denoting the fewest registers required to evaluate it with no stores of intermediate results
• Generate code during a tree traversal by evaluating first the operand that requires more registers

The Labeling Algorithm
    if n is a leaf then
        if n is the leftmost child of its parent then label(n) := 1
        else label(n) := 0
    else begin
        let n1, n2, …, nk be the children of n ordered by label,
            so that label(n1) ≥ label(n2) ≥ … ≥ label(nk);
        label(n) := max over 1 ≤ i ≤ k of (label(ni) + i - 1)
    end
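
A short Python sketch (not from the slides) of the labeling pass; the is_leaf and children attributes are assumptions, and the computed label is stored on each node.

    def label(n, is_leftmost=True):
        if n.is_leaf:
            n.label = 1 if is_leftmost else 0
            return n.label
        ls = sorted((label(c, i == 0) for i, c in enumerate(n.children)),
                    reverse=True)                       # order children by label
        n.label = max(l + i for i, l in enumerate(ls))  # max(label(ni) + i - 1), 1-indexed i
        return n.label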

An Example
For binary interior nodes:
    label(n) = max(l1, l2)   if l1 ≠ l2
    label(n) = l1 + 1        if l1 = l2
(figure: the tree for t4 = (a + b) - (e - (c + d)), labeled a:1, b:0, t1:1, e:1, c:1, d:0, t2:1, t3:2, t4:2)

Code Generation From a Labeled Tree
• Use a stack rstack to allocate registers R0, R1, …, R(r-1)
• The value of a tree is always computed into the top register on rstack
• The function swap(rstack) interchanges the top two registers on rstack
• Use a stack tstack to allocate temporary memory locations T0, T1, …

Case Analysis
(figure: the four cases for an interior node n with operator op, left child n1 and right child n2.
 case 1: n2 is a name, label(n2) = 0;
 case 2: label(n1) < label(n2);
 case 3: label(n2) ≤ label(n1);
 case 4: both labels ≥ r)

The Function gencode
    procedure gencode(n);
    begin
        if n is a leaf representing operand name and n is the leftmost child of its parent then
            print 'MOV' || name || ', ' || top(rstack)
        else if n is an interior node with operator op, left child n1, and right child n2 then
            if label(n2) = 0 then
                /* case 1 */
            else if 1 ≤ label(n1) < label(n2) and label(n1) < r then
                /* case 2 */
            else if 1 ≤ label(n2) ≤ label(n1) and label(n2) < r then
                /* case 3 */
            else
                /* case 4, both labels ≥ r */
    end

The Function gencode
    /* case 1 */
    begin
        let name be the operand represented by n2;
        gencode(n1);
        print op || name || ', ' || top(rstack)
    end

    /* case 2 */
    begin
        swap(rstack);
        gencode(n2);
        R := pop(rstack);                      /* n2 is now in R */
        gencode(n1);
        print op || R || ', ' || top(rstack);
        push(rstack, R);
        swap(rstack)
    end

The Function gencode
    /* case 3 */
    begin
        gencode(n1);
        R := pop(rstack);                      /* n1 is now in R */
        gencode(n2);
        print op || top(rstack) || ', ' || R;  /* R := n1 op n2 */
        push(rstack, R)
    end

    /* case 4 */
    begin
        gencode(n2);
        T := pop(tstack);
        print 'MOV' || top(rstack) || ', ' || T;
        gencode(n1);
        push(tstack, T);
        print op || T || ', ' || top(rstack)
    end
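
Putting the four cases together, a runnable Python sketch (not part of the slides); rstack and tstack are lists used as stacks with the top at the end, and the node attributes (is_leaf, name, label, left, right, op) are assumptions.

    def gencode(n, rstack, tstack, emit, r):
        if n.is_leaf:                                     # leftmost-leaf operand
            emit("MOV %s, %s" % (n.name, rstack[-1]))
            return
        n1, n2 = n.left, n.right
        if n2.label == 0:                                 # case 1: right operand is a name
            gencode(n1, rstack, tstack, emit, r)
            emit("%s %s, %s" % (n.op, n2.name, rstack[-1]))
        elif 1 <= n1.label < n2.label and n1.label < r:   # case 2
            rstack[-1], rstack[-2] = rstack[-2], rstack[-1]   # swap
            gencode(n2, rstack, tstack, emit, r)
            R = rstack.pop()                              # n2 now in R
            gencode(n1, rstack, tstack, emit, r)
            emit("%s %s, %s" % (n.op, R, rstack[-1]))     # top := n1 op n2
            rstack.append(R)
            rstack[-1], rstack[-2] = rstack[-2], rstack[-1]   # swap back
        elif 1 <= n2.label <= n1.label and n2.label < r:  # case 3
            gencode(n1, rstack, tstack, emit, r)
            R = rstack.pop()                              # n1 now in R
            gencode(n2, rstack, tstack, emit, r)
            emit("%s %s, %s" % (n.op, rstack[-1], R))     # R := n1 op n2
            rstack.append(R)
        else:                                             # case 4: both labels >= r
            gencode(n2, rstack, tstack, emit, r)
            T = tstack.pop()
            emit("MOV %s, %s" % (rstack[-1], T))          # spill n2 to a temporary
            gencode(n1, rstack, tstack, emit, r)
            tstack.append(T)
            emit("%s %s, %s" % (n.op, T, rstack[-1]))     # top := n1 op n2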

An Example
(figure: the labeled tree: t1 = a + b and t2 = c + d have label 1, t3 = e - t2 and t4 = t1 - t3 have label 2)
    gencode(t4)            [R1, R0]      /* case 2 */
        gencode(t3)        [R0, R1]      /* case 3 */
            gencode(e)     [R0, R1]
                print MOV e, R1
            gencode(t2)    [R0]          /* case 1 */
                gencode(c) [R0]
                    print MOV c, R0
                print ADD d, R0
            print SUB R0, R1
        gencode(t1)        [R0]          /* case 1 */
            gencode(a)     [R0]
                print MOV a, R0
            print ADD b, R0
        print SUB R1, R0

Common Subexpressions
• Nodes with more than one parent in a dag are called shared nodes
• Optimal code generation for dags is NP-complete, both on a one-register machine and on a machine with an unlimited number of registers

Partitioning a DAG into Trees
• Partition the dag into a set of trees by finding, for each root and shared node n, the maximal subtree rooted at n that includes no other shared nodes except as leaves
• Determine a code generation ordering for the trees
• Generate code for each tree using the algorithms for generating code from trees

An Example
(figure: the DAG from the node-listing example partitioned into trees: each root and shared node becomes the root of a maximal subtree in which the other shared nodes appear only as leaves)

Dynamic Programming Code Generation
• The dynamic programming algorithm applies to a broad class of register machines with complex instruction sets
• The machine has r interchangeable registers
• The machine has instructions of the form Ri := E, where E is any expression containing operators, registers, and memory locations; if E involves registers, Ri must be one of them

Dynamic Programming
• The dynamic programming algorithm partitions the problem of generating optimal code for an expression into subproblems of generating optimal code for the subexpressions of the given expression
(figure: an expression tree split at its root operator into subtrees T1 and T2)

Contiguous Evaluation
• We say a program P evaluates a tree T contiguously if
  – it first evaluates those subtrees of T that need to be computed into memory
  – it then evaluates the subtrees of the root, in either order
  – it finally evaluates the root

Optimally Contiguous Program
• For the machines defined above, given any program P that evaluates an expression tree T, we can find an equivalent program P' such that
  – P' is of no higher cost than P
  – P' uses no more registers than P
  – P' evaluates the tree in a contiguous fashion
• This implies that every expression tree can be evaluated optimally by a contiguous program

Dynamic Programming Algorithm
• Phase 1: compute bottom-up, for each node n of the expression tree T, an array C of costs in which the ith component C[i] is the optimal cost of computing the subtree S rooted at n into a register, assuming i registers are available for the computation; C[0] is the optimal cost of computing the subtree S into memory

Dynamic Programming Algorithm
• To compute C[i] at node n, consider each machine instruction R := E whose expression E matches the subexpression rooted at node n
• Determine the costs of evaluating the operands of E by examining the cost vectors at the corresponding descendants of n

Dynamic Programming Algorithm
• For those operands of E that are registers, consider all possible orders in which the corresponding subtrees of T can be evaluated into registers
• In each ordering, the first subtree corresponding to a register operand can be evaluated using i available registers, the second using i-1 registers, and so on

Dynamic Programming Algorithm
• For node n, add in the cost of the instruction R := E that was used to match node n
• The value C[i] is then the minimum cost over all possible orders
• At each node, store the instruction used to achieve the best cost C[i] for each i
• The smallest cost in the root's vector gives the minimum cost of evaluating T

Dynamic Programming Algorithm
• Phase 2: traverse T and use the cost vectors to determine which subtrees of T must be computed into memory
• Phase 3: traverse T and use the cost vectors and the associated instructions to generate the final target code

An Example
Consider a machine with two registers R0 and R1 and the instructions
    Ri := Mj
    Mi := Ri
    Ri := Rj
    Ri := Ri op Mj
    Ri := Ri op Rj
(figure: the expression tree for (a - b) + c * (d / e), annotated with cost vectors: each leaf (0, 1, 1); the - and / nodes (3, 2, 2); the * node (5, 5, 4); the + root (8, 8, 7))
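
A Python sketch (not from the slides) of the phase-1 cost vectors for this machine, assuming every listed instruction costs 1 and r = 2; under that assumption it reproduces the vectors shown above, e.g. (8, 8, 7) at the root.

    INF = float("inf")

    def cost_vector(n, r=2):
        """C[0] = cost of computing n into memory; C[i] = cost into a register
        with i registers available."""
        if n.is_leaf:                        # operand already resides in memory
            return [0] + [1] * r             # a load Ri := Mj costs 1
        c1, c2 = cost_vector(n.left, r), cost_vector(n.right, r)
        C = [INF] * (r + 1)
        for i in range(1, r + 1):
            best = c1[i] + c2[0] + 1         # Ri := Ri op Mj, right operand from memory
            if i >= 2:
                best = min(best,
                           c1[i] + c2[i - 1] + 1,   # left subtree first, then Ri := Ri op Rj
                           c2[i] + c1[i - 1] + 1)   # right subtree first, then Ri := Ri op Rj
            C[i] = best
        C[0] = min(C[1:]) + 1                # compute into a register, then store Mi := Ri
        return C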

An Example
(figure: the same tree with its cost vectors; the minimum cost of the whole expression is C[2] = 7 at the root)
The corresponding code:
    R0 := c
    R1 := d
    R1 := R1 / e
    R0 := R0 * R1
    R1 := a
    R1 := R1 - b
    R1 := R1 + R0

Code Generators
• A tool to automatically construct the instruction selection phase of a code generator
• Such tools may use tree grammars or context-free grammars to describe the target machine
• Register allocation is implemented as a separate mechanism

Tree Rewriting
    a[i] := b + 1
(figure: the intermediate-code tree for the assignment: the root := assigns the value memb + const1 to the location ind((consta + regsp) + ind(consti + regsp)), where a and i live at offsets consta and consti from the stack pointer SP)

Tree Rewriting
• The code is generated by reducing the input tree to a single node using a sequence of tree-rewriting rules
• Each tree-rewriting rule has the form
      replacement ← template { action }
  – replacement is a single node
  – template is a tree
  – action is a code fragment
• A set of tree-rewriting rules is called a tree translation scheme

An Example
    regi ← +(regi, regj)    { ADD Rj, Ri }
Each tree template represents a computation performed by the sequence of machine instructions emitted by the associated action
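
A tiny Python sketch (not the tool's actual machinery) of matching such a template against an intermediate-code tree; templates are written as nested tuples whose leaves name the nonterminal a subtree must already have been reduced to, and node objects are assumed to expose kind and children.

    def matches(template, node):
        # template: either a nonterminal name ("reg", "mem", "const", ...)
        # or a tuple (operator, child_template, ...)
        if isinstance(template, str):
            return node.kind == template
        return (node.kind == template[0]
                and len(node.children) == len(template) - 1
                and all(matches(t, c) for t, c in zip(template[1:], node.children)))

    # The rule above would use the template ("+", "reg", "reg"); a real generator
    # pairs each template with its replacement and action and chooses among
    # overlapping matches using cost information.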

Tree Rewriting Rules
    (1) regi ← constc                   { MOV #c, Ri }
    (2) regi ← mema                     { MOV a, Ri }
    (3) mem  ← := (mema, regi)          { MOV Ri, a }
    (4) mem  ← := (ind(regi), regj)     { MOV Rj, *Ri }
    (5) regi ← ind(+(constc, regj))     { MOV c(Rj), Ri }

Tree Rewriting Rules
    (6) regi ← +(regi, ind(+(constc, regj)))    { ADD c(Rj), Ri }
    (7) regi ← +(regi, regj)                    { ADD Rj, Ri }
    (8) regi ← +(regi, const1)                  { INC Ri }

An Example
(figure: the tree for a[i] := b + 1; rule (1) matches the leaf consta and rewrites it to reg0)
    (1) { MOV #a, R0 }

An Example
(figure: rule (7) now matches +(reg0, regsp) and rewrites it to reg0)
    (7) { ADD SP, R0 }

An Example
(figure: two reductions are now possible for the subtree +(reg0, ind(+(consti, regsp))): rule (6) covers it in one step, or rule (5) first reduces ind(+(consti, regsp)) to reg1)
    (6) { ADD i(SP), R0 }
    (5) { MOV i(SP), R1 }

An Example
(figure: the tree is now := (ind(reg0), +(memb, const1)); rule (2) rewrites memb to reg1)
    (2) { MOV b, R1 }

An Example
(figure: the tree is now := (ind(reg0), +(reg1, const1)); rule (8) rewrites the + node to reg1)
    (8) { INC R1 }

An Example
(figure: the remaining tree := (ind(reg0), reg1) is reduced to a single mem node by rule (4))
    (4) { MOV R1, *R0 }

Tree Pattern Matching
• The tree pattern matching algorithm can be implemented by extending the multiple-keyword pattern matching algorithm
• Each tree template is represented by a set of strings, each of which represents a path from the root to a leaf
• Each rule is associated with cost information
• The dynamic programming algorithm can be used to select an optimal sequence of matches

Semantic Predicates
    regi ← +(regi, constc)    { if c = 1 then INC Ri else ADD #c, Ri }
The general use of semantic actions and predicates can provide greater flexibility and ease of description than a purely grammatical specification

Graph Coloring
• In the first pass, target machine instructions are selected as though there were an infinite number of symbolic registers
• In the second pass, physical registers are assigned to the symbolic registers using graph coloring algorithms
• During the second pass, if a register is needed when all available registers are in use, some of the used registers must be spilled

Interference Graph
• For each procedure, a register-interference graph is constructed
• The nodes of the graph are symbolic registers
• An edge connects two nodes if one is live at a point where the other is defined
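
A rough Python sketch (not from the slides) of building that graph, assuming each instruction records the symbolic registers it defines (defs) and the set live immediately after it (live_out) from a prior liveness analysis.

    from collections import defaultdict

    def build_interference(instrs):
        adj = defaultdict(set)
        for ins in instrs:
            for d in ins.defs:
                for v in ins.live_out:
                    if v != d:                 # d is defined where v is live: they interfere
                        adj[d].add(v)
                        adj[v].add(d)
        return adj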

K-Colorable Graphs
• A graph is said to be k-colorable if each node can be assigned one of k colors such that no two adjacent nodes have the same color
• A color represents a register
• The problem of determining whether a graph is k-colorable is NP-complete

A Graph Coloring Algorithm
• Remove a node n and its edges if it has fewer than k neighbors
• Repeat the removal step until we end up with the empty graph, or with a graph in which every node has k or more adjacent nodes
• In the latter case, a node is selected and spilled by deleting that node and its edges, and the removal step continues

A Graph Coloring Algorithm
• The nodes of the graph can be colored in the reverse of the order in which they were removed
• Each node can be assigned a color not assigned to any of its neighbors
• Spilled nodes can be assigned any color
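
A compact Python sketch (not from the slides) of this simplify-then-select heuristic; adj maps each symbolic register to the set of its neighbours, and the spill choice (highest remaining degree) is just one possible heuristic. Spilled nodes are left uncolored here rather than being given an arbitrary color.

    def color_graph(adj, k):
        work = {n: set(nbrs) for n, nbrs in adj.items()}   # mutable working copy
        stack, spilled = [], set()
        while work:
            n = next((x for x in work if len(work[x]) < k), None)   # low-degree node
            if n is None:                                  # none left: pick a spill candidate
                n = max(work, key=lambda x: len(work[x]))
                spilled.add(n)
            stack.append(n)
            for m in work.pop(n):                          # remove n and its edges
                work[m].discard(n)
        color = {}
        for n in reversed(stack):                          # color in reverse removal order
            if n in spilled:
                continue                                   # spilled register lives in memory
            used = {color[m] for m in adj[n] if m in color}
            color[n] = next(c for c in range(k) if c not in used)
        return color, spilled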

An Example
(figure: a small interference graph, showing the number of remaining neighbors at each step as low-degree nodes are removed one by one)

An Example
(figure: the same graph colored with three colors: each node is assigned B, G, or R so that no two adjacent nodes share a color)

Peephole Optimization
• Improve the performance of the target program by examining and transforming a short sequence of target instructions at a time
• May need repeated passes over the code
• Can also be applied directly after intermediate code generation

Examples
• Redundant loads and stores
      MOV R0, a
      MOV a, R0
  the second instruction can be deleted if it is not the target of a jump
• Algebraic simplification: eliminate statements like x := x + 0 and x := x * 1
• Constant folding
      x := 2 + 3        becomes    x := 5
      y := x + 3        becomes    y := 8
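
A toy Python sketch (not from the slides) of the redundant-load case, with instructions kept as (op, src, dst) tuples; it assumes the reloading instruction is not a jump target.

    def peephole(code):
        out = []
        for instr in code:
            if (out and instr[0] == "MOV" and out[-1][0] == "MOV"
                    and out[-1][1] == instr[2]       # the register just stored ...
                    and out[-1][2] == instr[1]):     # ... is reloaded from the same place
                continue                             # drop the redundant load
            out.append(instr)
        return out

    # peephole([("MOV", "R0", "a"), ("MOV", "a", "R0")]) -> [("MOV", "R0", "a")]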

Examples
• Unreachable code
      #define debug 0
      if (debug) { print debugging information }
  gives the intermediate code
      if 0 <> 1 goto L1
      print debugging information
      L1:
  since 0 <> 1 always holds, this becomes
      goto L1
      print debugging information
      L1:
  and the now unreachable print statement can be removed

Examples
• Flow-of-control optimization
      goto L1                            goto L2
      …                    becomes       …
      L1: goto L2                        L1: goto L2

      goto L1                            if a < b goto L2
      …                    becomes       goto L3
      L1: if a < b goto L2               …
      L3:                                L3:

Examples
• Reduction in strength: replace expensive operations by cheaper ones
  – x^2  ⇒  x * x
  – fixed-point multiplication and division by a power of 2  ⇒  shift
  – floating-point division by a constant  ⇒  floating-point multiplication by a constant

Examples
• Use of machine idioms: hardware instructions for certain specific operations
  – auto-increment and auto-decrement addressing modes (e.g. pushing or popping the stack in parameter passing)