HW 2 Frequent Path Loop Invariant Code Motion

HW 2 – Frequent Path Loop Invariant Code Motion
Ze Zhang
Oct 1, 2018

Loop Invariant Code Motion (LICM)

• Move operations whose source operands do not change within the loop to the loop preheader
  – Execute them only once per invocation of the loop
  – Be careful with memory operations!
  – Be careful with ops not executed every iteration
• LICM code exists in LLVM!
  – /lib/Transforms/Scalar/LICM.cpp

Before LICM:

    for (int i = 0; i < n; i++) {
        x = y + z;
        a[i] = 6 * i + x * x;
    }

After LICM:

    x = y + z;
    t1 = x * x;
    for (int i = 0; i < n; i++) {
        a[i] = 6 * i + t1;
    }
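To check the transformation on this slide by hand, the minimal sketch below runs both versions of the loop so that the resulting arrays can be compared. The function names `fill_original` and `fill_hoisted` and the use of `std::vector` are assumptions made for illustration; they are not part of the slides or of LLVM.

```cpp
#include <cstddef>
#include <vector>

// Original loop: x = y + z and x * x are recomputed on every iteration.
std::vector<int> fill_original(std::size_t n, int y, int z) {
  std::vector<int> a(n);
  for (std::size_t i = 0; i < n; i++) {
    int x = y + z;
    a[i] = 6 * static_cast<int>(i) + x * x;
  }
  return a;
}

// Hand-applied LICM: both invariant computations sit before the loop,
// where a compiler would place them in the loop preheader.
std::vector<int> fill_hoisted(std::size_t n, int y, int z) {
  std::vector<int> a(n);
  int x = y + z;
  int t1 = x * x;
  for (std::size_t i = 0; i < n; i++) {
    a[i] = 6 * static_cast<int>(i) + t1;
  }
  return a;
}
```

Both functions should return equal vectors for any `n`, `y`, and `z` for which the arithmetic does not overflow. Hoisting is safe here because `y` and `z` are never written inside the loop.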


Your Assignment: Frequent Path LICM

Before FPLICM (profile weights: 100 on the edge that skips the store block, 1 on the edges into and out of it):

    Preheader:
        r1 = &A
    Loop header:
        r4 = load(r1)
        r7 = r4 * 3
        r3 = r3 + r5
    Infrequent block (weight 1):
        r2 = r2 + 1
        store(r2, r1)
    Last block (reached from the loop header with weight 100):
        r8 = r2 + 7
        store(r3, r8)

Cannot perform LICM on the load because of the store-load dependency. But… profile data says that the store rarely happens.

Frequent Path LICM:
1) Ignore infrequent dependence between loads and stores
2) Perform LICM on load
3) Perform LICM on any consumers of the load that become invariant
4) Insert fix-up code to restore correct execution

After FPLICM:

    Preheader:
        r1 = &A
        r4 = load(r1)
        r7 = r4 * 3
    Loop header:
        r3 = r3 + r5
    Infrequent block (weight 1):
        r2 = r2 + 1
        store(r2, r1)
        r4 = load(r1)      (fix-up)
        r7 = r4 * 3        (fix-up)
    Last block (reached from the loop header with weight 100):
        r8 = r2 + 7
        store(r3, r8)
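The four steps can be tried at source level before touching LLVM IR. The sketch below is a hypothetical analog of the slide's example, not the assignment solution: `a` plays the role of the memory at `r1`, `v` and `t` stand for `r4` and `r7`, and the `rare` vector (an assumption) marks the iterations that take the infrequent path. Accumulating `t` into `sum` is also an assumption, added only so that the effect of the hoisted values is observable.

```cpp
#include <vector>

// Before FPLICM: the load of `a` stays in the loop because the
// infrequent path stores to the same location.
long run_before(int &a, const std::vector<bool> &rare) {
  long sum = 0;
  for (bool take : rare) {
    int v = a;      // r4 = load(r1)
    int t = v * 3;  // r7 = r4 * 3
    sum += t;
    if (take) {     // infrequent path
      a = a + 1;    // store to the loaded address
    }
  }
  return sum;
}

// After FPLICM: load and consumer are hoisted; the infrequent path
// re-executes both as fix-up code right after the store.
long run_after(int &a, const std::vector<bool> &rare) {
  long sum = 0;
  int v = a;        // hoisted load
  int t = v * 3;    // hoisted consumer
  for (bool take : rare) {
    sum += t;
    if (take) {
      a = a + 1;
      v = a;        // fix-up: redo the load
      t = v * 3;    // fix-up: redo the consumer
    }
  }
  return sum;
}
```

Note that `v` and `t` are assigned in two places in `run_after`. That is fine in C++ but not in SSA form, which is why the IR version needs stack variables for the hoisted results.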

Your Assignment: Frequent Path LICM

• Identify the frequent path within the loop
• Find store instructions among all infrequent BBs and their dependent load instructions in frequent BBs
  – destination operand of infrequent store = source operand of frequent load
• Hoist the load instruction
• Hoist consumers of the load that become invariant*
• Replicate all hoisted instructions in the infrequent path
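The store/load matching step can be prototyped on a toy representation before writing it against LLVM IR. In the sketch below, `ToyInst` and `almost_invariant_loads` are hypothetical names, not LLVM APIs. Addresses are compared by operand name only, with no alias analysis, and the extra check that no store on the frequent path writes the same address is an assumption that the slide does not spell out.

```cpp
#include <string>
#include <vector>

// Toy instruction record; this is NOT the LLVM IR API.
struct ToyInst {
  enum Kind { Load, Store, Other } kind;
  std::string addr;  // address operand: load source or store destination
  bool frequent;     // true if the containing block is on the frequent path
};

// Returns the indices of frequent-path loads whose address is written by
// at least one infrequent store and by no frequent store.
std::vector<int> almost_invariant_loads(const std::vector<ToyInst> &loop) {
  std::vector<int> result;
  for (int i = 0; i < static_cast<int>(loop.size()); i++) {
    const ToyInst &ld = loop[i];
    if (ld.kind != ToyInst::Load || !ld.frequent) continue;
    bool infrequent_store = false;
    bool frequent_store = false;
    for (const ToyInst &st : loop) {
      if (st.kind != ToyInst::Store || st.addr != ld.addr) continue;
      if (st.frequent) {
        frequent_store = true;
      } else {
        infrequent_store = true;
      }
    }
    if (infrequent_store && !frequent_store) result.push_back(i);
  }
  return result;
}
```

Fed the loop from the earlier example (a frequent `load(r1)`, an infrequent store to `r1`, and a frequent store to `r8`), the function reports only the load as a hoisting candidate.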

LLVM Code of Interest

• The following slides present code from the LLVM codebase that may help you with HW 2.
• Disclaimers:
  – Use of this code is by no means required. There are many ways to do this assignment.
  – You are free to use any other code that exists in LLVM 6.0.1 or that you develop.
  – Read the documentation/source before asking for help!
    http://llvm.org/docs/ProgrammersManual.html#helpful-hints-for-common-operations

Code: Manipulating Basic Blocks

• SplitBlock(…) splits a BB at a specified instr, returns a ptr to the new BB that starts with the instr, and connects the BBs with an unconditional branch
• SplitEdge(…) will insert a BB between two specified BBs

    // I is an Instruction*
    BasicBlock *BB1 = I->getParent();
    BasicBlock *BB3 = SplitBlock(BB1, I);
    BasicBlock *BB2 = SplitEdge(BB1, BB3);

• Code found in:
  – <llvm-src-root>/include/llvm/Transforms/Utils/BasicBlockUtils.h
  – <llvm-src-root>/lib/Transforms/Utils/BasicBlockUtils.cpp
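To make the described behavior of SplitBlock concrete without an LLVM build, here is a toy model of the operation. `ToyBlock` and `split_block` are hypothetical stand-ins, not the LLVM classes or functions; the model only mirrors the slide's description (the new block starts at the split instruction, and the old block ends with an unconditional branch to it).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy basic block: a name plus instruction strings. NOT llvm::BasicBlock.
struct ToyBlock {
  std::string name;
  std::vector<std::string> insts;
};

// Moves insts[pos..end) into a new block and terminates the old block with
// an unconditional branch to it. Requires pos <= old_bb.insts.size().
ToyBlock split_block(ToyBlock &old_bb, std::size_t pos,
                     const std::string &new_name) {
  auto split_at = old_bb.insts.begin() + static_cast<std::ptrdiff_t>(pos);
  ToyBlock tail{new_name, {}};
  tail.insts.assign(split_at, old_bb.insts.end());
  old_bb.insts.erase(split_at, old_bb.insts.end());
  old_bb.insts.push_back("br " + new_name);
  return tail;
}
```

Splitting a block holding `a`, `b`, `c` at position 1 leaves the old block with `a` followed by the branch, and the new block with `b` and `c`.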

Code: Creating and Inserting Instructions

• Various ways to create & insert instructions
• Hint: Instructions have a clone() member function
• See specific instruction constructors/member functions in:
  – <llvm-src-root>/include/llvm/IR/Instructions.h
• See general instruction functions available to all instructions in:
  – <llvm-src-root>/include/llvm/IR/Instruction.h

    // 1) create load, insert at end of
    //    specified basic block
    LoadInst *LD = new LoadInst(Val, "loadflag", BB1);

    // 2) create branch using Create
    //    method, insert before BB1's
    //    terminating instruction
    //    (a block may keep only one terminator,
    //    so the old one must then be removed)
    BranchInst::Create(BB1, BB2, flag, BB1->getTerminator());

    // 3) create a store inst that stores
    //    result of LD to some variable
    //    (related to next slide)
    StoreInst *ST = new StoreInst(LD, var);
    // inserting store into code
    ST->insertAfter(LD);

Code: Creating Variables

• Use AllocaInst to allocate space on the function's stack frame for a variable

    // 1) Create a variable in the
    //    function Entry block
    AllocaInst *Val = new AllocaInst(
        I->getType(), 0, Entry->getTerminator()
    );

    // 2) store to the variable
    StoreInst *ST = new StoreInst(
        Result, Val, Entry->getTerminator()
    );

Important: Maintaining SSA Form

• Static Single Assignment form requires unique destination registers for each instruction
  – Replicated instructions in your infrequent BB will write to different regs compared to the instructions in the preheader!
  – Store results of hoisted instrs to stack variables (see prev. slide)
  – Make sure AllocaInsts are in the function's entry BB!

Related Files

• run.sh
  – List of commands used in HW 2
• Project Template
  – HW2PASS.cpp: Mostly from the current LLVM LICM implementation.
  – runOnLoop(…), hoistRegion(…), hoist(…)
• Benchmarks
  – 6 correctness tests + README (Required)
    • Only need to hoist the dependent load instructions
    • Must generate the correct output after applying your FPLICM pass
  – 4 performance tests + README (Optional)
    • Hoist as many instructions as possible
    • Correctness first, then performance

General Notes Regarding HW 2

• Start early!
• Use the template (don't be afraid of it)
• Try the bonus part
• Running/Debugging
  – Revisit information from LLVM overview slides
• Performance Competition: Generate correct AND fast bitcode