Research in Compilers and Introduction to Loop Transformations

Part I: Compiler Research
Tomofumi Yuki
EJCP 2017, June 29, Toulouse

Background
  • Defended Ph.D. in C.S. in October 2012
    • Colorado State University
    • Advisor: Dr. Sanjay Rajopadhye
  • Currently Inria Chargé de Recherche
    • Bretagne-Atlantique, Rennes, CAIRN team
  • Optimizing compiler + programming language
    • static analysis (polyhedral model)
    • parallel programming models
    • High-Level Synthesis

What is this Course About?
  • Research in compilers
    • a bit about the compiler itself
  • Understand compiler research
    • what are the problems?
    • what are the techniques?
    • what are the applications?
  • Be able to (partially) understand the work by “compiler people” at conferences
    • maybe do research in compilers later on!

Compiler Advances
  • Old compiler vs recent compiler
    • modern architecture
    • gcc -O3 vs gcc -O0
  • How much speedup by compiler alone after 45 years of research?

Proebsting’s Law
  • Compiler Advances Double Computing Power Every 18 Years
    • http://proebsting.cs.arizona.edu/law.html
  • Someone actually tried it:
    • On Proebsting’s Law, Kevin Scott, 2001
    • SPEC95, compared against -O0
    • 3.3x for int
    • 8.1x for float
    • (HW gives 60%/year)
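The arithmetic behind these figures can be checked with a short sketch. It converts a total speedup into a compound annual rate and a doubling time. The 45-year span is an assumption taken from the "45 years of research" figure in these slides, not necessarily the span used in Scott's report, and the function names are illustrative only.

```python
import math

def annual_rate(speedup, years):
    """Compound yearly improvement that produces `speedup` after `years`."""
    return speedup ** (1.0 / years) - 1.0

def doubling_time(speedup, years):
    """Years needed to double performance at that compound rate."""
    return years * math.log(2) / math.log(speedup)

YEARS = 45  # assumption: span quoted in the slides

for label, speedup in [("int", 3.3), ("float", 8.1)]:
    rate = annual_rate(speedup, YEARS)
    double = doubling_time(speedup, YEARS)
    print(f"{label}: {rate:.1%} per year, doubles every {double:.1f} years")

# Hardware at 60% per year, for comparison
hw_double = math.log(2) / math.log(1.6)
print(f"hardware: doubles every {hw_double:.1f} years")
```

Under this assumption the compiler contribution doubles performance roughly every 26 years for integer code and every 15 years for floating point, which brackets the 18-year figure, while hardware at 60% per year doubles in about a year and a half.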

Compiler Advances
  • Old compiler vs recent compiler
    • modern architecture
    • gcc -O3 vs gcc -O0
  • 3~8x difference after 45 years
  • Not so much?

“The most remarkable accomplishment by far of the compiler field is the widespread use of high-level languages.”
by Mary Hall, David Padua, and Keshav Pingali [Compiler Research: The Next 50 Years, 2009]

Earlier Accomplishments
  • Getting efficient assembly
    • register allocation
    • instruction scheduling
    • ...
  • High-level language features
    • object-orientation
    • dynamic types
    • automated memory management
    • ...

What is Left?
  • Parallelism
    • multi-cores, GPUs, ...
    • language features for parallelism
  • Security/Reliability
    • verification
    • certified compilers
  • Power/Energy
    • data movement
    • voltage scaling

Agenda for today
  • Part I: What is Compiler Research?
  • Part II: Compiler Optimizations
  • Lab: Introduction to Loop Transformations

What is a Compiler?
  • Bridge between “source” and “target”

    source --(compile)--> target

Compiler vs Assembler
  • What are the differences?

    source   --(compile)-->  target
    assembly --(assemble)--> object

Compiler vs Assembler
  • Compiler
    • Many possible targets (semi-portable)
    • Many decisions are taken
  • Assembler
    • Specialized output (non-portable)
    • Usually a “translation”

Goals of the Compiler
  • Higher abstraction
    • No more writing assembly!
    • enables language features
    • loops, functions, classes, aspects, ...
  • Performance
    • while increasing productivity
    • speed, space, energy, ...
    • compiler optimizations

Productivity vs Performance
  • Higher Abstraction ≈ Less Performance

[Figure: languages placed between two axes, Abstraction and Performance, in the order Python, Java, C, Fortran, Assembly]

Productivity vs Performance
  • How much can you regain?

[Figure: Abstraction vs Performance chart with Python, Java, C, Fortran, Assembly; C is shown twice]

Productivity vs Performance
  • How sloppy can you write code?

[Figure: Abstraction vs Performance chart with Python, Java, C, Fortran, Assembly; C is shown twice]

Compiler Research
  • Branch of Programming Languages
    • Program Analysis, Transformations
    • Formal Semantics
    • Type Theory
    • Runtime Systems
    • Compilers
    • ...

New HW Needs New Compiler
  • New Architecture
    • IBM Cell, GPU, Xeon-Phi, Kalray MPPA, ...
    • which ones succeeded?
  • Good prog. model and compiler
    • easier to fully utilize new HW
    • crucial for success
  • HW vendors invest a lot into compilers

Examples
  • Two classical compiler optimizations
    • register allocation
    • instruction scheduling

Case 1: Register Allocation
  • Classical optimization problem

Source:
    C = A + B;
    D = B + C;

Naïve translation (3 registers, 8 instructions):
    load  %r1, A
    load  %r2, B
    add   %r3, %r1, %r2
    store %r3, C
    load  %r1, B
    load  %r2, C
    add   %r3, %r1, %r2
    store %r3, D

Smart compilation (2 registers, 6 instructions):
    load  %r1, A
    load  %r2, B
    add   %r1, %r1, %r2
    store %r1, C
    add   %r1, %r2, %r1
    store %r1, D
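One way to convince yourself that the two instruction sequences are equivalent is to run both on a toy register machine. The sketch below is only an illustration: the tuple encoding and the `run` helper are hypothetical, and the first `add` of the shorter sequence is written in three-operand form (`add %r1, %r1, %r2`), which is an assumption about what the slide intends.

```python
def run(program, memory):
    """Execute (op, *args) tuples; return final memory and registers touched."""
    regs = {}
    mem = dict(memory)
    for op, *args in program:
        if op == "load":            # load reg, addr
            reg, addr = args
            regs[reg] = mem[addr]
        elif op == "add":           # add dst, src1, src2
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]
        elif op == "store":         # store reg, addr
            reg, addr = args
            mem[addr] = regs[reg]
        else:
            raise ValueError(f"unknown op: {op}")
    return mem, len(regs)

naive = [
    ("load", "r1", "A"), ("load", "r2", "B"),
    ("add", "r3", "r1", "r2"), ("store", "r3", "C"),
    ("load", "r1", "B"), ("load", "r2", "C"),
    ("add", "r3", "r1", "r2"), ("store", "r3", "D"),
]
smart = [
    ("load", "r1", "A"), ("load", "r2", "B"),
    ("add", "r1", "r1", "r2"), ("store", "r1", "C"),   # r1 now holds C
    ("add", "r1", "r2", "r1"), ("store", "r1", "D"),   # B and C are still in registers
]

memory = {"A": 3, "B": 4}            # arbitrary test inputs
naive_mem, naive_regs = run(naive, memory)
smart_mem, smart_regs = run(smart, memory)
print(naive_mem == smart_mem, naive_regs, smart_regs, len(naive), len(smart))
```

Both runs leave the same values in C and D; the second needs one register fewer and saves the two reloads of B and C.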

Register Allocation in 5 min.
  • Often viewed as graph coloring
  • Live Range: when a value is “in use”
  • Interference: both values are “in use”
    • e.g., two operands of an instruction
  • Coloring: conflicting nodes are assigned different registers

[Figure: live range analysis of a, b, c, d for the code below, and the resulting interference graph]

    c = a + b;
    d = b + c;

    add %r1, %r2, %r1
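The graph-coloring view can be sketched in a few lines for straight-line code. This is a minimal illustration, not a production allocator: it assumes no value is live after the last statement (results are stored to memory, as in Case 1), lets a result reuse the register of an operand that dies in the same instruction, and uses a simple greedy coloring that is not optimal in general. All function names are hypothetical.

```python
from itertools import combinations

def interference_graph(stmts, live_out=()):
    """stmts: list of (dest, [sources]). Values live at the same time interfere."""
    graph = {v: set() for dest, srcs in stmts for v in (*srcs, dest)}
    live = set(live_out)
    cliques = []
    for dest, srcs in reversed(stmts):          # backward liveness scan
        cliques.append(live | {dest})           # dest coexists with values live after it
        live = (live - {dest}) | set(srcs)
    cliques.append(live)                        # values live on entry
    for clique in cliques:
        for u, v in combinations(sorted(clique), 2):
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph

def greedy_coloring(graph):
    """Give each node the smallest color not used by an already colored neighbor."""
    colors = {}
    for node in graph:
        taken = {colors[n] for n in graph[node] if n in colors}
        colors[node] = next(c for c in range(len(graph)) if c not in taken)
    return colors

stmts = [("c", ["a", "b"]), ("d", ["b", "c"])]
graph = interference_graph(stmts)
colors = greedy_coloring(graph)
print(graph)
print(colors, "registers needed:", max(colors.values()) + 1)
```

For this example the only edges are a-b and b-c, so two colors suffice: a, c and d can share one register while b takes the other, matching the two-register code of Case 1.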

Register Allocation in 5 min.
  • Registers are limited

[Figure: live ranges of a, b, c, d, x, y before splitting, and of a, b, c, d, x, z, y after splitting]

Original code:
    c = a + b;
    d = b + c;
    x = c + d;
    y = a + x;

After live range splitting:
    a = load A;
    c = a + b;
    d = b + c;
    x = c + d;
    z = load A;
    y = z + x;
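The effect of splitting can be measured as register pressure, i.e. the largest number of values that are live at the same time. The sketch below is a rough model under stated assumptions: straight-line code, nothing live after the last statement, and `b` (plus `a` in the original version) already in a register on entry. The `max_pressure` helper is hypothetical.

```python
def max_pressure(stmts, live_out=()):
    """Largest number of simultaneously live values in straight-line code.

    stmts: list of (dest, [sources]); loads are modeled with no sources.
    """
    live = set(live_out)
    peak = len(live)
    for dest, srcs in reversed(stmts):              # backward liveness scan
        peak = max(peak, len(live | {dest}))        # just after the statement
        live = (live - {dest}) | set(srcs)
        peak = max(peak, len(live))                 # just before the statement
    return peak

original = [
    ("c", ["a", "b"]),
    ("d", ["b", "c"]),
    ("x", ["c", "d"]),
    ("y", ["a", "x"]),      # a stays live across the whole block
]
split = [
    ("a", []),              # a = load A
    ("c", ["a", "b"]),
    ("d", ["b", "c"]),
    ("x", ["c", "d"]),
    ("z", []),              # z = load A (reload instead of keeping a alive)
    ("y", ["z", "x"]),
]

before = max_pressure(original)
after = max_pressure(split)
print("pressure before:", before, "after:", after)
```

In this model the original code needs three registers because `a` is kept alive until the last statement, while the split version fits in two at the cost of one extra load.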

Research in Register Allocation
  • How to do a good allocation (“Solved”)
    • which variables to split
    • which values to spill
  • How to do it fast?
    • Graph-coloring is expensive
    • Just-in-Time compilation

Case 2: Instruction Scheduling
  • Another classical problem

Source:
    X = A * B * C;
    Y = D * E * F;

Naïve translation (pipeline stall if mult. takes 2 cycles):
    R = A * B;
    X = R * C;
    S = D * E;
    Y = S * F;

Smart compilation:
    R = A * B;
    S = D * E;
    X = R * C;
    Y = S * F;

  • Also done in hardware (out-of-order)
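The stall can be made concrete with a toy pipeline model. The sketch assumes a single-issue, in-order machine where every multiply takes 2 cycles and an instruction waits until its operands are ready; the `simulate` helper and this timing model are assumptions made for illustration, not a description of any real processor.

```python
def simulate(program, latency=2):
    """Return (cycle when the last result is ready, number of stall cycles).

    program: list of (dest, [sources]); inputs not produced here are ready at cycle 0.
    """
    ready = {}          # value name -> cycle at which it becomes available
    cycle = 0           # next free issue slot
    stalls = 0
    for dest, srcs in program:
        operands_ready = max((ready.get(s, 0) for s in srcs), default=0)
        issue = max(cycle, operands_ready)      # wait for operands if needed
        stalls += issue - cycle
        ready[dest] = issue + latency
        cycle = issue + 1
    return max(ready.values()), stalls

naive = [("R", ["A", "B"]), ("X", ["R", "C"]), ("S", ["D", "E"]), ("Y", ["S", "F"])]
smart = [("R", ["A", "B"]), ("S", ["D", "E"]), ("X", ["R", "C"]), ("Y", ["S", "F"])]

naive_result = simulate(naive)
smart_result = simulate(smart)
print("naive:", naive_result, "smart:", smart_result)
```

In this model the naïve order stalls once after each independent multiply, while the reordered version fills those slots with the other chain and finishes two cycles earlier.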

Research in Instruction Scheduling
  • Not much anymore for speed/parallelism
    • beaten to death
    • hardware does it for you
  • Remains interesting in specific contexts
    • faster methods for JIT
    • energy optimization
    • “predictable” execution
    • in-order cores, VLIW, etc.

Case 1+2: Phase Ordering
  • Yet another classical problem
    • practically no solution
  • Given optimization A and B
    • A after B vs A before B
    • which order is better?
    • can you solve the problem globally?
  • Parallelism requires more memory
    • trade-off: register pressure vs parallelism

Job Market
  • Where do they work?
    • Intel / IBM Research
    • Apple
    • Mathworks
    • Amazon
    • Xilinx
  • Many opportunities in France
    • Mathworks @ Grenoble
    • Many start-ups