An Efficient Compiler Technique for Code Size Reduction

An Efficient Compiler Technique for Code Size Reduction using Reduced Bit-width ISAs
S. Ashok Halambi, Aviral Shrivastava, Partha Biswas, Nikil Dutt, Alex Nicolau
Center for Embedded Computer Systems, University of California, Irvine, USA

Outline
• Introduction to rISA
• Challenges
• Problem definition
• Existing approach
• Our approach
• Architectural Model for rISA
• Compiling for rISA
• Summary
• Future directions

Introduction
• Code size is a critical design factor for many embedded applications.
• A “reduced bit-width Instruction Set Architecture” is a promising architectural feature for code size reduction.
  – Support for a “reduced bit-width Instruction Set” along with the normal IS.
• Many contemporary processors use this feature:
  – ARM7TDMI, MIPS, ST100, ARC-Tangent.

Reduced Bit-width Instruction Set
• The “reduced bit-width Instruction Set”, along with the supporting hardware, is termed the “reduced bit-width Instruction Set Architecture (rISA)”.
• rISA features
  – Instructions from both instruction sets reside in memory.
  – rISA instructions are dynamically expanded to normal instructions before or during the decode stage.
  – Only normal instructions are executed.

rISA
• The most frequently occurring instructions are compressed to make the reduced bit-width Instruction Set.
• Each rISA instruction maps to a unique normal instruction.
  – Simple and fast lookup-table-based “translator” logic.
  – Can be implemented without increasing cycle length or adding a cycle penalty.
• Achieves good code size reduction without much architectural modification.
  – Best case: 50% code size reduction.
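The lookup-table translation described above can be sketched in a few lines. This is only an illustration: the opcode names (`add_r`, `add`, and so on) and the tuple representation of an instruction are assumptions made for the example, not part of any real rISA.

```python
# Hypothetical table: each rISA opcode maps to exactly one normal opcode.
# The opcode names are illustrative, not taken from a real instruction set.
RISA_TO_NORMAL = {"add_r": "add", "mul_r": "mul", "sub_r": "sub"}


def expand(risa_instr):
    """Expand one rISA instruction, given as (opcode, operands), to its normal form."""
    opcode, operands = risa_instr
    return (RISA_TO_NORMAL[opcode], operands)


# Expansion is a single table lookup per instruction; operands pass through unchanged.
expanded = [expand(i) for i in [("add_r", (1, 2, 4)), ("mul_r", (3, 1, 2))]]
```

Because every rISA opcode has exactly one entry, expansion needs no search or context, which is consistent with the claim that the translator can be simple and fast.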

Architectures supporting rISA
• ARM7TDMI
  – 32-bit normal IS and 16-bit rIS.
  – Switching between normal and rISA instructions is done by the BX (Branch Exchange) instruction.
  – Basic-block-level granularity.
  – Kwon et al. made each rISA instruction write to a partition of the register file.
• MIPS
  – 32-bit normal IS and 16-bit rIS.
  – Switching between normal and rISA instructions is done implicitly by code alignment.
  – A routine not aligned to a word boundary is treated as rISA instructions.
  – Routine-level granularity.
• The ST100 from STMicro and the Tangent ARC core also support rISA.

Bit-width Restrictions
• Only a few instructions in the rIS.
  – Not all normal instructions can be converted to rISA instructions.
  – 7-bit opcodes in a 3-address ARM Thumb instruction.
• Operands of rISA instructions can access only a part of the register file.
  – Code in terms of rISA instructions has high register pressure, causing extra move/load/store instructions.
  – 3-address instructions in ARM Thumb can access only 8 registers (out of 16).

Challenges in code generation
[Figure: 16-bit rISA instruction format with a 7-bit opcode field and 3-bit operand fields]
• Fewer opcodes (7-bit opcode field).
• Accessibility to only 8 registers (3-bit operand fields).
• Register pressure increases in the block that contains rISA instructions, resulting in
  – Increased code size because of spilling.
  – Performance degradation.
• Estimating the code size increase due to spilling before register allocation is difficult.
  – A heuristic to estimate spill code because of rISA might be useful.
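To make the field-width restrictions concrete, here is a minimal sketch of packing a 3-address instruction into 16 bits. The field order (opcode in the top 7 bits, then three 3-bit register fields) is an assumption chosen for the example; the slides give only the field widths.

```python
OPCODE_BITS = 7  # at most 128 distinct opcodes
REG_BITS = 3     # each operand field can name only registers 0..7


def encode_risa(opcode, rd, rs, rt):
    """Pack a 3-address instruction into 16 bits, or return None if a field does not fit.

    Assumed layout: [opcode:7][rd:3][rs:3][rt:3].
    """
    if not 0 <= opcode < (1 << OPCODE_BITS):
        return None
    if any(not 0 <= r < (1 << REG_BITS) for r in (rd, rs, rt)):
        return None
    return (opcode << 9) | (rd << 6) | (rs << 3) | rt


word = encode_risa(1, 3, 1, 2)      # fits in 16 bits
rejected = encode_risa(1, 9, 1, 2)  # register 9 is not reachable with 3 bits
```

An instruction that names a register outside 0..7 cannot be encoded directly, which is why values must be moved or spilled and why register pressure rises inside rISA code.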

Problem Definition
• Compile for rISA to achieve
  – Maximum code size reduction.
  – Least degradation in performance.

Existing Compilers for rISA
• Work at routine-level or basic-block-level granularity.
  – Convert to reduced bit-width instructions only if all the instructions in the routine/basic block have mappings to rISA instructions.
• Code generation for rISA is done as a post-assembly pass or a pre-instruction-selection pass.

Our Approach
• The rISA architectural model contains a mode exchange instruction to change mode at instruction-level granularity.
• Code generation for rISA is done as part of instruction selection.
  – Tightly coupled with the compiler flow.
  – Use rISA instructions whenever profitable, even within a function.
• We term the process of code generation for rISA “rISAization”.

Advantage of Our Approach
[Figure: Functions 1 to 3 shown as 32-bit and 16-bit code under the existing approach and under our approach]
• Existing approach: function-level granularity.
• Our approach: instruction-level granularity and higher code density.

Architectural Model
• rISA instructions to normal instructions mapping.
• Explicit mode exchange instructions (mx and rISA_mx).
  – Allow instruction-level granularity for conversion to rISA instructions.
• Useful rISA instructions:
  – rISA_nop: to align the code to a word boundary.
  – rISA_move: to access all the registers in the register file and minimize spills in rISA code.
  – rISA_extend: to increase the length of the immediate in the successive instruction.
• The bit-width restrictions for the above three rISA instructions are relaxed because they have fewer operands.

Compiling for rISA
[Figure: compilation flow]
• Source File (C/C++)
• gcc Front End
• Generic Instruction Set (3-address code)
• Instruction Selection - I
• Profitability Analysis
• Augmented Instruction Set (with rISA Blocks)
• Instruction Selection - II
• Register Allocation
• Target Instruction Set (Normal + rISA)
• Assembly

Compiling for rISA – An Example
The gcc front end translates the C/C++ source file into 3-address code in the generic instruction set:
    G_ADD GR1 GR2 4
    G_MUL GR3 GR1 GR2
    G_ADD GR4 GR3 1
    G_SUB GR4 16
    G_LI GR4 200
    G_ADD GR5 GR6 GR7
    G_MUL GR9 GR8 GR6
    G_ADD GR10 GR5 GR9
    G_SUB GR11 GR10 GR7

Compiling for rISA – An Example
1. Mark instructions that can be converted to rISA instructions. (Stage: Instruction Selection - I)
    G_ADD GR1 GR2 4
    G_MUL GR3 GR1 GR2
    G_ADD GR4 GR3 1
    G_SUB GR4 16
    G_LI GR4 200
    G_ADD GR5 GR6 GR7
    G_MUL GR9 GR8 GR6
    G_ADD GR10 GR5 GR9
    G_SUB GR11 GR10 GR7
[Figure annotation: candidates for rISA instructions]

Compiling for rISA – An Example
2. Decide whether it is profitable to convert a rISA Block. (Stage: Profitability Analysis)
    G_ADD GR1 GR2 4
    G_MUL GR3 GR1 GR2
    G_ADD GR4 GR3 1
    G_SUB GR4 16
    G_LI GR4 200
    G_ADD GR5 GR6 GR7
    G_MUL GR9 GR8 GR6
    G_ADD GR10 GR5 GR9
    G_SUB GR11 GR10 GR7

Compiling for rISA – An Example
3. Replace marked instructions with rISA instructions. (Stage: Instruction Selection - II)
    T_ADD_R GR1 GR2 4
    T_MUL_R GR3 GR1 GR2
    T_ADD_R GR4 GR3 1
    T_SUB_R GR4 16
    T_MX_R
    T_LI GR4 200
    T_ADD GR5 GR6 GR7
    T_MUL GR9 GR8 GR6
    T_ADD GR10 GR5 GR9
    T_SUB GR11 GR10 GR7

Compiling for rISA – An Example
4. Perform register allocation. (Stage: Register Allocation, followed by Assembly)
    T_ADD_R TR1 TR2 4
    T_MUL_R TR3 TR1 TR2
    T_ADD_R TR4 TR3 1
    T_SUB_R TR4 16
    T_MX_R
    T_LI TR4 200
    T_ADD TR5 TR6 TR7
    T_MUL TR9 TR8 TR6
    T_ADD TR10 TR5 TR9
    T_SUB TR11 TR10 TR7

Compilation for rISA
1. Mark instructions that can be converted to rISA instructions.
  • Contiguous marked instructions form a “rISA Block”.
2. Decide whether it is profitable to convert a rISA Block.
3. Replace marked instructions with rISA instructions.
4. Perform register allocation.
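Step 1 amounts to finding maximal runs of contiguous marked instructions. The sketch below shows one way to do that. The `convertible` predicate is a stand-in supplied by the caller, and the rule used in the usage lines (reject `G_LI`) is purely illustrative, not the marking rule used by the authors.

```python
def find_risa_blocks(instrs, convertible):
    """Return (start, end) index pairs, end exclusive, of maximal runs of marked instructions."""
    blocks = []
    start = None
    for i, ins in enumerate(instrs):
        if convertible(ins):
            if start is None:
                start = i  # a new rISA Block begins here
        elif start is not None:
            blocks.append((start, i))  # the run ended at the previous instruction
            start = None
    if start is not None:
        blocks.append((start, len(instrs)))  # the run extends to the end of the code
    return blocks


# Opcodes of the nine-instruction example; the predicate is a hypothetical marking rule.
opcodes = ["G_ADD", "G_MUL", "G_ADD", "G_SUB", "G_LI",
           "G_ADD", "G_MUL", "G_ADD", "G_SUB"]
blocks = find_risa_blocks(opcodes, lambda op: op != "G_LI")
```

Each block found this way is then handed to the profitability analysis of step 2, which may still decide to leave it in normal mode.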

Profitability Heuristic
• Decides whether or not to convert a rISA Block to rISA instructions.
• Ideal decrease in code size
  – rISA_block_size(normalMode) – rISA_block_size(rISAMode)
• Increase in code size
  – CS1: due to mode change instructions.
  – CS2: due to NOPs.
  – CS3: due to extra rISA load/store/move instructions.
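A minimal sketch of how these terms could combine into a decision. The slides do not spell out the decision rule, so converting only when the net saving is positive is an assumption, as are the 4-byte and 2-byte instruction sizes and the requirement that CS1, CS2 and CS3 are given in bytes.

```python
def net_saving(n_instrs, cs1, cs2, cs3, normal_bytes=4, risa_bytes=2):
    """Ideal decrease in code size minus the estimated increases (all in bytes)."""
    ideal_decrease = n_instrs * normal_bytes - n_instrs * risa_bytes
    increase = cs1 + cs2 + cs3  # mode changes + NOPs + extra load/store/move
    return ideal_decrease - increase


def is_profitable(n_instrs, cs1, cs2, cs3):
    """Assumed rule: convert the block only if it shrinks the code overall."""
    return net_saving(n_instrs, cs1, cs2, cs3) > 0


long_block = is_profitable(4, cs1=6, cs2=0, cs3=0)   # saves 8 bytes, costs 6
short_block = is_profitable(2, cs1=6, cs2=2, cs3=0)  # saves 4 bytes, costs 8
```

With these illustrative numbers the four-instruction block is worth converting and the two-instruction block is not, because the fixed mode-change cost dominates short blocks.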

Register Pressure Heuristic
• Estimate the extra spill/load/move instructions.
  CS3 = (spill/reload code needed if the block is converted to rISA instructions) – (spill/reload code needed if the block is converted to normal instructions)
• Spill code for a block is a function of
  – average register pressure
  – number of instructions
  – average live length

Spill Code Estimation
• Estimate the extra average register pressure:
  average register pressure – K1 * number of registers
• Estimate the number of spills needed to reduce the register pressure by 1 for the block:
  number of instructions / average live length
• Estimate the number of spills:
  average extra register pressure * number of spills needed to reduce the register pressure by 1
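The three estimates above translate directly into arithmetic. In the sketch below, the value K1 = 1.0 and the clamping of negative extra pressure to zero are assumptions; the slides give neither the constant nor the behaviour when pressure is below the register count.

```python
def estimate_spills(avg_pressure, num_regs, num_instrs, avg_live_length, k1=1.0):
    """Estimate the number of spills for a block from its register-pressure statistics."""
    # Extra average register pressure; clamped at zero (assumption) so that a
    # block with spare registers is estimated to need no spills.
    extra_pressure = max(0.0, avg_pressure - k1 * num_regs)
    # Spills needed to lower the pressure by 1 across the whole block.
    spills_per_unit = num_instrs / avg_live_length
    return extra_pressure * spills_per_unit


tight = estimate_spills(avg_pressure=10, num_regs=8, num_instrs=20, avg_live_length=5)
loose = estimate_spills(avg_pressure=6, num_regs=8, num_instrs=20, avg_live_length=5)
```

In the first call the pressure exceeds the 8 available registers by 2 and each unit of reduction costs 20 / 5 = 4 spills, giving 8 spills; in the second call there is no excess pressure, so the estimate is 0.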

Register Pressure Heuristic
• Spill code if converted to rISA = (1) + (2)
  (1) Estimated spill code for rISA variables in the block
      number of available registers = rISA RF size
  (2) Estimated spill code for non-rISA variables in the block
      number of available registers = RF size – rISA RF size – average extra rISA register pressure
• Spill code if converted to normal IS
  Estimated spill code for all variables in the block
  number of available registers = RF size
• Reload code is estimated as:
  K2 * spill code * average number of uses per variable definition
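One possible way to assemble CS3 from these pieces is sketched below. It takes the three spill estimates as inputs rather than recomputing them, assumes K2 = 1.0, and counts instructions rather than bytes; all function and parameter names are hypothetical.

```python
def non_risa_available_regs(rf_size, risa_rf_size, extra_risa_pressure):
    """Registers left for non-rISA variables when the block is converted to rISA."""
    return rf_size - risa_rf_size - extra_risa_pressure


def reload_code(spill_code, avg_uses_per_def, k2=1.0):
    """Reload estimate: K2 * spill code * average number of uses per definition."""
    return k2 * spill_code * avg_uses_per_def


def cs3(spill_risa_vars, spill_non_risa_vars, spill_normal, avg_uses_per_def, k2=1.0):
    """Extra spill/reload code caused by converting the block to rISA."""
    spill_risa = spill_risa_vars + spill_non_risa_vars  # (1) + (2)
    risa_total = spill_risa + reload_code(spill_risa, avg_uses_per_def, k2)
    normal_total = spill_normal + reload_code(spill_normal, avg_uses_per_def, k2)
    return risa_total - normal_total


extra = cs3(spill_risa_vars=8, spill_non_risa_vars=2, spill_normal=1, avg_uses_per_def=2.0)
regs_left = non_risa_available_regs(rf_size=32, risa_rf_size=8, extra_risa_pressure=2)
```

With these illustrative inputs, rISA mode needs 10 spills plus 20 reloads against 1 spill plus 2 reloads in normal mode, so CS3 is 27 instructions.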

Experimental Set-up
• Platform: MIPS32/16 architecture
• Benchmarks: Livermore loops
• Baseline compiler: GCC for MIPS32 and MIPS16, optimized for code size
  – Percentage code size reduction in MIPS16 over MIPS32
• Our compiler: retargetable EXPRESS compiler for MIPS32/16
  – Percentage code size reduction
  – Percentage performance degradation

Experiments
• EXPRESS achieves a 38% average code size reduction, while GCC achieves 14%.
• Performance impact: 6% on average (worst case: 24%).

Summary
• rISA is an architectural feature that can potentially achieve huge code size reduction with minimal hardware alterations.
• We presented a compiler technique to achieve code size reduction using rISA.
  – Ability to operate at instruction-level granularity.
  – Integration of this technique in the compiler flow.
  – A heuristic to estimate the amount of spills/reloads/moves due to the restricted availability of registers for some instructions.
• On average, a 38% improvement in code size.

Future directions
• The profitability heuristic for code generation can be modified to account for the performance degradation due to rISA.
• Design space exploration for choosing the best rISA for a given embedded application.