Machine Programming - Introduction
CENG 331: Introduction to Computer Systems, 4th Lecture
Instructor: Erol Sahin
Acknowledgement: Most of the slides are adapted from the ones prepared by R. E. Bryant and D. R. O'Hallaron of Carnegie Mellon University.

Machine Programming I: Basics
  • History of Intel processors and architectures
  • C, assembly, machine code
  • Assembly basics: registers, operands, move

Intel x86 Processors
  • Totally dominate the computer market
  • Evolutionary design
      - Backwards compatible all the way back to the 8086, introduced in 1978
      - Added more features as time went on
  • Complex instruction set computer (CISC)
      - Many different instructions with many different formats
      - But only a small subset is encountered in Linux programs
      - Hard to match the performance of Reduced Instruction Set Computers (RISC)
      - But Intel has done just that!

Intel x86 Evolution: Milestones

  Name        Date  Transistors  MHz
  8086        1978  29K          5-10
      - First 16-bit processor. Basis for IBM PC & DOS
      - 1 MB address space
  386         1985  275K         16-33
      - First 32-bit processor, referred to as IA32
      - Added "flat addressing"
      - Capable of running Unix
      - 32-bit Linux/gcc uses no instructions introduced in later models
  Pentium 4F  2004  125M         2800-3800
      - First 64-bit Intel x86 processor; the architecture is referred to as x86-64
  Core i7     2008  731M         2667-3333

Intel x86 Processors: Overview

  Architectures      Processors (in order of time)
  X86-16             8086, 286
  X86-32 / IA32      386, 486, Pentium
  MMX                Pentium MMX
  SSE                Pentium III
  SSE2               Pentium 4
  SSE3               Pentium 4E
  X86-64 / EM64t     Pentium 4F, Core 2 Duo
  SSE4               Core i7

IA: often redefined as the latest Intel architecture

Intel x86 Processors, contd.
  • Machine evolution

      Name         Date  Transistors
      486          1989  1.9M
      Pentium      1993  3.1M
      Pentium/MMX  1997  4.5M
      Pentium Pro  1995  6.5M
      Pentium III  1999  8.2M
      Pentium 4    2001  42M
      Core 2 Duo   2006  291M
      Core i7      2008  731M

  • Added features
      - Instructions to support multimedia operations
          (parallel operations on 1-, 2-, and 4-byte data, both integer and FP)
      - Instructions to enable more efficient conditional operations
  • Linux/GCC evolution
      - Two major steps: 1) support 32-bit 386, 2) support 64-bit x86-64

More Information
  • Intel processors (Wikipedia)
  • Intel microarchitectures

New Species: IA64, then IPF, then Itanium, ...

  Name                 Date  Transistors
  Itanium              2001  10M
      - First shot at a 64-bit architecture: first called IA64
      - Radically new instruction set designed for high performance
      - Can run existing IA32 programs (on-board "x86 engine")
      - Joint project with Hewlett-Packard
  Itanium 2            2002  221M
      - Big performance boost
  Itanium 2 Dual-Core  2006  1.7B

  • Itanium has not taken off in the marketplace
      - Lack of backward compatibility, no good compiler support, Pentium 4 got too good

x86 Clones: Advanced Micro Devices (AMD)
  • Historically
      - AMD has followed just behind Intel
      - A little bit slower, a lot cheaper
  • Then
      - Recruited top circuit designers from Digital Equipment Corp. and other downward-trending companies
      - Built Opteron: tough competitor to Pentium 4
      - Developed x86-64, their own extension to 64 bits
  • Recently
      - Intel much quicker with dual-core design
      - Intel currently far ahead in performance
      - EM64T backwards compatible to x86-64

Intel's 64-Bit
  • Intel attempted a radical shift from IA32 to IA64
      - Totally different architecture (Itanium)
      - Executes IA32 code only as legacy
      - Performance disappointing
  • AMD stepped in with an evolutionary solution
      - x86-64 (now called "AMD64")
  • Intel felt obligated to focus on IA64
      - Hard to admit mistake or that AMD is better
  • 2004: Intel announces EM64T extension to IA32
      - Extended Memory 64-bit Technology
      - Almost identical to x86-64!
  • Meanwhile: EM64T has been widely introduced, but is still often not used by operating systems and programs

Our Coverage
  • IA32
      - The traditional x86
  • x86-64/EM64T
      - The emerging standard
  • Presentation
      - Book has IA32 and x86-64

Definitions
  • Instruction Set Architecture (ISA): the parts of a processor design that one needs to understand to write assembly code
      - Examples: instruction set specification, registers
  • Microarchitecture: the implementation of the architecture
      - Examples: cache sizes and core frequency
  • Example ISAs (Intel): x86, IA, IPF

Assembly Programmer's View

[Figure: the CPU (PC, registers, condition codes) exchanges addresses, data, and instructions with memory (object code, program data, OS data, stack)]

Programmer-visible state
  • PC: program counter
      - Address of next instruction
      - Called "EIP" (IA32) or "RIP" (x86-64)
  • Register file
      - Heavily used program data
  • Condition codes
      - Store status information about most recent arithmetic operation
      - Used for conditional branching
  • Memory
      - Byte-addressable array
      - Code, user data, (some) OS data
      - Includes stack used to support procedures

Turning C into Object Code
  • Code in files p1.c p2.c
  • Compile with command: gcc -O p1.c p2.c -o p
      - Use optimizations (-O)
      - Put resulting binary in file p

  text     C program (p1.c p2.c)
               | Compiler (gcc -S)
  text     Asm program (p1.s p2.s)
               | Assembler (gcc or as)
  binary   Object program (p1.o p2.o)  +  Static libraries (.a)
               | Linker (gcc or ld)
  binary   Executable program (p)

Compiling Into Assembly

C code:

    int sum(int x, int y)
    {
        int t = x+y;
        return t;
    }

Generated IA32 assembly:

    sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret

  • Obtain with command: gcc -O -S code.c
  • Produces file code.s
  • Some compilers use the single instruction "leave"

Assembly Characteristics: Data Types
  • "Integer" data of 1, 2, or 4 bytes
      - Data values
      - Addresses (untyped pointers)
  • Floating point data of 4, 8, or 10 bytes
  • No aggregate types such as arrays or structures
      - Just contiguously allocated bytes in memory

Assembly Characteristics: Operations
  • Perform arithmetic function on register or memory data
  • Transfer data between memory and register
      - Load data from memory into register
      - Store register data into memory
  • Transfer control
      - Unconditional jumps to/from procedures
      - Conditional branches

Object Code

Code for sum, at 0x401040 <sum>:

    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

  - Total of 13 bytes
  - Each instruction 1, 2, or 3 bytes
  - Starts at address 0x401040

  • Assembler
      - Translates .s into .o
      - Binary encoding of each instruction
      - Nearly-complete image of executable code
      - Missing linkages between code in different files
  • Linker
      - Resolves references between files
      - Combines with static run-time libraries (e.g., code for malloc, printf)
      - Some libraries are dynamically linked: linking occurs when program begins execution

Machine Instruction Example
  • C code

        int t = x+y;

      - Add two signed integers
  • Assembly

        addl 8(%ebp),%eax

      - Add two 4-byte integers
          ("long" words in GCC parlance; same instruction whether signed or unsigned)
      - Operands:
          x: register %eax
          y: memory M[%ebp+8]
          t: register %eax (return function value in %eax)
      - Similar to the expression x += y
        More precisely: int eax; int *ebp; eax += ebp[2]
  • Object code

        0x401046:  03 45 08

      - 3-byte instruction
      - Stored at address 0x401046
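The "more precisely" reading above can be written out as a small C sketch. The function name addl_8_ebp is hypothetical and only illustrates the slide's model; it assumes 4-byte words, so byte offset 8 from %ebp is word index 2.

```c
#include <stdint.h>

/* Hypothetical model of `addl 8(%ebp), %eax`.
   ebp points at 4-byte words, so byte offset 8 is ebp[2]. */
int32_t addl_8_ebp(int32_t eax, const int32_t *ebp)
{
    /* Add as unsigned so that wraparound mirrors the hardware and
       avoids signed-overflow undefined behavior in C. */
    return (int32_t)((uint32_t)eax + (uint32_t)ebp[2]);
}
```

Calling it with eax holding x and a frame whose third word holds y returns the value the instruction would leave in %eax.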

Disassembling Object Code

Disassembled:

    00401040 <_sum>:
       0:  55          push   %ebp
       1:  89 e5       mov    %esp,%ebp
       3:  8b 45 0c    mov    0xc(%ebp),%eax
       6:  03 45 08    add    0x8(%ebp),%eax
       9:  89 ec       mov    %ebp,%esp
       b:  5d          pop    %ebp
       c:  c3          ret
       d:  8d 76 00    lea    0x0(%esi),%esi

  • Disassembler: objdump -d p
      - Useful tool for examining object code
      - Analyzes bit pattern of series of instructions
      - Produces approximate rendition of assembly code
      - Can be run on either a.out (complete executable) or .o file

Alternate Disassembly

Object:

    0x401040:
        0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

Disassembled:

    0x401040 <sum>:     push   %ebp
    0x401041 <sum+1>:   mov    %esp,%ebp
    0x401043 <sum+3>:   mov    0xc(%ebp),%eax
    0x401046 <sum+6>:   add    0x8(%ebp),%eax
    0x401049 <sum+9>:   mov    %ebp,%esp
    0x40104b <sum+11>:  pop    %ebp
    0x40104c <sum+12>:  ret
    0x40104d <sum+13>:  lea    0x0(%esi),%esi

  • Within gdb debugger

        gdb p
        disassemble sum      (disassemble procedure)
        x/13b sum            (examine the 13 bytes starting at sum)

What Can be Disassembled?

    % objdump -d WINWORD.EXE

    WINWORD.EXE:     file format pei-i386

    No symbols in "WINWORD.EXE".
    Disassembly of section .text:

    30001000 <.text>:
    30001000:  55              push   %ebp
    30001001:  8b ec           mov    %esp,%ebp
    30001003:  6a ff           push   $0xffffffff
    30001005:  68 90 10 00 30  push   $0x30001090
    3000100a:  68 91 dc 4c 30  push   $0x304cdc91

  • Anything that can be interpreted as executable code
  • Disassembler examines bytes and reconstructs assembly source

Integer Registers (IA32)

  32-bit   16-bit   8-bit (high / low)   Origin (mostly obsolete)
  %eax     %ax      %ah / %al            accumulate
  %ecx     %cx      %ch / %cl            counter
  %edx     %dx      %dh / %dl            data
  %ebx     %bx      %bh / %bl            base
  %esi     %si                           source index
  %edi     %di                           destination index
  %esp     %sp                           stack pointer
  %ebp     %bp                           base pointer

The first six are general purpose. The 16-bit names are virtual registers kept for backwards compatibility.
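The way the narrower names alias parts of a 32-bit register can be sketched with masks and shifts. The helper names below are hypothetical; the sketch only assumes that %ax is the low 16 bits of %eax, %al the low byte, and %ah the second-lowest byte.

```c
#include <stdint.h>

/* View of %ax: low 16 bits of %eax. */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* View of %al: low 8 bits of %eax. */
uint8_t low8(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* View of %ah: bits 8..15 of %eax. */
uint8_t high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }
```

For example, with %eax = 0x12345678 these helpers give 0x5678, 0x78, and 0x56.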

Moving Data: IA32
  • Moving data: movx Source, Dest
      - x in {b, w, l}
      - movl Source, Dest: move 4-byte "long word"
      - movw Source, Dest: move 2-byte "word"
      - movb Source, Dest: move 1-byte "byte"
  • Lots of these in typical code

  Registers: %eax %ecx %edx %ebx %esi %edi %esp %ebp

Moving Data: IA32
  • Moving data: movl Source, Dest
  • Operand types
      - Immediate: constant integer data
          Example: $0x400, $-533
          Like C constant, but prefixed with '$'
          Encoded with 1, 2, or 4 bytes
      - Register: one of 8 integer registers
          Example: %eax, %edx
          But %esp and %ebp reserved for special use
          Others have special uses for particular instructions
      - Memory: 4 consecutive bytes of memory at address given by register
          Simplest example: (%eax)
          Various other "address modes"

movl Operand Combinations

  Source  Dest  Src, Dest            C analog
  Imm     Reg   movl $0x4,%eax       temp = 0x4;
  Imm     Mem   movl $-147,(%eax)    *p = -147;
  Reg     Reg   movl %eax,%edx       temp2 = temp1;
  Reg     Mem   movl %eax,(%edx)     *p = temp;
  Mem     Reg   movl (%eax),%edx     temp = *p;

Cannot do memory-memory transfer with a single instruction
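Because there is no memory-to-memory form, copying one word in memory to another takes a load followed by a store. A minimal sketch in C, with a hypothetical function name and register choices picked only for illustration:

```c
/* Copy *src to *dst the way IA32 has to: through a register.
   The register names in the comments are illustrative assumptions. */
void copy_word(int *dst, const int *src)
{
    int temp = *src;   /* movl (%eax),%edx : Mem -> Reg */
    *dst = temp;       /* movl %edx,(%ecx) : Reg -> Mem */
}
```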

Simple Memory Addressing Modes
  • Normal: (R) denotes Mem[Reg[R]]
      - Register R specifies memory address

        movl (%ecx),%eax

  • Displacement: D(R) denotes Mem[Reg[R]+D]
      - Register R specifies start of memory region
      - Constant displacement D specifies offset

        movl 8(%ebp),%edx

Using Simple Addressing Modes

    void swap(int *xp, int *yp)
    {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

    swap:
        pushl %ebp            # Set up
        movl %esp,%ebp
        pushl %ebx

        movl 12(%ebp),%ecx    # Body
        movl 8(%ebp),%edx
        movl (%ecx),%eax
        movl (%edx),%ebx
        movl %eax,(%edx)
        movl %ebx,(%ecx)

        movl -4(%ebp),%ebx    # Finish
        movl %ebp,%esp
        popl %ebp
        ret

Understanding Swap

    void swap(int *xp, int *yp)
    {
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

  Stack (in memory), offsets from %ebp:

      12   yp
       8   xp
       4   Rtn adr
       0   Old %ebp   <- %ebp
      -4   Old %ebx

  Register  Value
  %ecx      yp
  %edx      xp
  %eax      t1
  %ebx      t0

    movl 12(%ebp),%ecx    # ecx = yp
    movl 8(%ebp),%edx     # edx = xp
    movl (%ecx),%eax      # eax = *yp (t1)
    movl (%edx),%ebx      # ebx = *xp (t0)
    movl %eax,(%edx)      # *xp = eax
    movl %ebx,(%ecx)      # *yp = ebx

Understanding Swap

Initial state (%ebp = 0x104):

  Address  Offset  Contents
  0x124            123
  0x120            456
  0x11c
  0x118
  0x114
  0x110    12      yp = 0x120
  0x10c     8      xp = 0x124
  0x108     4      Rtn adr
  0x104     0                    <- %ebp
  0x100    -4

State change after each instruction:

  Instruction            Comment              Effect
  movl 12(%ebp),%ecx     # ecx = yp           %ecx = 0x120
  movl 8(%ebp),%edx      # edx = xp           %edx = 0x124
  movl (%ecx),%eax       # eax = *yp (t1)     %eax = 456
  movl (%edx),%ebx       # ebx = *xp (t0)     %ebx = 123
  movl %eax,(%edx)       # *xp = eax          Mem[0x124] = 456
  movl %ebx,(%ecx)       # *yp = ebx          Mem[0x120] = 123
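The trace can be replayed in C by modelling the ten stack words from 0x100 to 0x124 as an array. This is only a sketch under that assumption; BASE, WORDS, cell, and swap_trace are hypothetical names introduced for the illustration.

```c
#include <stdint.h>

enum { BASE = 0x100, WORDS = 10 };   /* models addresses 0x100..0x124 */

/* Map a simulated byte address to its 4-byte cell in the array. */
static uint32_t *cell(uint32_t mem[WORDS], uint32_t addr)
{
    return &mem[(addr - BASE) / 4];
}

/* Execute the six body instructions of swap on the simulated memory. */
void swap_trace(uint32_t mem[WORDS], uint32_t ebp)
{
    uint32_t ecx = *cell(mem, ebp + 12);  /* movl 12(%ebp),%ecx : ecx = yp  */
    uint32_t edx = *cell(mem, ebp + 8);   /* movl 8(%ebp),%edx  : edx = xp  */
    uint32_t eax = *cell(mem, ecx);       /* movl (%ecx),%eax   : eax = *yp */
    uint32_t ebx = *cell(mem, edx);       /* movl (%edx),%ebx   : ebx = *xp */
    *cell(mem, edx) = eax;                /* movl %eax,(%edx)   : *xp = eax */
    *cell(mem, ecx) = ebx;                /* movl %ebx,(%ecx)   : *yp = ebx */
}
```

Seeding the array with 123 at 0x124, 456 at 0x120, yp at 0x110, and xp at 0x10c, then calling swap_trace with ebp = 0x104, leaves the two values exchanged.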

Complete Memory Addressing Modes
  • Most general form: D(Rb,Ri,S) denotes Mem[Reg[Rb]+S*Reg[Ri]+D]
      - D: constant "displacement" of 1, 2, or 4 bytes
      - Rb: base register: any of 8 integer registers
      - Ri: index register: any, except for %esp
          (unlikely you'd use %ebp, either)
      - S: scale: 1, 2, 4, or 8 (why these numbers?)
  • Special cases

      (Rb,Ri)      Mem[Reg[Rb]+Reg[Ri]]
      D(Rb,Ri)     Mem[Reg[Rb]+Reg[Ri]+D]
      (Rb,Ri,S)    Mem[Reg[Rb]+S*Reg[Ri]]

Machine Programming - Control structures
CENG 331: Introduction to Computer Systems
Instructor: Erol Sahin

In this lecture
  • Complete addressing mode, address computation (leal)
  • Arithmetic operations
  • x86-64
  • Control: condition codes
  • Conditional branches
  • While loops

Address Computation Examples

  %edx = 0xf000, %ecx = 0x100

  Expression        Address computation    Address
  0x8(%edx)         0xf000 + 0x8           0xf008
  (%edx,%ecx)       0xf000 + 0x100         0xf100
  (%edx,%ecx,4)     0xf000 + 4*0x100       0xf400
  0x80(,%edx,2)     2*0xf000 + 0x80        0x1e080
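The general form D(Rb,Ri,S) can be checked against these four rows with a one-line C function. The name effective_address is hypothetical; a missing base or index register is modelled by passing 0, and a missing scale by passing 1.

```c
#include <assert.h>
#include <stdint.h>

/* Address denoted by D(Rb,Ri,S): Reg[Rb] + S*Reg[Ri] + D.
   rb and ri are the register *values*, not register numbers. */
uint32_t effective_address(uint32_t d, uint32_t rb, uint32_t ri, uint32_t s)
{
    assert(s == 1 || s == 2 || s == 4 || s == 8);  /* only legal scales */
    return rb + s * ri + d;
}
```

With %edx = 0xf000 and %ecx = 0x100, the expression (%edx,%ecx,4) corresponds to effective_address(0, 0xf000, 0x100, 4).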

Address Computation Instruction
  • leal Src, Dest
      - Src is address mode expression
      - Set Dest to address denoted by expression
  • Uses
      - Computing addresses without a memory reference
          (e.g., translation of p = &x[i];)
      - Computing arithmetic expressions of the form x + k*y, where k = 1, 2, 4, or 8
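Both uses can be illustrated in C. The sketch below is an assumption-laden illustration rather than compiler output: times12 shows how a multiplication by 12 could be built from one leal (x + 2*x) and one shift, and element_address shows the p = &x[i] case, assuming 4-byte int. Both function names are hypothetical.

```c
#include <stdint.h>

/* x*12 computed as (x + 2*x) << 2. */
int32_t times12(int32_t x)
{
    uint32_t t = (uint32_t)x;   /* unsigned to get wraparound semantics */
    t = t + 2u * t;             /* leal (%eax,%eax,2),%eax : t = 3*x   */
    t <<= 2;                    /* sall $2,%eax            : t = 12*x  */
    return (int32_t)t;
}

/* p = &x[i]; with 4-byte int this is base + 4*i, e.g.
   leal (%edx,%ecx,4),%eax, and no memory is read. */
int *element_address(int *x, int i)
{
    return &x[i];
}
```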

Some Arithmetic Operations
  • Two-operand instructions

      Format             Computation
      addl  Src,Dest     Dest = Dest + Src
      subl  Src,Dest     Dest = Dest - Src
      imull Src,Dest     Dest = Dest * Src
      sall  Src,Dest     Dest = Dest << Src    (also called shll)
      sarl  Src,Dest     Dest = Dest >> Src    (arithmetic)
      shrl  Src,Dest     Dest = Dest >> Src    (logical)
      xorl  Src,Dest     Dest = Dest ^ Src
      andl  Src,Dest     Dest = Dest & Src
      orl   Src,Dest     Dest = Dest | Src

  • No distinction between signed and unsigned int (why?)
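The only place in this table where signedness matters is the right shift, which is why there are two of them. A minimal sketch of the difference, with hypothetical function names and shift counts assumed to be below 32:

```c
#include <assert.h>
#include <stdint.h>

/* shrl: logical right shift, fills with zeros. */
uint32_t shrl(uint32_t x, unsigned n)
{
    assert(n < 32);
    return x >> n;
}

/* sarl: arithmetic right shift, fills with copies of the sign bit.
   Written with unsigned operations because C leaves >> on negative
   signed values implementation-defined. */
int32_t sarl(int32_t x, unsigned n)
{
    assert(n < 32);
    uint32_t shifted = (uint32_t)x >> n;
    if (x < 0 && n > 0)
        shifted |= ~(UINT32_MAX >> n);   /* set the n vacated high bits */
    return (int32_t)shifted;
}
```

Addition, subtraction, and the bitwise operations produce the same bit pattern for signed and unsigned operands in two's complement, so a single instruction serves both.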

Some Arithmetic Operations
  • One-operand instructions

      incl Dest     Dest = Dest + 1
      decl Dest     Dest = Dest - 1
      negl Dest     Dest = -Dest
      notl Dest     Dest = ~Dest

  • See book for more instructions

Using leal for Arithmetic Expressions

    int arith(int x, int y, int z)
    {
        int t1 = x+y;
        int t2 = z+t1;
        int t3 = x+4;
        int t4 = y * 48;
        int t5 = t3 + t4;
        int rval = t2 * t5;
        return rval;
    }

    arith:
        pushl %ebp                 # Set up
        movl %esp,%ebp

        movl 8(%ebp),%eax          # Body
        movl 12(%ebp),%edx
        leal (%edx,%eax),%ecx
        leal (%edx,%edx,2),%edx
        sall $4,%edx
        addl 16(%ebp),%ecx
        leal 4(%edx,%eax),%eax
        imull %ecx,%eax

        movl %ebp,%esp             # Finish
        popl %ebp
        ret

Understanding arith

    int arith(int x, int y, int z)
    {
        int t1 = x+y;
        int t2 = z+t1;
        int t3 = x+4;
        int t4 = y * 48;
        int t5 = t3 + t4;
        int rval = t2 * t5;
        return rval;
    }

  Stack, offsets from %ebp:

      16   z
      12   y
       8   x
       4   Rtn adr
       0   Old %ebp   <- %ebp

    movl 8(%ebp),%eax          # eax = x
    movl 12(%ebp),%edx         # edx = y
    leal (%edx,%eax),%ecx      # ecx = x+y (t1)
    leal (%edx,%edx,2),%edx    # edx = 3*y
    sall $4,%edx               # edx = 48*y (t4)
    addl 16(%ebp),%ecx         # ecx = z+t1 (t2)
    leal 4(%edx,%eax),%eax     # eax = 4+t4+x (t5)
    imull %ecx,%eax            # eax = t5*t2 (rval)
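The register-level steps can be mirrored one-for-one in C and compared with the source function. This is a sketch: arith_asm and arith_c are hypothetical names, and unsigned arithmetic is used in arith_asm so that wraparound matches the hardware.

```c
#include <stdint.h>

/* One C statement per body instruction of arith. */
int32_t arith_asm(int32_t x, int32_t y, int32_t z)
{
    uint32_t eax = (uint32_t)x;    /* movl 8(%ebp),%eax                    */
    uint32_t edx = (uint32_t)y;    /* movl 12(%ebp),%edx                   */
    uint32_t ecx = edx + eax;      /* leal (%edx,%eax),%ecx   : t1 = x+y   */
    edx = edx + 2u * edx;          /* leal (%edx,%edx,2),%edx : 3*y        */
    edx <<= 4;                     /* sall $4,%edx            : t4 = 48*y  */
    ecx += (uint32_t)z;            /* addl 16(%ebp),%ecx      : t2 = z+t1  */
    eax = 4u + edx + eax;          /* leal 4(%edx,%eax),%eax  : t5         */
    eax *= ecx;                    /* imull %ecx,%eax         : rval       */
    return (int32_t)eax;
}

/* The original C function, for comparison. */
int32_t arith_c(int32_t x, int32_t y, int32_t z)
{
    int32_t t1 = x + y;
    int32_t t2 = z + t1;
    int32_t t3 = x + 4;
    int32_t t4 = y * 48;
    int32_t t5 = t3 + t4;
    return t2 * t5;
}
```

Note how y * 48 never uses a multiply instruction: 48*y is computed as (3*y) << 4.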

Another Example

    int logical(int x, int y)
    {
        int t1 = x^y;
        int t2 = t1 >> 17;
        int mask = (1<<13) - 7;
        int rval = t2 & mask;
        return rval;
    }

    logical:
        pushl %ebp             # Set up
        movl %esp,%ebp

        movl 8(%ebp),%eax      # Body: eax = x
        xorl 12(%ebp),%eax     #   eax = x^y (t1)
        sarl $17,%eax          #   eax = t1>>17 (t2)
        andl $8185,%eax        #   eax = t2 & 8185

        movl %ebp,%esp         # Finish
        popl %ebp
        ret

  2^13 = 8192, 2^13 - 7 = 8185

Machine Programming - Condition Codes and Branching
CENG 331: Introduction to Computer Systems
Instructor: Erol Sahin

Processor State (IA32, Partial)
  • Information about currently executing program
      - Temporary data (%eax, ...)
      - Location of runtime stack (%ebp, %esp)
      - Location of current code control point (%eip, ...)
      - Status of recent tests (CF, ZF, SF, OF)

  %eax %ecx %edx %ebx %esi %edi    General purpose registers
  %esp                             Current stack top
  %ebp                             Current stack frame
  %eip                             Instruction pointer
  CF ZF SF OF                      Condition codes

Condition Codes (Implicit Setting)
  • Single-bit registers

      CF  Carry Flag (for unsigned)
      SF  Sign Flag (for signed)
      ZF  Zero Flag
      OF  Overflow Flag (for signed)

  • Implicitly set (think of it as a side effect) by arithmetic operations
      Example: addl/addq Src,Dest (like t = a+b)
      - CF set if carry out from most significant bit (unsigned overflow)
      - ZF set if t == 0
      - SF set if t < 0 (as signed)
      - OF set if two's complement (signed) overflow:
          (a>0 && b>0 && t<0) || (a<0 && b<0 && t>=0)
  • Not set by lea instruction
  • Full documentation (IA32), link also on course website
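The four rules for addl translate directly into C. The struct and function names below are hypothetical, and the sketch models only the 32-bit case.

```c
#include <stdbool.h>
#include <stdint.h>

struct flags { bool cf, zf, sf, of; };

/* Condition codes that addl would set when computing t = a + b. */
struct flags flags_after_addl(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t ut = ua + ub;          /* wraps modulo 2^32, like the hardware */
    int32_t t = (int32_t)ut;
    struct flags f;
    f.cf = ut < ua;                 /* carry out of bit 31 (unsigned overflow) */
    f.zf = (ut == 0);
    f.sf = (t < 0);
    f.of = (a > 0 && b > 0 && t < 0) || (a < 0 && b < 0 && t >= 0);
    return f;
}
```

Adding 1 to the largest positive int sets SF and OF but not CF, while adding 1 to -1 sets CF and ZF but not OF, which shows that carry and overflow are independent conditions.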

Condition Codes (Explicit Setting: Compare)
  • Explicit setting by compare instruction: cmpl/cmpq Src2,Src1
      - cmpl b,a is like computing a-b without setting the destination
      - CF set if carry out from most significant bit (used for unsigned comparisons)
      - ZF set if a == b
      - SF set if (a-b) < 0 (as signed)
      - OF set if two's complement (signed) overflow:
          (a>=0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)

Condition Codes (Explicit Setting: Test)
  • Explicit setting by test instruction: testl/testq Src2,Src1
      - testl b,a is like computing a&b without setting the destination
      - Sets condition codes based on value of Src1 & Src2
      - Useful to have one of the operands be a mask
      - ZF set when a&b == 0
      - SF set when a&b < 0

Reading Condition Codes
  • SetX instructions
      - Set single byte based on combinations of condition codes
      - e.g. sete %al

      SetX    Condition        Description
      sete    ZF               Equal / Zero
      setne   ~ZF              Not Equal / Not Zero
      sets    SF               Negative
      setns   ~SF              Nonnegative
      setg    ~(SF^OF)&~ZF     Greater (Signed)
      setge   ~(SF^OF)         Greater or Equal (Signed)
      setl    (SF^OF)          Less (Signed)
      setle   (SF^OF)|ZF       Less or Equal (Signed)
      seta    ~CF&~ZF          Above (unsigned)
      setb    CF               Below (unsigned)

Reading Condition Codes
  • Less (Signed): setl = (SF^OF)

      OF  SF  SF^OF
      0   0   0       No overflow, sign bit is correct
      0   1   1       No overflow, sign bit is correct
      1   0   1       Overflow, sign bit is reversed
      1   1   0       Overflow, sign bit is reversed

  • After cmpl b,a:
      - SF set if (a-b) < 0 (as signed)
      - OF set if two's complement (signed) overflow:
          (a>=0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
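That SF^OF really means "a < b" even when the subtraction overflows can be checked with a short C model. The function name is hypothetical. OF is computed here from sign bits (the operands differ in sign and the result's sign differs from a), which is a standard way to state two's complement subtraction overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value that setl would produce after `cmpl b, a` (flags from a - b). */
bool setl_after_cmpl(int32_t b, int32_t a)
{
    int32_t d = (int32_t)((uint32_t)a - (uint32_t)b);  /* wrapped a - b */
    bool sf = d < 0;                                   /* sign of result */
    bool of = ((a ^ b) & (a ^ d)) < 0;                 /* signed overflow */
    return sf != of;                                   /* SF ^ OF */
}
```

For a = INT32_MIN and b = 1 the wrapped difference is positive, so SF is 0, but OF is 1, and SF^OF still reports that a is less than b.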

Reading Condition Codes (Cont.)
  • SetX instructions: set single byte based on combination of condition codes
  • Destination is one of 8 addressable byte registers
      (%ah, %al, %ch, %cl, %dh, %dl, %bh, %bl)
      - Does not alter remaining 3 bytes
      - Typically use movzbl to finish job

    int gt(int x, int y)
    {
        return x > y;
    }

  Body:

    movl 12(%ebp),%eax     # eax = y
    cmpl %eax,8(%ebp)      # Compare x and y (note inverted ordering!)
    setg %al               # al = x > y
    movzbl %al,%eax        # Zero rest of %eax
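Why movzbl is needed can be seen by modelling the byte write explicitly. In this sketch (hypothetical names setg_al and gt_model), setg replaces only the low byte of %eax, so the upper three bytes still hold leftover bits of y until they are zeroed.

```c
#include <stdint.h>

/* setg %al after comparing x with y: only the low byte of eax changes. */
uint32_t setg_al(uint32_t eax, int32_t x, int32_t y)
{
    uint8_t al = (uint8_t)(x > y);
    return (eax & 0xFFFFFF00u) | al;
}

/* The whole body of gt: load y, compare, setg, then zero-extend. */
uint32_t gt_model(int32_t x, int32_t y)
{
    uint32_t eax = (uint32_t)y;      /* movl 12(%ebp),%eax             */
    eax = setg_al(eax, x, y);        /* cmpl %eax,8(%ebp) ; setg %al   */
    eax = (uint32_t)(uint8_t)eax;    /* movzbl %al,%eax                */
    return eax;
}
```

Without the final step, comparing 5 with 3 while %eax held 0x12345678 would leave 0x12345601 in the register instead of 1.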

Jumping
  • jX instructions
      - Jump to different part of code depending on condition codes

      jX    Arg        Condition        Description
      jmp   Label      1                Direct jump
      jmp   *Operand   1                Indirect jump
      je    Label      ZF               Equal / Zero
      jne   Label      ~ZF              Not Equal / Not Zero
      js    Label      SF               Negative
      jns   Label      ~SF              Nonnegative
      jg    Label      ~(SF^OF)&~ZF     Greater (Signed)
      jge   Label      ~(SF^OF)         Greater or Equal (Signed)
      jl    Label      (SF^OF)          Less (Signed)
      jle   Label      (SF^OF)|ZF       Less or Equal (Signed)
      ja    Label      ~CF&~ZF          Above (unsigned)
      jb    Label      CF               Below (unsigned)

Direct and indirect jumps
  • jmp Label
      - Label is the address of the instruction to be executed next
  • jmp *Operand
      - jmp *%eax: use the value in %eax as the jump address
      - jmp *(%eax): read the jump target from memory, using the value in %eax as the address

Conditional Branch Example

    int absdiff(int x, int y)
    {
        int result;
        if (x > y) {
            result = x-y;
        } else {
            result = y-x;
        }
        return result;
    }

    absdiff:
        pushl %ebp            # Setup
        movl %esp,%ebp
        movl 8(%ebp),%edx
        movl 12(%ebp),%eax
        cmpl %eax,%edx
        jle .L7
        subl %eax,%edx        # Body 1
        movl %edx,%eax
    .L8:
        leave                 # Finish
        ret
    .L7:
        subl %edx,%eax        # Body 2
        jmp .L8

Conditional Branch Example (Cont.)

    int goto_ad(int x, int y)
    {
        int result;
        if (x <= y) goto Else;
        result = x-y;
    Exit:
        return result;
    Else:
        result = y-x;
        goto Exit;
    }

• C allows "goto" as a means of transferring control
  - Closer to machine-level programming style
• Generally considered bad coding style

    absdiff:
        pushl %ebp
        movl %esp, %ebp
        movl 8(%ebp), %edx
        movl 12(%ebp), %eax
        cmpl %eax, %edx
        jle .L7
        subl %eax, %edx
        movl %edx, %eax
    .L8:
        leave
        ret
    .L7:
        subl %edx, %eax
        jmp .L8

General Conditional Expression Translation

C Code

    val = Test ? Then-Expr : Else-Expr;

    Example: val = x>y ? x-y : y-x;

Goto Version

        nt = !Test;
        if (nt) goto Else;
        val = Then-Expr;
    Done:
        . . .
    Else:
        val = Else-Expr;
        goto Done;

• Test is an expression returning an integer
  - = 0 interpreted as false
  - ≠ 0 interpreted as true
• Create separate code regions for the then and else expressions
• Execute the appropriate one

"Do-While" Loop Example

C Code

    int fact_do(int x)
    {
        int result = 1;
        do {
            result *= x;
            x = x-1;
        } while (x > 1);
        return result;
    }

Goto Version

    int fact_goto(int x)
    {
        int result = 1;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
        return result;
    }

• Use a backward branch to continue looping
• Only take the branch when the "while" condition holds

"Do-While" Loop Compilation

Registers: %edx = x, %eax = result

Goto Version

    int fact_goto(int x)
    {
        int result = 1;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
        return result;
    }

Assembly

    fact_goto:
        pushl %ebp              # Setup
        movl %esp, %ebp         # Setup
        movl $1, %eax           # eax = 1
        movl 8(%ebp), %edx      # edx = x
    .L11:
        imull %edx, %eax        # result *= x
        decl %edx               # x--
        cmpl $1, %edx           # Compare x : 1
        jg .L11                 # if > goto loop
        movl %ebp, %esp         # Finish
        popl %ebp               # Finish
        ret                     # Finish

General "Do-While" Translation

C Code

    do
        Body
    while (Test);

Goto Version

    loop:
        Body
        if (Test) goto loop

• Body: { Statement1; Statement2; … Statementn; }
• Test returns an integer
  - = 0 interpreted as false
  - ≠ 0 interpreted as true
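To see the general do-while pattern on something other than factorial, here is a minimal sketch that counts decimal digits. The function names count_digits_do and count_digits_goto are hypothetical and chosen only for illustration; the goto version follows the loop/Body/Test template above.

```c
/* Do-while form: the body always runs at least once, so 0 has one digit. */
int count_digits_do(unsigned n)
{
    int digits = 0;
    do {
        digits++;
        n /= 10;
    } while (n > 0);
    return digits;
}

/* Goto form: Body, then a conditional backward branch on Test. */
int count_digits_goto(unsigned n)
{
    int digits = 0;
loop:
    digits++;               /* Body */
    n /= 10;
    if (n > 0) goto loop;   /* Test: branch back only while it holds */
    return digits;
}
```

Both functions return the same value for every input, which is the point of the translation: only the control-flow notation changes.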

"While" Loop Example

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version #1

    int fact_while_goto(int x)
    {
        int result = 1;
    loop:
        if (!(x > 1)) goto done;
        result *= x;
        x = x-1;
        goto loop;
    done:
        return result;
    }

• Is this code equivalent to the do-while version?
• Must jump out of the loop if the test fails

Alternative "While" Loop Translation

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version #2

    int fact_while_goto2(int x)
    {
        int result = 1;
        if (!(x > 1)) goto done;
    loop:
        result *= x;
        x = x-1;
        if (x > 1) goto loop;
    done:
        return result;
    }

• Historically used by GCC
• Uses the same inner loop as the do-while version
• Guards loop entry with an extra test

General "While" Translation

While Version

    while (Test)
        Body

Do-While Version

        if (!Test) goto done;
        do
            Body
        while (Test);
    done:

Goto Version

        if (!Test) goto done;
    loop:
        Body
        if (Test) goto loop;
    done:
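As an illustration of the guarded do-while form, the sketch below applies it to Euclid's algorithm. The example itself (gcd_while, gcd_goto) is hypothetical and not part of the lecture; it was picked because the loop can legitimately run zero times, which is exactly the case the entry guard handles.

```c
/* While form. */
unsigned gcd_while(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Guarded do-while form written with gotos. */
unsigned gcd_goto(unsigned a, unsigned b)
{
    unsigned t;
    if (!(b != 0)) goto done;   /* entry guard: skip the loop entirely */
loop:
    t = a % b;                  /* Body */
    a = b;
    b = t;
    if (b != 0) goto loop;      /* Test, same as in the do-while version */
done:
    return a;
}
```

Calling gcd_goto(7, 0) exercises the guard (the body never runs), while gcd_goto(48, 18) takes the backward branch several times.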

New Style "While" Loop Translation

C Code

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x = x-1;
        }
        return result;
    }

Goto Version

    int fact_while_goto3(int x)
    {
        int result = 1;
        goto middle;
    loop:
        result *= x;
        x = x-1;
    middle:
        if (x > 1) goto loop;
        return result;
    }

• Recent technique for GCC
  - Both IA32 & x86-64
• First iteration jumps over the body computation within the loop

Jump-to-Middle While Translation

C Code

    while (Test)
        Body

Goto Version

        goto middle;
    loop:
        Body
    middle:
        if (Test) goto loop;

Goto (Previous) Version

        if (!Test) goto done;
    loop:
        Body
        if (Test) goto loop;
    done:

• Avoids duplicating the test code
• The unconditional goto incurs no performance penalty
• "for" loops are compiled in a similar fashion
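A short sketch of the jump-to-middle form on a different loop may help. The bit-counting example and the names popcount_while and popcount_middle are hypothetical; the structure follows the goto middle / loop / middle template, with the test appearing only once.

```c
/* While form: count the 1 bits in x. */
int popcount_while(unsigned x)
{
    int count = 0;
    while (x != 0) {
        count += (int)(x & 1u);
        x >>= 1;
    }
    return count;
}

/* Jump-to-middle form: enter at the test, so the test code is not duplicated. */
int popcount_middle(unsigned x)
{
    int count = 0;
    goto middle;
loop:
    count += (int)(x & 1u);     /* Body */
    x >>= 1;
middle:
    if (x != 0) goto loop;      /* the single copy of Test */
    return count;
}
```

With x equal to 0 the initial jump lands on the test, which fails immediately, so the body is skipped just as in the while version.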

Jump-to-Middle Example

    int fact_while(int x)
    {
        int result = 1;
        while (x > 1) {
            result *= x;
            x--;
        }
        return result;
    }

    # x in %edx, result in %eax
        jmp .L34                # goto Middle
    .L35:                       # Loop:
        imull %edx, %eax        # result *= x
        decl %edx               # x--
    .L34:                       # Middle:
        cmpl $1, %edx           # x:1
        jg .L35                 # if >, goto Loop

"For" → "While" → "Do-While"

For Version

    for (Init; Test; Update)
        Body

While Version

    Init;
    while (Test) {
        Body
        Update;
    }

Do-While Version

        Init;
        if (!Test) goto done;
        do {
            Body
            Update;
        } while (Test);
    done:

Goto Version

        Init;
        if (!Test) goto done;
    loop:
        Body
        Update;
        if (Test) goto loop;
    done:

"For" → "While" (Jump-to-Middle)

For Version

    for (Init; Test; Update)
        Body

While Version

    Init;
    while (Test) {
        Body
        Update;
    }

Goto Version

        Init;
        goto middle;
    loop:
        Body
        Update;
    middle:
        if (Test) goto loop;
    done:
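The for-loop template can be tried on a concrete loop. The following is a minimal sketch that sums an array; sum_for and sum_goto_middle are hypothetical names, and the comments mark which statement plays the role of Init, Test, Update and Body.

```c
#include <stddef.h>

/* For form. */
int sum_for(const int *a, size_t n)
{
    int sum = 0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Jump-to-middle goto form of the same loop. */
int sum_goto_middle(const int *a, size_t n)
{
    int sum = 0;
    size_t i = 0;               /* Init */
    goto middle;
loop:
    sum += a[i];                /* Body */
    i++;                        /* Update */
middle:
    if (i < n) goto loop;       /* Test */
    return sum;
}
```

With n equal to 0 the test fails on entry and a is never read, so the goto version keeps the zero-iteration behavior of the for loop.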

Implementing Loops

• IA32
  - All loops are translated into a form based on "do-while"
• x86-64
  - Also makes use of "jump to middle"
• Why the difference?
  - IA32 compiler developed for a machine where all operations are costly
  - x86-64 compiler developed for a machine where unconditional branches incur (almost) no overhead

Switch Statement Example

    long switch_eg(long x, long y, long z)
    {
        long w = 1;
        switch(x) {
        case 1:
            w = y*z;
            break;
        case 2:
            w = y/z;
            /* Fall Through */
        case 3:
            w += z;
            break;
        case 5:
        case 6:
            w -= z;
            break;
        default:
            w = 2;
        }
        return w;
    }

• Multiple case labels
  - Here: 5, 6
• Fall-through cases
  - Here: 2
• Missing cases
  - Here: 4

Jump Table Structure

Switch Form

    switch(x) {
      case val_0:
        Block 0
      case val_1:
        Block 1
        • • •
      case val_n-1:
        Block n-1
    }

Jump Table

    jtab:   Targ0
            Targ1
            Targ2
            • • •
            Targn-1

Jump Targets

    Targ0:    Code Block 0
    Targ1:    Code Block 1
    Targ2:    Code Block 2
              • • •
    Targn-1:  Code Block n-1

Approximate Translation

    target = JTab[x];
    goto *target;
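The approximate translation can be written out almost literally using the "labels as values" extension of GNU C, which GCC and Clang support but ISO C does not. The sketch below is an assumption-laden illustration, not compiler output: the function name switch_eg_jtab and the label names are hypothetical, and the case layout (1, 2 falling into 3, 5 and 6 shared, 0 and 4 going to the default) mirrors the switch_eg example used in this lecture.

```c
/* Requires GNU C (GCC or Clang): &&label and goto *ptr are extensions. */
long switch_eg_jtab(long x, long y, long z)
{
    /* One table entry per value 0..6, like Targ0..Targn-1. */
    static void *jtab[] = {
        &&deflt,    /* x = 0 */
        &&case1,    /* x = 1 */
        &&case2,    /* x = 2 */
        &&case3,    /* x = 3 */
        &&deflt,    /* x = 4 (missing case) */
        &&case56,   /* x = 5 */
        &&case56    /* x = 6 */
    };
    long w = 1;

    /* Unsigned compare sends both x > 6 and x < 0 to the default. */
    if ((unsigned long)x > 6)
        goto deflt;
    goto *jtab[x];              /* target = JTab[x]; goto *target; */

case1:
    w = y * z;
    goto done;
case2:
    w = y / z;                  /* falls through into case3 */
case3:
    w += z;
    goto done;
case56:
    w -= z;
    goto done;
deflt:
    w = 2;
done:
    return w;
}
```

The cost of dispatch is one bounds check plus one indexed load and jump, independent of how many cases the switch has.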

Switch Statement Example (IA32)

    long switch_eg(long x, long y, long z)
    {
        long w = 1;
        switch(x) {
          . . .
        }
        return w;
    }

Setup:

    switch_eg:
        pushl %ebp              # Setup
        movl %esp, %ebp         # Setup
        pushl %ebx              # Setup
        movl $1, %ebx           # w = 1
        movl 8(%ebp), %edx      # edx = x
        movl 16(%ebp), %ecx     # ecx = z
        cmpl $6, %edx           # x:6
        ja .L61                 # if > goto default
        jmp *.L62(,%edx,4)      # goto JTab[x] (indirect jump)

Jump table

    .section .rodata
        .align 4
    .L62:
        .long .L61      # x = 0
        .long .L56      # x = 1
        .long .L57      # x = 2
        .long .L58      # x = 3
        .long .L61      # x = 4
        .long .L60      # x = 5
        .long .L60      # x = 6

Assembly Setup Explanation

• Table Structure
  - Each target requires 4 bytes
  - Base address at .L62
• Jumping
  - Direct: jmp .L61
    Jump target is denoted by label .L61
  - Indirect: jmp *.L62(,%edx,4)
    Start of jump table: .L62
    Must scale by a factor of 4 (labels are 32 bits = 4 bytes on IA32)
    Fetch target from effective address .L62 + edx*4
    Only for 0 ≤ x ≤ 6

Jump table

    .section .rodata
        .align 4
    .L62:
        .long .L61      # x = 0
        .long .L56      # x = 1
        .long .L57      # x = 2
        .long .L58      # x = 3
        .long .L61      # x = 4
        .long .L60      # x = 5
        .long .L60      # x = 6
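Two pieces of arithmetic sit behind the setup code: the range check done with the unsigned jump ja, and the scaled address of the table entry. The sketch below models both in C under stated assumptions: in_table_range and entry_address are hypothetical helper names, and values are modeled as 32-bit quantities as on IA32.

```c
#include <stdint.h>

/* Models cmpl $6, %edx followed by ja: because ja is an unsigned
 * comparison, a negative x looks like a huge unsigned value and is
 * rejected together with x > 6. */
int in_table_range(int32_t x)
{
    return (uint32_t)x <= 6u;
}

/* Models the effective address of jmp *base(,%edx,4):
 * table base plus 4 bytes per entry. */
uint32_t entry_address(uint32_t base, uint32_t x)
{
    return base + 4u * x;
}
```

As a usage example with an arbitrary illustrative base of 0x1000, entry_address(0x1000, 2) yields 0x1008, the third 4-byte slot of the table.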

Jump Table

Jump table

    .section .rodata
        .align 4
    .L62:
        .long .L61      # x = 0
        .long .L56      # x = 1
        .long .L57      # x = 2
        .long .L58      # x = 3
        .long .L61      # x = 4
        .long .L60      # x = 5
        .long .L60      # x = 6

    switch(x) {
    case 1:             // .L56
        w = y*z;
        break;
    case 2:             // .L57
        w = y/z;
        /* Fall Through */
    case 3:             // .L58
        w += z;
        break;
    case 5:
    case 6:             // .L60
        w -= z;
        break;
    default:            // .L61
        w = 2;
    }

Code Blocks (Partial)

    switch(x) {
        . . .
    case 2:             // .L57
        w = y/z;
        /* Fall Through */
    case 3:             // .L58
        w += z;
        break;
        . . .
    default:            // .L61
        w = 2;
    }

    .L61:                       // Default case
        movl $2, %ebx           # w = 2
        movl %ebx, %eax         # Return w
        popl %ebx
        leave
        ret
    .L57:                       // Case 2:
        movl 12(%ebp), %eax     # y
        cltd                    # Div prep
        idivl %ecx              # y/z
        movl %eax, %ebx         # w = y/z
                                # Fall through
    .L58:                       // Case 3:
        addl %ecx, %ebx         # w += z
        movl %ebx, %eax         # Return w
        popl %ebx
        leave
        ret

IA32 Object Code

• Setup
  - Label .L61 becomes address 0x8048630
  - Label .L62 becomes address 0x80488dc

Assembly Code

    switch_eg:
        . . .
        ja .L61                 # if > goto default
        jmp *.L62(,%edx,4)      # goto JTab[x]

Disassembled Object Code

    08048610 <switch_eg>:
        . . .
     8048622:  77 0c                   ja   8048630
     8048624:  ff 24 95 dc 88 04 08    jmp  *0x80488dc(,%edx,4)

IA32 Object Code (cont.)

• Jump Table
  - Doesn't show up in disassembled code
  - Can inspect using GDB

        gdb asm-cntl
        (gdb) x/7xw 0x80488dc

  - Examine 7 hexadecimal format "words" (4 bytes each)
  - Use command "help x" to get format documentation

    0x80488dc:
        0x08048630
        0x08048650
        0x0804863a
        0x08048642
        0x08048630
        0x08048649

Disassembled Targets

    8048630:  bb 02 00 00 00    mov   $0x2, %ebx
    8048635:  89 d8             mov   %ebx, %eax
    8048637:  5b                pop   %ebx
    8048638:  c9                leave
    8048639:  c3                ret
    804863a:  8b 45 0c          mov   0xc(%ebp), %eax
    804863d:  99                cltd
    804863e:  f7 f9             idiv  %ecx
    8048640:  89 c3             mov   %eax, %ebx
    8048642:  01 cb             add   %ecx, %ebx
    8048644:  89 d8             mov   %ebx, %eax
    8048646:  5b                pop   %ebx
    8048647:  c9                leave
    8048648:  c3                ret
    8048649:  29 cb             sub   %ecx, %ebx
    804864b:  89 d8             mov   %ebx, %eax
    804864d:  5b                pop   %ebx
    804864e:  c9                leave
    804864f:  c3                ret
    8048650:  8b 5d 0c          mov   0xc(%ebp), %ebx
    8048653:  0f af d9          imul  %ecx, %ebx
    8048656:  89 d8             mov   %ebx, %eax
    8048658:  5b                pop   %ebx
    8048659:  c9                leave
    804865a:  c3                ret

Matching Disassembled Targets

Jump table entries

    0x08048630
    0x08048650
    0x0804863a
    0x08048642
    0x08048630
    0x08048649

Disassembly, grouped by the addresses the table entries point to

    8048630:  bb 02 00 00 00    mov
    8048635:  89 d8             mov
    8048637:  5b                pop
    8048638:  c9                leave
    8048639:  c3                ret

    804863a:  8b 45 0c          mov
    804863d:  99                cltd
    804863e:  f7 f9             idiv
    8048640:  89 c3             mov

    8048642:  01 cb             add
    8048644:  89 d8             mov
    8048646:  5b                pop
    8048647:  c9                leave
    8048648:  c3                ret

    8048649:  29 cb             sub
    804864b:  89 d8             mov
    804864d:  5b                pop
    804864e:  c9                leave
    804864f:  c3                ret

    8048650:  8b 5d 0c          mov
    8048653:  0f af d9          imul
    8048656:  89 d8             mov
    8048658:  5b                pop
    8048659:  c9                leave
    804865a:  c3                ret

Sparse Switch Example

    /* Return x/111 if x is a multiple of 111 && <= 999; -1 otherwise */
    int div111(int x)
    {
        switch(x) {
        case   0: return 0;
        case 111: return 1;
        case 222: return 2;
        case 333: return 3;
        case 444: return 4;
        case 555: return 5;
        case 666: return 6;
        case 777: return 7;
        case 888: return 8;
        case 999: return 9;
        default:  return -1;
        }
    }

• Not practical to use a jump table
  - Would require 1000 entries
• Obvious translation into if-then-else would have a max. of 9 tests

Sparse Switch Code (IA32)

        movl 8(%ebp), %eax      # get x
        cmpl $444, %eax         # x:444
        je L8
        jg L16
        cmpl $111, %eax         # x:111
        je L5
        jg L17
        testl %eax, %eax        # x:0
        je L4
        jmp L14
        . . .

• Compares x to possible case values
• Jumps to different places depending on outcomes

        . . .
    L5:
        movl $1, %eax
        jmp L19
    L6:
        movl $2, %eax
        jmp L19
    L7:
        movl $3, %eax
        jmp L19
    L8:
        movl $4, %eax
        jmp L19
        . . .

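Sparse Switch Code Structure

[Figure: binary decision tree over the case values. Each node compares x with a value; "=" returns the result, "≠" at a leaf returns -1.]

    444  (= → 4)
      <  111  (= → 1)
           <  0    (= → 0, ≠ → -1)
           >  222  (= → 2)
                ≠  333  (= → 3, ≠ → -1)
      >  777  (= → 7)
           <  555  (= → 5)
                ≠  666  (= → 6, ≠ → -1)
           >  888  (= → 8)
                ≠  999  (= → 9, ≠ → -1)

• Organizes cases as a binary tree
• Logarithmic performance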
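The same decision structure can be sketched in C as nested comparisons. This is an illustrative reading of the tree, not the compiler's actual output; the function name div111_tree is hypothetical, and the comparison order (444 first, then 111 or 777) follows the assembly shown for the sparse switch.

```c
/* Decision-tree version of the sparse switch: returns x/111 for
 * x in {0, 111, ..., 999} and -1 for every other value. */
int div111_tree(int x)
{
    if (x == 444) return 4;         /* root of the tree */
    if (x < 444) {
        if (x == 111) return 1;
        if (x < 111)
            return x == 0 ? 0 : -1;
        if (x == 222) return 2;
        return x == 333 ? 3 : -1;
    }
    /* x > 444 */
    if (x == 777) return 7;
    if (x < 777) {
        if (x == 555) return 5;
        return x == 666 ? 6 : -1;
    }
    if (x == 888) return 8;
    return x == 999 ? 9 : -1;
}
```

No input needs more than four value comparisons here, compared with a linear chain that may test every case value in turn.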

Summarizing

• C Control
  - if-then-else
  - do-while
  - while, for
  - switch
• Assembler Control
  - Conditional jump
  - Indirect jump
• Compiler
  - Must generate assembly code to implement more complex control
• Standard Techniques
  - IA32 loops converted to do-while form
  - x86-64 loops use jump-to-middle
  - Large switch statements use jump tables
  - Sparse switch statements may use decision trees (not shown)
• Conditions in CISC
  - CISC machines generally have condition code registers