
Machine-Level Programming II: Basics
Comp 21000: Introduction to Computer Organization & Systems
Instructor: John Barr
* Modified slides from the book “Computer Systems: A Programmer’s Perspective”, Randy Bryant & David O’Hallaron, 2015

Machine Programming I: Basics
- History of Intel processors and architectures
- C, assembly, machine code
- Assembly Basics: Registers, operands, move
- Intro to x86-64

Assembly Language
- Labels
- Instructions
- Operands: 0, 1 or 2
  - Each is an immediate value, a register, or a memory address

    sumstore:
        pushq  %rbp
        movq   %rdx, %rbx
        call   plus
        movq   %rax, (%rbx)
        popq   %rbp
        ret

IA32/x86-64 Properties
- Instructions can reference different operand types
  - Immediate, register, memory
- Arithmetic operations can read/write memory
- Memory reference can involve complex computation
  - Rb + S*Ri + D
  - Useful for arithmetic expressions, too
- Instructions can have varying lengths
  - IA32 instructions can range from 1 to 15 bytes
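
The address computation Rb + S*Ri + D can be sketched in C. This is only an illustration: the function name `effective_address` is hypothetical, and the restriction of the scale factor S to 1, 2, 4, or 8 comes from the x86 instruction encoding rather than from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): computes the address
 * Rb + S*Ri + D used by a memory operand.
 *   rb = value of the base register
 *   ri = value of the index register
 *   s  = scale factor
 *   d  = constant displacement (may be negative) */
uint64_t effective_address(uint64_t rb, uint64_t ri, unsigned s, int64_t d)
{
    /* x86 only encodes scale factors of 1, 2, 4, or 8. */
    assert(s == 1 || s == 2 || s == 4 || s == 8);

    /* Unsigned arithmetic wraps modulo 2^64, so a negative
     * displacement is handled correctly by the cast. */
    return rb + (uint64_t)s * ri + (uint64_t)d;
}
```

For example, with Rb = 0x1000, Ri = 3, S = 8 and D = 16 the result is 0x1028, which is the address of element 3 of an array of 8-byte values that starts 16 bytes past 0x1000.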

Features of IA32 instructions
- x86-64 instructions can be from 1 to 15 bytes
  - More commonly used instructions are shorter
  - Instructions with fewer operands are shorter
- Each instruction has an instruction format
  - Each instruction has a unique byte representation
  - e.g., instruction pushl %ebp has encoding 55 (hexadecimal)
- The x86 family started as a 16-bit architecture
  - So IA32 calls a 16-bit piece of data a “word”
  - A 32-bit piece of data is a “double word” or a “long word”
  - A 64-bit piece of data is a “quad word”

Assembly Programmer’s View (review)
[Figure: the CPU (PC, registers, condition codes) exchanges addresses, data, and instructions with memory (object code, program data, OS data, stack)]
- Programmer-Visible State
  - PC: Program counter
    - Address of next instruction
    - Called “EIP” (IA32) or “RIP” (x86-64)
  - Register file
    - Heavily used program data
    - 16 named locations, 64-bit values (x86-64)
  - Condition codes
    - Store status information about most recent arithmetic operation
    - Used for conditional branching
  - Memory
    - Byte addressable array
    - Code, user data, (some) OS data
    - Includes stack used to support procedures

Instruction format
- Assembly language instructions have a very rigid format
- For most instructions the format is

    movl Source, Dest

  - mov: instruction name
  - l: instruction suffix
  - Source: source of data for the instruction (registers/memory)
  - Dest: destination of the instruction's result (registers/memory)
- Remember that we use AT&T assembly format

Instruction format
- Assembly format relates to C code:

    x = y;
    movl Source, Dest

  - Source corresponds to y and Dest corresponds to x
- Remember that we use AT&T assembly format

Data Representations: IA32 + x86-64
- Sizes of C Objects (in Bytes)

| C Data Type | Generic 32-bit | Intel IA32 | x86-64 |
|---|---|---|---|
| unsigned | 4 | 4 | 4 |
| int | 4 | 4 | 4 |
| long int | 4 | 4 | 8 |
| char | 1 | 1 | 1 |
| short | 2 | 2 | 2 |
| float | 4 | 4 | 4 |
| double | 8 | 8 | 8 |
| long double | 8 | 10/12 | 16 |
| char * (or any other pointer) | 4 | 4 | 8 |

Instruction suffix
- Every operation in GAS has a single-character suffix
  - Denotes the size of the operand
  - Example: basic instruction is mov
  - Can move byte (movb), word (movw), double word (movl), and quad word (movq)
- Note that floating point operations have entirely different instructions.

| C declaration | Intel data type | GAS suffix | Size (bytes) |
|---|---|---|---|
| char | Byte | b | 1 |
| short | Word | w | 2 |
| int | Double word | l | 4 |
| unsigned | Double word | l | 4 |
| long int | Quad word | q | 8 |
| unsigned long | Quad word | q | 8 |
| char * | Quad word | q | 8 |
| float | Single precision | s | 4 |
| double | Double precision | d | 8 |
| long double | Extended precision | t | 16 |

Registers
- 16 64-bit general purpose registers
  - Programmers/compilers can use these
  - All registers begin with %r
  - Rest of name is historical: from 8086
    - Registers originally had specific purposes
  - No restrictions on use of registers in instructions
    - However, some instructions use fixed registers as source/destination
  - In procedures there are different conventions for saving/restoring the first 4 registers (%rax, %rbx, %rcx, %rdx) than the next 4 (%rsi, %rdi, %rsp, %rbp)
  - The last two of these have special purposes in procedures
    - %rbp (frame pointer)
    - %rsp (stack pointer)
  - Will discuss all these later

Registers
- 16 64-bit general purpose registers
  - The low-order byte can be independently read or written by byte operation instructions
    - Done for backward compatibility with 8008 and 8080 (1970’s!)
    - When a byte of the register is changed, the rest of the register is unaffected
  - The low-order 2 bytes (16 bits, i.e., a single word) can be independently read or written by word operation instructions
    - Comes from 8086 16-bit heritage
    - When a word of the register is changed, the rest of the register is unaffected
  - The low-order 4 bytes (32 bits, i.e., a double word) can be read or written by double word operation instructions
    - Unlike byte and word writes, a double word write sets the upper 4 bytes of the register to zero
  - To access the lower 32 bits, 16 bits or byte of registers %r8-%r15, append the letter d, w or b to the end of the register’s name respectively
  - See next slide!
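
The effect of partial register writes can be modeled in C with a 64-bit integer standing in for one register. This is a sketch only: the function names are hypothetical, and the zeroing of the upper half on a 32-bit write is standard x86-64 behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit register as a uint64_t. */

/* Byte write (e.g. movb to %al): only bits 0-7 change. */
uint64_t write_low_byte(uint64_t reg, uint8_t v)
{
    return (reg & ~UINT64_C(0xFF)) | v;
}

/* Word write (e.g. movw to %ax): only bits 0-15 change. */
uint64_t write_low_word(uint64_t reg, uint16_t v)
{
    return (reg & ~UINT64_C(0xFFFF)) | v;
}

/* Double word write (e.g. movl to %eax): the old contents are
 * discarded and the upper 32 bits become zero. */
uint64_t write_low_dword(uint64_t reg, uint32_t v)
{
    (void)reg; /* previous value does not survive */
    return (uint64_t)v;
}

/* Worked check with an arbitrary starting value. */
void demo_partial_writes(void)
{
    const uint64_t r = UINT64_C(0x1122334455667788);

    assert(write_low_byte(r, 0xAA)        == UINT64_C(0x11223344556677AA));
    assert(write_low_word(r, 0xBBCC)      == UINT64_C(0x112233445566BBCC));
    assert(write_low_dword(r, 0xDDEEFF00) == UINT64_C(0x00000000DDEEFF00));
}
```

Calling `demo_partial_writes()` runs the three checks; the starting value is arbitrary and chosen so each byte is easy to spot.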

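x86-64 Integer Registers

| 64-bit | Low-order 32 bits | 64-bit | Low-order 32 bits |
|---|---|---|---|
| %rax | %eax | %r8 | %r8d |
| %rbx | %ebx | %r9 | %r9d |
| %rcx | %ecx | %r10 | %r10d |
| %rdx | %edx | %r11 | %r11d |
| %rsi | %esi | %r12 | %r12d |
| %rdi | %edi | %r13 | %r13d |
| %rsp | %esp | %r14 | %r14d |
| %rbp | %ebp | %r15 | %r15d |

- Can reference low-order 4 bytes (also low-order 1 & 2 bytes)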

History: IA32 Registers

| 32-bit register | 16-bit virtual register | 8-bit registers | Origin (mostly obsolete) |
|---|---|---|---|
| %eax | %ax | %ah, %al | accumulate |
| %ecx | %cx | %ch, %cl | counter |
| %edx | %dx | %dh, %dl | data |
| %ebx | %bx | %bh, %bl | base |
| %esi | %si | | source index |
| %edi | %di | | destination index |
| %esp | %sp | | stack pointer |
| %ebp | %bp | | base pointer |

- The first six registers (%eax through %edi) are general purpose
- The 16-bit virtual registers (%ax, %cx, %dx, …) exist for backwards compatibility

Moving Data
- Moving Data: movq Source, Dest
  - Move 8-byte (“quad”) word
  - Lots of these in typical code
- Corresponds to C-level statement x = y;
  - x and y represent memory locations
  - We’re moving the value stored in the memory location identified with “y” to the memory location identified with “x”
- [Figure: the integer registers %rax, %rcx, %rdx, %rbx, %rsi, %rdi, %rsp, %rbp, %rN]

Moving Data
- Moving Data: movq Source, Dest
  - Move 8-byte (“quad”) word
  - Lots of these in typical code
- Operand Types
  - Immediate: Constant integer data
    - Example: $0x400, $-533
    - Like C constant, but prefixed with ‘$’
    - Encoded with 1, 2, or 4 bytes
  - Register: One of 16 integer registers
    - Example: %rax, %r13
    - But %rsp reserved for special use
    - Others have special uses for particular instructions
  - Memory: 8 consecutive bytes of memory at address given by register
    - Simplest example: (%rax)
    - Various other “address modes”
- [Figure: the integer registers %rax, %rcx, %rdx, %rbx, %rsi, %rdi, %rsp, %rbp, %rN]

movq Operand Combinations

| Source | Dest | movq Src, Dest | C Analog |
|---|---|---|---|
| Imm | Reg | movq $0x4, %rax | temp = 0x4; |
| Imm | Mem | movq $-147, (%rax) | *p = -147; |
| Reg | Reg | movq %rax, %rdx | temp2 = temp1; |
| Reg | Mem | movq %rax, (%rdx) | *p = temp; |
| Mem | Reg | movq (%rax), %rdx | temp = *p; |

- Cannot do memory-memory transfer with a single instruction
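
The C analogs in the table can be collected into one small function. This is a sketch under stated assumptions: the function name `mov_analogs` is hypothetical, the comments show the instruction form each statement resembles, and an optimizing compiler is free to emit something quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: each statement mirrors one row of the
 * operand-combination table. Locals stand in for registers and
 * *p stands in for a memory location. */
long mov_analogs(long *p)
{
    long temp, temp1, temp2;

    assert(p != NULL);

    temp1 = 0x4;    /* Imm -> Reg, like movq $0x4, %rax    */
    *p    = -147;   /* Imm -> Mem, like movq $-147, (%rax) */
    temp2 = temp1;  /* Reg -> Reg, like movq %rax, %rdx    */
    *p    = temp2;  /* Reg -> Mem, like movq %rax, (%rdx)  */
    temp  = *p;     /* Mem -> Reg, like movq (%rax), %rdx  */

    return temp;    /* 4: the value stored through p last */
}
```

Note that C allows `*q = *p;` in one statement, but that is a memory-to-memory copy, so at the machine level it needs a load into a register followed by a store.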

Examples

    sumstore:
        pushq  %rbp
        movq   %rdx, %rbx
        call   plus
        movq   %rax, (%rbx)
        popq   %rbp
        ret

- Register mode: the operand is a register, e.g. %rax
- Memory mode: the operand is a memory location, e.g. (%rbx)
Simple Memory Addressing Modes
- Pretend that RAM is a big array named “Mem”
- Normal: (R) means Mem[Reg[R]]
  - Register R specifies memory address
  - Aha! Pointer dereferencing in C

    movq (%rcx), %rax

- Displacement: D(R) means Mem[Reg[R]+D]
  - Register R specifies start of memory region
  - Constant displacement D specifies offset

    movq 8(%rbp), %rdx
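
Taking the “big array named Mem” picture literally, the two modes can be sketched in C. Everything here is illustrative: the array `mem`, its size, and the helper names are assumptions, not part of the slides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a tiny RAM as a byte array. */
#define MEM_SIZE 64
uint8_t mem[MEM_SIZE];

/* Writes an 8-byte value at mem[addr] (used to set up the demo). */
void store_quad(uint64_t addr, uint64_t value)
{
    assert(addr <= MEM_SIZE - sizeof value);
    memcpy(&mem[addr], &value, sizeof value);
}

/* Reads the 8-byte value at mem[addr], as a movq from memory would. */
uint64_t load_quad(uint64_t addr)
{
    uint64_t value;
    assert(addr <= MEM_SIZE - sizeof value);
    memcpy(&value, &mem[addr], sizeof value);
    return value;
}

/* Normal mode (R): the register value is the address, Mem[Reg[R]]. */
uint64_t load_normal(uint64_t reg)
{
    return load_quad(reg);
}

/* Displacement mode D(R): address is Reg[R] + D, Mem[Reg[R]+D]. */
uint64_t load_displacement(int64_t d, uint64_t reg)
{
    return load_quad(reg + (uint64_t)d);
}
```

With a value stored at address 16, `load_normal(16)` and `load_displacement(8, 8)` read the same 8 bytes, just as `*p` and `q[1]` do in C when `q` points one `long` before `p`.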

Simple Addressing Modes (cont)
- Immediate: $Imm means the value Imm itself
  - The value Imm is the value that is used

    movq $4096, %rax

- Absolute: Imm means Mem[Imm]
  - No dollar sign before the number
  - The number is the memory address to use

    movq 4096, %rdx

- The book has more details on addressing modes!!

mov instructions

| Instruction | Effect | Description |
|---|---|---|
| movq S, D | D ← S | Move quad word |
| movl S, D | D ← S | Move double word |
| movw S, D | D ← S | Move word |
| movb S, D | D ← S | Move byte |
| movsbl S, D | D ← SignExtend(S) | Move sign-extended byte |
| movzbl S, D | D ← ZeroExtend(S) | Move zero-extended byte |

Notes:
1. byte movements must use one of the 8 single-byte registers
2. word movements must use one of the 8 2-byte registers
3. movsbl takes a single-byte source, sign-extends it into the high-order 24 bits, and copies the resulting double word to dest
4. movzbl takes a single-byte source, adds 24 0’s as the high-order bits, and copies the resulting double word to dest

mov instruction example
- Assume that %dh = 8D and %eax = 98765432 (both hexadecimal) at the beginning of each of these instructions

| | instruction | result |
|---|---|---|
| 1. | movb %dh, %al | %eax = 9876548D |
| 2. | movsbl %dh, %eax | %eax = FFFFFF8D |
| 3. | movzbl %dh, %eax | %eax = 0000008D |
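
The three results can be reproduced with a small C model. This is a sketch, not how the hardware is implemented: the `emu_*` names are hypothetical, and a `uint32_t` stands in for %eax while a `uint8_t` stands in for %dh.

```c
#include <assert.h>
#include <stdint.h>

/* movb src, %al: replace only the low byte, keep the upper 24 bits. */
uint32_t emu_movb(uint32_t eax, uint8_t src)
{
    return (eax & 0xFFFFFF00u) | src;
}

/* movsbl src, %eax: copy the byte's sign bit into the upper 24 bits. */
uint32_t emu_movsbl(uint8_t src)
{
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

/* movzbl src, %eax: fill the upper 24 bits with zeros. */
uint32_t emu_movzbl(uint8_t src)
{
    return (uint32_t)src;
}

/* Checks the values from the slide: %dh = 8D, %eax = 98765432. */
void check_slide_example(void)
{
    const uint8_t  dh  = 0x8D;
    const uint32_t eax = 0x98765432u;

    assert(emu_movb(eax, dh) == 0x9876548Du);
    assert(emu_movsbl(dh)    == 0xFFFFFF8Du);
    assert(emu_movzbl(dh)    == 0x0000008Du);
}
```

Since 8D has its top bit set, sign extension fills the upper bits with ones; a source byte such as 0x7F would give 0000007F from both movsbl and movzbl.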

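mov instruction example

| | instruction | addressing mode |
|---|---|---|
| 1. | movl $0x4050, %eax | Imm → Reg |
| 2. | movl %ebp, %esp | Reg → Reg |
| 3. | movl (%ecx), %eax | Mem → Reg |
| 4. | movl $-17, (%esp) | Imm → Mem |
| 5. | movl %eax, -12(%ebp) | Reg → Mem (Displacement) |

- These examples use 32-bit registers, so the matching suffix is l (movl); with 64-bit registers such as %rax the same forms use movq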