Systems Architecture
Lecture 4: Compilers, Assemblers, Linkers & Loaders
Jeremy R. Johnson, Anatole D. Ruslanov, William M. Mongan
Some material drawn from CMU CSAPP slides: Kesden and Puschel

Introduction
• Objective: To introduce the role of compilers, assemblers, linkers and loaders. To see what is underneath a C program: assembly language, machine language, and executable.

Compilation Process

Below Your Program
Example from a Unix system
• Source files: count.c and main.c
• Corresponding assembly code: count.s and main.s
• Corresponding machine code (object code): count.o and main.o
• Library functions: libc.a
• Executable file: a.out
  – Format for a.out and object code: ELF (Executable and Linking Format)

Producing an Executable Program
Example from a Unix system (SGI Challenge running IRIX 6.5)
• Compiler: count.c and main.c → count.s and main.s
  – gcc -S count.c main.c
• Assembler: count.s and main.s → count.o and main.o
  – gcc -c count.s main.s
  – as count.s -o count.o
• Linker/Loader: count.o main.o libc.a → a.out
  – gcc main.o count.o
  – ld main.o count.o -lc (additional libraries are required)

Source Files

main.c:

    #include <stdio.h>

    int count(int n);   /* defined in count.c */

    int main(void)
    {
        int n, s;

        printf("Enter upper limit: ");
        scanf("%d", &n);
        s = count(n);
        printf("Sum of i from 1 to %d = %d\n", n, s);
        return 0;
    }

count.c:

    /* Returns 1 + 2 + ... + n. */
    int count(int n)
    {
        int i, s;

        s = 0;
        for (i = 1; i <= n; i++)
            s = s + i;
        return s;
    }

Assembly Code for MIPS (count.s)

        .file   1 "count.c"
        .option pic2
        .section .text
        .align  2
        .globl  count
        .ent    count
    count:
    .LFB1:
        .frame  $fp,48,$31      # vars= 16, regs= 2/0, args= 0, extra= 16
        .mask   0x50000000,-8
        .fmask  0x0000,0
        subu    $sp,48
    .LCFI0:
        sd      $fp,40($sp)
    .LCFI1:
        sd      $28,32($sp)
    .LCFI2:
        move    $fp,$sp
    .LCFI3:
        .set    noat
        lui     $1,%hi(%neg(%gp_rel(count)))
        addiu   $1,%lo(%neg(%gp_rel(count)))
        daddu   $gp,$1,$25
        .set    at
        sw      $4,16($fp)
        sw      $0,24($fp)
        li      $2,1            # 0x1
        sw      $2,20($fp)
    .L3:
        lw      $2,20($fp)
        lw      $3,16($fp)
        slt     $2,$3,$2
        beq     $2,$0,.L6
        b       .L4
    .L6:
        lw      $2,24($fp)
        lw      $3,20($fp)
        addu    $2,$3
        sw      $2,24($fp)
    .L5:
        lw      $2,20($fp)
        addu    $3,$2,1
        sw      $3,20($fp)
        b       .L3
    .L4:
        lw      $3,24($fp)
        move    $2,$3
        b       .L2
    .L2:
        move    $sp,$fp
        ld      $fp,40($sp)
        ld      $28,32($sp)
        addu    $sp,48
        j       $31
    .LFE1:
        .end    count

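To make the control flow of count.s easier to follow, the loop can be written in C with explicit gotos, one per assembly label. This is only a sketch: the function name count_goto and the C label names are illustrative, and the mapping to instructions in the comments is a reading of the listing, not compiler output.

```c
/* count_goto: the loop from count.c rewritten with gotos so that the C
 * labels L3, L6 and L4 line up with .L3, .L6 and .L4 in count.s.
 * The name count_goto is hypothetical. */
int count_goto(int n)
{
    int i, s;

    s = 0;              /* sw $0,24($fp) */
    i = 1;              /* li $2,1 ; sw $2,20($fp) */
L3:                     /* loop test */
    if (!(n < i))       /* slt $2,$3,$2 ; beq $2,$0,.L6 */
        goto L6;
    goto L4;            /* b .L4 */
L6:                     /* loop body */
    s = s + i;          /* addu, then sw $2,24($fp) */
    i = i + 1;          /* .L5: addu $3,$2,1 ; sw $3,20($fp) */
    goto L3;            /* b .L3 */
L4:                     /* loop exit */
    return s;           /* move $2,$3 ; ... ; j $31 */
}
```

Calling count_goto(n) should give the same result as count(n) for any n; the only difference is that the for loop's test, body, and increment are spelled out as separate jumps, which is how the assembler sees them.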
Executable Program for MIPS (a.out)

    0000000 7f45 4c46 0102 0100 0000
    0000020 0002 0008 0000 0001 1000 1060 0000 0034
    0000040 0000 6c94 2000 0024 0034 0020 0007 0028
    0000060 0023 0022 0000 0006 0000 0034 1000 0034
    0000100 1000 0034 0000 00e0 0000 0004
    0000120 0004 0000 0003 0000 0114 1000 0114
    0000140 1000 0114 0000 0015 0000 0004
    0000160 0001 7000 0002 0000 0130 1000 0130
    0000200 1000 0130 0000 0080 0004
    0000220 0008 7000 0000 01b0 1000 01b0
    0000240 1000 01b0 0000 0018 0000 0004
    0000260 0004 0000 0002 0000 01c8 1000 01c8
    0000300 1000 01c8 0000 0108 0000 0004
    0000320 0004 0000 0001 0000 1000 0000
    0000340 1000 0000 3000 0000 0005
    ...

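The first bytes of the dump (7f 45 4c 46 01 02 01) are the ELF identification. A minimal sketch of decoding them in C follows, assuming the dump shows bytes in file order; the struct elf_ident and the function parse_elf_ident are hypothetical names for this illustration and are not part of any system header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the identification bytes of an ELF file. */
struct elf_ident {
    int is_elf;      /* 1 if the magic number 0x7f 'E' 'L' 'F' is present */
    int bits;        /* 32 or 64, or 0 if the class byte is unrecognized */
    int big_endian;  /* 1 = big-endian, 0 = little-endian, -1 = unrecognized */
    int version;     /* identification version byte */
};

/* Decode bytes 0..6 of a buffer that may hold the start of an ELF file. */
struct elf_ident parse_elf_ident(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    struct elf_ident id = { 0, 0, -1, 0 };

    if (len < 7 || memcmp(buf, magic, 4) != 0)
        return id;                       /* not an ELF file */
    id.is_elf = 1;
    id.bits = buf[4] == 1 ? 32 : buf[4] == 2 ? 64 : 0;        /* class byte */
    id.big_endian = buf[5] == 2 ? 1 : buf[5] == 1 ? 0 : -1;   /* data byte */
    id.version = buf[6];
    return id;
}
```

Applied to the bytes at offset 0 of the dump, this reports a 32-bit, big-endian ELF file with version 1.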
Assembly Characteristics: Data Types
• "Integer" data of 1, 2, or 4 bytes
  – Data values
  – Addresses (untyped pointers)
• Floating point data of 4, 8, or 10 bytes
• No aggregate types such as arrays or structures
  – Just contiguously allocated bytes in memory

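Because assembly has no array type, indexing reduces to arithmetic on byte addresses. The sketch below shows that arithmetic in C; the function name element_address is hypothetical, chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Address of element `index` in an array that starts at `base`, where each
 * element occupies `elem_size` contiguous bytes. This is the computation
 * behind a[index]; at the machine level only the resulting address exists. */
const void *element_address(const void *base, size_t index, size_t elem_size)
{
    return (const unsigned char *)base + index * elem_size;
}
```

For an array of int32_t, for example, element_address(a, 2, sizeof a[0]) is the same address as &a[2], namely 8 bytes past the start of a.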
Assembly Characteristics: Operations
• Perform arithmetic function on register or memory data
• Transfer data between memory and register
  – Load data from memory into register
  – Store register data into memory
• Transfer control
  – Unconditional jumps to/from procedures
  – Conditional branches

Object Code

Code for sum
    0x401040 <sum>:
        0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
• Total of 13 bytes
• Each instruction 1, 2, or 3 bytes
• Starts at address 0x401040

• Assembler
  – Translates .s into .o
  – Binary encoding of each instruction
  – Nearly-complete image of executable code
  – Missing linkages between code in different files
• Linker
  – Resolves references between files
  – Combines with static run-time libraries
    • E.g., code for malloc, printf
  – Some libraries are dynamically linked
    • Linking occurs when program begins execution

Disassembling Object Code

Disassembled
    00401040 <_sum>:
       0:  55          push   %ebp
       1:  89 e5       mov    %esp,%ebp
       3:  8b 45 0c    mov    0xc(%ebp),%eax
       6:  03 45 08    add    0x8(%ebp),%eax
       9:  89 ec       mov    %ebp,%esp
       b:  5d          pop    %ebp
       c:  c3          ret
       d:  8d 76 00    lea    0x0(%esi),%esi

• Disassembler
    objdump -d p
  – Useful tool for examining object code
  – Analyzes bit pattern of series of instructions
  – Produces approximate rendition of assembly code
  – Can be run on either a.out (complete executable) or .o file

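The offsets in the left column of the listing are running sums of the instruction lengths. The sketch below checks this for the 13 bytes of sum; the array and function names (sum_code, insn_len, insn_offset, insn_opcode) are hypothetical, and the lengths are read off the listing rather than decoded from the bytes.

```c
#include <stddef.h>

/* The 13 bytes of sum, copied from the object-code listing. */
static const unsigned char sum_code[13] = {
    0x55,               /* push %ebp            */
    0x89, 0xe5,         /* mov  %esp,%ebp       */
    0x8b, 0x45, 0x0c,   /* mov  0xc(%ebp),%eax  */
    0x03, 0x45, 0x08,   /* add  0x8(%ebp),%eax  */
    0x89, 0xec,         /* mov  %ebp,%esp       */
    0x5d,               /* pop  %ebp            */
    0xc3                /* ret                  */
};

/* Length in bytes of each of the seven instructions, in order. */
static const size_t insn_len[7] = { 1, 2, 3, 3, 2, 1, 1 };

/* Offset of instruction k from the start of sum: the total length of all
 * earlier instructions. k == 7 gives the size of the whole procedure. */
size_t insn_offset(size_t k)
{
    size_t off = 0;
    for (size_t i = 0; i < k && i < 7; i++)
        off += insn_len[i];
    return off;
}

/* First byte of instruction k, or 0 if k is out of range. */
unsigned char insn_opcode(size_t k)
{
    return k < 7 ? sum_code[insn_offset(k)] : 0;
}
```

The offsets come out as 0, 1, 3, 6, 9, 0xb and 0xc, matching the listing, and insn_offset(7) is 13, the total size of sum. The trailing lea at offset d lies outside those 13 bytes.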
Alternate Disassembly

Object
    0x401040:
        0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

Disassembled
    0x401040 <sum>:      push   %ebp
    0x401041 <sum+1>:    mov    %esp,%ebp
    0x401043 <sum+3>:    mov    0xc(%ebp),%eax
    0x401046 <sum+6>:    add    0x8(%ebp),%eax
    0x401049 <sum+9>:    mov    %ebp,%esp
    0x40104b <sum+11>:   pop    %ebp
    0x40104c <sum+12>:   ret
    0x40104d <sum+13>:   lea    0x0(%esi),%esi

• Within gdb debugger
    gdb p
    disassemble sum
  – Disassemble procedure
    x/13b sum
  – Examine the 13 bytes starting at sum

What Can be Disassembled?

    % objdump -d WINWORD.EXE

    WINWORD.EXE:     file format pei-i386

    No symbols in "WINWORD.EXE".
    Disassembly of section .text:

    30001000 <.text>:
    30001000:  55              push   %ebp
    30001001:  8b ec           mov    %esp,%ebp
    30001003:  6a ff           push   $0xffffffff
    30001005:  68 90 10 00 30  push   $0x30001090
    3000100a:  68 91 dc 4c 30  push   $0x304cdc91

• Anything that can be interpreted as executable code
• Disassembler examines bytes and reconstructs assembly source