System Programming
Chapter 6. IA Assembly Programming
October 23, 2017
Jongmoo Choi, Dept. of Software, Dankook University
choijm@dankook.ac.kr
http://embedded.dankook.ac.kr/~choijm

Chapter Objectives
- Understand various viewpoints about the CPU
- Grasp the concept of ISA (Instruction Set Architecture)
- Learn the IA register model
- Learn the IA memory model
- Learn the IA program model
  - Know the format of IA assembly instructions
  - Write a program in IA assembly language
- Refer to Chapters 2 and 3 in the main textbook

Introduction (1/2)
Summary of what we have learned so far
- Program development: compile, linking, ELF, …
- Program execution: task (text, data, stack), load, fetch, …
  - text: consists of machine instructions
(Figure label: bus)

Introduction (2/2)
Assembly language
- Language hierarchy
  - Located between high-level language and machine language
- Symbolic (mnemonic) representation of machine language
  - One-to-one mapping, CPU dependent (not easy)
- Application fields
  - Hardware control: system initialization, device driver, interrupt handler, embedded systems, ECU, wearable computer, …
  - Vulnerability testing (virus identification, IDS)
  - Optimization
  - SW copyright protection, similarity analysis, …
- Importance
  - Writing programs, debugging, analyzing binaries
  - Understanding the behavior of hardware (especially the CPU)
  - Grasping how hardware and software cooperate (hardware/software co-design)

CPU (1/5)
What is a processor? Abstraction

CPU (2/5)
Various viewpoints of a processor
- Transistor + Gate + Logic + Clock
- ALU (Arithmetic Logic Unit) + Registers + CU (Control Unit) + Bus
- Instruction Set Architecture (CISC, RISC, VLIW, EPIC, …)
- Performance characteristics (Pipeline, Superscalar, Cache, …)

CPU (3/5)
Instruction Set Architecture: Register + Instructions
(Figure: a processor with registers exchanges addresses, data, and instructions with a memory that holds instructions and data; memory addresses run from 0x00..00 upward)
- Register model
- Memory model
- Instruction model

CPU (4/5)
Performance characteristics: Pipeline, Superscalar, Cache
Pipeline stages: Ifet, Dec, Dfet, Exe, Res
- Ifet: Instruction fetch
- Dec: Decode
- Dfet: Data fetch
- Exe: Execution
- Res: Results write
For an efficient pipeline
- Similar latency of instructions (not complex)
- Conflict between instruction fetch and data fetch
- Branch prediction, out-of-order execution
- L1, L2 cache, …

CPU (5/5)
Performance characteristics: Pipeline, Superscalar, Cache
(Figures: 8086 and Pentium)

Register Model (1/3)
Register definition
- A small amount of memory available inside the CPU
- Can be accessed quickly compared with main memory
IA registers
(Source: Intel 64 and IA-32 Architectures SW Developer's Manual, Volume 1: Basic Architecture, Chapter 3.4)

Register Model (2/3)
Functionality of each register
- Segment registers
  - CS (code segment): the base location of all executable instructions
  - DS (data segment): the base location for variables
  - SS (stack segment): the base location of the stack
  - ES (extra segment): an additional base location for variables
- General purpose registers
  - EAX (accumulator): for arithmetic operations (operands and result data)
  - EBX (base): pointer to data in the DS segment
  - ECX (counter): counter for loop and string operations
  - EDX (data): I/O pointer, with a special role in multiply and divide operations
  - ESP (stack pointer): pointer to the top of the stack
  - EBP (base pointer): used as the base for accessing variables on the stack
  - ESI (source index): source pointer for string operations
  - EDI (destination index): destination pointer for string operations
  - Each has its specialty, but they are commonly used for general purposes
- EIP (instruction pointer): plays the role of the PC (program counter)
- EFLAGS: control and status register
- rax, rbx, rip, … for Intel 64

Register Model (3/3)
Details of the EFLAGS register
- Set of control and status flags
- Refer to the IA-32 Basic Architecture manual, Chapter 3.4.3, for the role of each bit
Note: Intel CPUs have several additional registers such as CR0, CR2, CR3, IDTR, GDTR, debugging registers, FPU registers, and MMX registers (see LN_chapter 7).

Memory Model (1/4)
Memory abstraction in IA
- logical address (virtual address)
- linear address
- physical address
Translation path: logical address, then segmentation (Segment Descriptor Table), then linear address, then paging (Page Table), then physical address
(Figure: logical (virtual) memory holds the stack, data, and text segments; linear memory holds text, data, and stack divided into Page 1, Page 2, Page 3; physical memory holds the pages in a different order: Page 1, Page 6, Page 2, Page 3, Page 5, Page 4)

Memory Model (2/4)
Paging and segmentation in detail

Memory Model (Optional) (3/4)
Segmentation on IA
- Real address model: 8086 compatible, supports 1 MB (address = (segment << 4) + offset)
- Flat model: protected mode with segment descriptor
- Segmented model: protected mode with segment descriptor table
(Figures: real address model, segmented model)
Think about the relation between the task structure and segments.

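The real address model computation above can be sketched in C. This is only an illustration; the function name real_mode_linear is hypothetical and not part of the lecture material.

```c
#include <stdint.h>

/* Real address model: linear address = (segment << 4) + offset.
 * Both inputs are 16-bit values, so the sum can need up to 21 bits;
 * this sketch does not model wrap-around at 1 MB. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}
```

For example, segment 0x1234 with offset 0x0010 and segment 0x1235 with offset 0x0000 both yield 0x12350, which shows that different segment:offset pairs can name the same byte.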
Memory Model (Optional) (4/4)
Paging on IA
- 32 bit: 2-level paging
  - Page directory, page table
- 64 bit: 4-level paging
  - PML4, page directory pointer, page directory, page table
(Figures: 32 bit CPU, 64 bit CPU. Source: Intel 64 and IA-32 Architectures SW Developer's Manual, Volume 3: System Programming Guide, Chapter 4)
The basic concept of address mapping is similar to the indexing in the inode.
We can configure a paging-only or segmentation-only model.

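To make the 2-level scheme concrete, the following C sketch splits a 32-bit linear address into its page directory index, page table index, and page offset. It assumes the classic 32-bit layout with 4 KB pages (10 + 10 + 12 bits); the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t dir_index;   /* bits 31..22: index into the page directory */
    uint32_t table_index; /* bits 21..12: index into the page table */
    uint32_t offset;      /* bits 11..0 : byte offset inside the 4 KB page */
} LinearParts32;

/* Assumption: 32-bit paging with 4 KB pages and no extensions. */
LinearParts32 split_linear32(uint32_t linear)
{
    LinearParts32 p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.offset      = linear & 0xFFFu;
    return p;
}
```

Applied to the address 0x08049388, which appears in the instruction format example, this gives directory index 32, table index 73, and offset 0x388.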
Instruction Model (1/2)
Instruction format
    here: movl 0x8049388, %eax
          addl 0x8049384, %eax
          movl %eax, 0x804946c
(Source: Intel 64 and IA-32 Architectures SW Developer's Manual, Volume 1: Basic Architecture, Chapter 1)

Instruction Model (2/2)
Instruction set (opcode set) summary
- General purpose
  - Data transfer instructions: MOV, CMOVNZ, XCHG, PUSH, POP
  - Arithmetic instructions: ADD, SUB, MUL, DIV, DEC, INC, CMP
  - Logical instructions: AND, OR, XOR, NOT
  - Shift and rotate instructions: SAR, SAL, SHR, SHL, ROR, ROL
  - Bit and byte instructions: BT, BTS, BTC
  - Control transfer instructions: JMP, JE, JZ, JNE, LOOP
  - Function related instructions: CALL, RET, LEAVE
  - String instructions: MOVS, CMPS, LODS
  - Flag control instructions: STC, CLC, STD, CLD, STI, CLI
  - Segment register instructions: LDS, LES
  - Miscellaneous: INT, NOP, CPUID
- Special purpose
  - FPU instructions: FLD, FST, FADD, FSUB, FCOM
  - SIMD instructions (MMX): MOVD, MOVQ, PADD, PSUB
  - SSE instructions: MOVSS, ADDSS
  - System instructions: LGDT, SGDT, LIDT, …

Instruction Detail (1/14)
Data Transfer Instruction
- Using gcc -S with version 3.4.6 (the code generated by gcc version 4.* is harder to follow, which makes learning rather complex)
- Operand: register, memory, literal
  - register: begins with %
  - memory: alphanumeric (symbol name)
  - literal: begins with $
- Comments: # or /* */
Questions
- What if we execute "movl 2, a"?
- What if we run "gcc -S -O3 move_exam.c"?
- What if we run "gcc -S -O3 move_exam.c" with local variables?

Instruction Detail (2/14)
Data Transfer Instruction (cont')
Basic opcode (mov) + suffix [l|w|b|q]
- b: byte (1 byte)
- w: word (2 bytes)
- l: long (double word, 4 bytes)
- q: quad word (8 bytes)
(refer to Figure 3.1 in the main text)
Differences between AT&T and MASM
1. In MASM, the order of source and destination operands is reversed (opcode destination, source), e.g. add eax, dword ptr [a]
2. MASM uses byte ptr, word ptr, dword ptr instead of the suffixes b, w, l.
3. Memory addressing: [] in MASM vs () in AT&T
4. MASM uses call far and ret far instead of lcall and lret.
5. MASM does not use the prefixes $ and %.
6. Miscellaneous (conversion, addressing, multi-section, …)

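As a rough illustration of what the size suffix means, the C sketch below returns the part of a value that a mov with a given suffix would transfer. The function name mov_low_bytes is hypothetical, and returning 0 for an unknown suffix is a choice made only for this sketch.

```c
#include <stdint.h>

/* Returns the low 1, 2, 4, or 8 bytes of `value`, matching the operand
 * size selected by the AT&T suffix b, w, l, or q.
 * Assumption of this sketch: an unknown suffix yields 0. */
uint64_t mov_low_bytes(uint64_t value, char suffix)
{
    switch (suffix) {
    case 'b': return value & 0xFFu;         /* movb: 1 byte  */
    case 'w': return value & 0xFFFFu;       /* movw: 2 bytes */
    case 'l': return value & 0xFFFFFFFFu;   /* movl: 4 bytes */
    case 'q': return value;                 /* movq: 8 bytes */
    default:  return 0;
    }
}
```

With the value 0x1122334455667788, the suffix b selects 0x88, w selects 0x7788, and l selects 0x55667788.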
Instruction Detail (3/14)
Arithmetic Instruction
- "movl a, %eax", "subl b, %eax", "movl %eax, c" are also feasible (cf. load-store architecture)
- mul: multiplies the operand by eax; the result is stored in edx:eax
- div: divides edx:eax by the operand; the quotient is stored in eax, while the remainder is in edx
- What if b = 0x40000001?
- See Chapter 2.3 in the main text for the details of integer arithmetic

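The edx:eax convention for mul and div can be sketched in C with a 64-bit intermediate. This models the unsigned 32-bit forms only; the names Regs, toy_mull, and toy_divl are hypothetical, and the error return stands in for the divide error that real hardware raises.

```c
#include <stdint.h>

typedef struct {
    uint32_t eax;
    uint32_t edx;
} Regs;

/* mull operand: edx:eax = eax * operand (unsigned 32 x 32 -> 64 bits). */
void toy_mull(Regs *r, uint32_t operand)
{
    uint64_t product = (uint64_t)r->eax * operand;
    r->eax = (uint32_t)product;          /* low 32 bits  */
    r->edx = (uint32_t)(product >> 32);  /* high 32 bits */
}

/* divl operand: eax = edx:eax / operand, edx = edx:eax % operand.
 * Returns 0 on success, -1 where the CPU would raise a divide error
 * (operand is 0, or the quotient does not fit in 32 bits). */
int toy_divl(Regs *r, uint32_t operand)
{
    uint64_t dividend = ((uint64_t)r->edx << 32) | r->eax;
    uint64_t quotient;

    if (operand == 0)
        return -1;
    quotient = dividend / operand;
    if (quotient > 0xFFFFFFFFu)
        return -1;
    r->eax = (uint32_t)quotient;
    r->edx = (uint32_t)(dividend % operand);
    return 0;
}
```

As one illustrative case (the multiplier 4 is an assumption, not taken from the slide), eax = 4 multiplied by 0x40000001 gives 0x100000004, so edx becomes 1 and eax becomes 4: the product no longer fits in eax alone.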
Instruction Detail (4/14)
Control Transfer Instruction: if
- Compare instruction (cmp): performs a subtraction but does not store the result (only bits in EFLAGS are changed)
- Types of jump instruction: jmp, je, jne, jge, jle, …
- Conditional jump: jump to the label .L2 if (SF == 1 or ZF == 1), that is, less than or equal (EIP = .L2). Otherwise, go to the next instruction (EIP = EIP + length of the current instruction).
  - Precisely, the signed less-or-equal condition used by jle is (ZF == 1 or SF != OF), which also covers the overflow case.
- Example of a logic instruction
- switch statement: an extension of the "if else" statement

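The relation between cmp, the flags, and a conditional jump can be sketched in C as follows. The sketch computes ZF, SF, OF, and CF the way a 32-bit subtraction sets them and then evaluates the jle condition. All names (Flags, toy_cmpl, jle_taken, naive_le) are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int zf; /* zero flag: result is 0 */
    int sf; /* sign flag: top bit of the result */
    int of; /* overflow flag: signed overflow occurred */
    int cf; /* carry flag: unsigned borrow occurred */
} Flags;

/* Flags after "cmpl src, dst" (AT&T operand order), which computes
 * dst - src and discards the result. */
Flags toy_cmpl(int32_t src, int32_t dst)
{
    uint32_t a = (uint32_t)dst;
    uint32_t b = (uint32_t)src;
    uint32_t r = a - b;
    Flags f;

    f.zf = (r == 0);
    f.sf = (int)(r >> 31);
    /* Signed overflow on subtraction: the operands have different signs
     * and the result's sign differs from the sign of dst. */
    f.of = (int)(((a ^ b) & (a ^ r)) >> 31);
    f.cf = (a < b);
    return f;
}

/* jle: taken when dst <= src as signed numbers. */
int jle_taken(Flags f)
{
    return f.zf || (f.sf != f.of);
}

/* The simplified test (SF == 1 or ZF == 1), which ignores overflow. */
int naive_le(Flags f)
{
    return f.zf || f.sf;
}
```

The two tests agree for small operands but differ when the subtraction overflows: comparing INT32_MIN against 1 sets OF = 1 and SF = 0, so jle_taken reports the jump while naive_le does not.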
Instruction Detail (5/14)
Control Transfer Instruction: for
- while, do while statements: another form of the "for" statement

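A loop in assembly is only a compare plus a conditional jump back to a label. The C sketch below writes a "for" loop in that shape using goto, so each line corresponds to one instruction pattern. The function and label names are hypothetical, and the assembly in the comments is indicative rather than actual compiler output.

```c
/* Sum of 1..n written the way a compiler typically lowers
 * "for (i = 1; i <= n; i++) sum += i;". */
int sum_with_jumps(int n)
{
    int sum = 0;        /* movl $0, sum           */
    int i = 1;          /* movl $1, i             */

    goto test;          /* jmp  .L_test           */
body:
    sum += i;           /* addl i, sum            */
    i++;                /* incl i                 */
test:
    if (i <= n)         /* cmpl n, i              */
        goto body;      /* jle  .L_body           */
    return sum;         /* result returned in eax */
}
```

A while loop has the same shape; a do while loop simply omits the initial jump to the test.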
Instruction Detail (6/14)
Control Transfer Instruction: stack revisit
- Stack operation: push and pop
- Stack management: bottom and top

Instruction Detail (7/14)
Control Transfer Instruction: function
(Figure: stack frame for main, from higher to lower addresses: arguments 222 and 111, ret. address, saved ebp (EBP points here), local variables a and b)
- push: decrease ESP, then put the operand on the stack (cf. movl $222, 4(%esp))
- call: push EIP, then jump to the operand (EIP = func1)
- leave: ESP = EBP, then pop into EBP (this eventually pops the local variables and the saved ebp from the stack)
- ret: pop and set the value into EIP (EIP = return address)
- After the return: pop the arguments from the stack; the return value is in eax
- 64 bit CPU: makes use of registers to pass parameters (rdi, rsi, rdx, rcx, r8, r9)

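The stack steps listed above can be traced with a toy model in C. This is a sketch under simplifying assumptions: a 256-byte stack, byte offsets used as addresses, and hypothetical names (Cpu, push32, do_call, and so on); it models only ESP, EBP, and EIP.

```c
#include <stdint.h>

#define STACK_BYTES 256u

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* toy stack memory, one word per slot */
    uint32_t esp;                  /* byte offset of the top of the stack */
    uint32_t ebp;
    uint32_t eip;
} Cpu;

/* Empty stack: ESP starts at the bottom (highest address). */
void cpu_init(Cpu *c) { c->esp = STACK_BYTES; c->ebp = 0; c->eip = 0; }

uint32_t load32(const Cpu *c, uint32_t addr) { return c->mem[addr / 4]; }

/* push: decrease ESP, then store the operand. */
void push32(Cpu *c, uint32_t v) { c->esp -= 4; c->mem[c->esp / 4] = v; }

/* pop: load from the top, then increase ESP. */
uint32_t pop32(Cpu *c) { uint32_t v = load32(c, c->esp); c->esp += 4; return v; }

/* call: push the return address, then jump to the target. */
void do_call(Cpu *c, uint32_t target, uint32_t return_addr)
{
    push32(c, return_addr);
    c->eip = target;
}

/* Function prologue: pushl %ebp; movl %esp, %ebp; subl $locals, %esp. */
void do_prologue(Cpu *c, uint32_t local_bytes)
{
    push32(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= local_bytes;
}

/* leave: ESP = EBP, then pop the saved EBP. */
void do_leave(Cpu *c) { c->esp = c->ebp; c->ebp = pop32(c); }

/* ret: pop the return address into EIP. */
void do_ret(Cpu *c) { c->eip = pop32(c); }
```

After pushing 222 and 111, calling, and running the prologue, the first argument sits at 8(%ebp) and the second at 12(%ebp), because the saved ebp and the return address occupy the two slots in between.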
Instruction Detail (8/14) Control Transfer Instruction: stack frame illustration
Instruction Detail (9/14)
Practice 1: function example
- result = asm_sum(final_number), written in assembly language
- .text directive: declares the text section (the following instructions reside in the text section)
- .global directive: declares "asm_sum" visible to the linker
- Memory addressing: displacement(base) or displacement(base, index, scale)

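The addressing form displacement(base, index, scale) denotes the address displacement + base + index * scale. A minimal C sketch of that calculation, with a hypothetical function name and 32-bit wrap-around arithmetic, is:

```c
#include <stdint.h>

/* Effective address for the AT&T operand displacement(base, index, scale).
 * On IA the scale must be 1, 2, 4, or 8; this sketch does not check it.
 * For the short form displacement(base), pass index = 0. */
uint32_t effective_address(int32_t displacement, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return (uint32_t)displacement + base + index * scale;
}
```

For example, 8(%ebp) with ebp = 0x1000 refers to 0x1008, -4(%ebp) refers to 0xFFC, and (%ebx, %esi, 4) with ebx = 0x2000 and esi = 3 refers to 0x200C, which is how element 3 of an array of 4-byte values is reached.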
Instruction Detail (10/14)
Execution results of Practice 1
Use the "make" utility when there are a bunch of files.

Instruction Detail (11/14)
Practice 2: Standalone assembly program (page 29)
- .data directive: declares the data section
- .long directive: initializes a 4-byte memory space (address, initial value, expression, …)
- .string directive: initializes a string (array of characters)

Instruction Detail (12/14)
Directive
- Meta-statements (pseudo-instructions)
- Used for giving information to the assembler (they affect how the assembler operates and are not directly executed on the CPU)
- Begin with . (period)
- Representative directives
  - .file, .include
  - .text, .data, .comm, .section
  - .long, .byte, .string, .ascii, .float, .quad
  - .global, .align, .size
  - .set, .equ, .rept, .space
  - .macro, .endm
  - .if, .else, .endif
  - .cfi_startproc, .cfi_endproc for debugging
  - …
- Refer to "GNU assembler" on the lecture site or "info as" in the Linux shell

Instruction Detail (13/14)
Software Interrupt
- write() system call
(Figure annotations: arguments, system call index, IDT table index)

Instruction Detail (14/14)
Software Interrupt (cont')
- Interrupt and system call handling
Kernel IDT
    0x0    divide_error()
           debug()
           nmi()
           ….
    0x80   system_call()
           ….
sys_call_table (sysent[])
    0      sys_no_syscall()
    1      sys_exit()
    2      sys_fork()
    3      sys_read()
    4      sys_write()
    ….
    47     ….
    ….
    255    sys_no_syscall()
(Handler functions shown in the figure: system_call(), sys_fork(), sys_write(), sys_getpid())
64 bit CPU: fast system call instructions are used instead of "int" ("syscall" in 64-bit mode; "sysenter" for 32-bit code on Intel)

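The dispatch step, in which system_call() uses the number in eax as an index into sys_call_table, can be sketched in C with an array of function pointers. Everything here is a toy model with hypothetical names: the handlers do no real I/O, and only the slot numbers 1 (exit) and 4 (write) follow the table above.

```c
#define TOY_NR_SYSCALLS 256

typedef long (*toy_syscall_fn)(long ebx, long ecx, long edx);

/* Default entry for unused slots. */
long toy_no_syscall(long ebx, long ecx, long edx)
{
    (void)ebx; (void)ecx; (void)edx;
    return -1;
}

/* Toy exit handler: just hands back the status it was given. */
long toy_exit(long status, long unused1, long unused2)
{
    (void)unused1; (void)unused2;
    return status;
}

/* Toy write handler: pretends every byte was written. */
long toy_write(long fd, long buf, long len)
{
    (void)fd; (void)buf;
    return len;
}

toy_syscall_fn toy_sys_call_table[TOY_NR_SYSCALLS];

void toy_table_init(void)
{
    int i;
    for (i = 0; i < TOY_NR_SYSCALLS; i++)
        toy_sys_call_table[i] = toy_no_syscall;
    toy_sys_call_table[1] = toy_exit;
    toy_sys_call_table[4] = toy_write;
}

/* Models what the handler behind "int $0x80" does: eax selects the
 * table entry, and ebx, ecx, edx carry the arguments. */
long toy_system_call(long eax, long ebx, long ecx, long edx)
{
    if (eax < 0 || eax >= TOY_NR_SYSCALLS)
        return -1;
    return toy_sys_call_table[eax](ebx, ecx, edx);
}
```

After toy_table_init(), a call such as toy_system_call(4, 1, 0, 12) reaches toy_write, mirroring a write() of 12 bytes to file descriptor 1 with eax = 4, ebx = 1, and edx = 12.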
Summary
- Understand ISA
- Know the IA register, memory, and instruction models
- Learn the format of IA instructions
  - label, opcode, operands, comments
- Learn the types of IA opcodes
  - mov, add, cmp, jmp, push, call, ret, int, …
Homework 6: Write an assembly program
Requirements
- print the multiples of 4 in the range 1 ~ 100 (using the loop from Practice 2 on page 29)
- use a function
- show the student's ID and the date (using whoami and date)
- hand in a report that includes the code, a snapshot, and a discussion
Warning: DO NOT use the "gcc -S" option (easily detected)

Appendix 1: Assembly in Windows
AT&T vs. MASM

Appendix 1: Assembly in Windows (cont')
AT&T vs. MASM

Appendix 2: Assembly in x64
Assembly in a 64 bit OS
- Register extension (64 bit), new registers
- Register-based argument passing
- Use of PIC (Position Independent Code)