Computer Architecture and Assembly Language Practical Session 1
Data Representation Basics
• Bit - basic information unit: (1/0)
• Byte - sequence of 8 bits, numbered 7 (MSB, Most Significant Bit) down to 0 (LSB, Least Significant Bit)
• Main Memory is an array of bytes, addressed by 0 to 2^32 - 1 = 0xFFFFFFFF
  2^32 bytes = 4∙2^(10∙3) bytes = 4 G bytes
  (in general, K address bits give an address space of 0 … 2^K - 1)
• Word - a sequence of bits addressed as a single entity by the computer (e.g. a 16-bit word occupies 2 bytes of physical memory)
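The bit numbering and the address-space arithmetic above can be checked with a small C sketch. The helper names bit_of and address_space_bytes are made up for this illustration and are not part of the course material.

```c
#include <stdint.h>

/* Hypothetical helper: return bit i of a byte (0 = LSB, 7 = MSB). */
unsigned bit_of(uint8_t byte, unsigned i)
{
    return (byte >> i) & 1u;
}

/* Hypothetical helper: number of bytes addressable with `bits`
   address bits. Assumes bits <= 63 so the shift is well defined. */
uint64_t address_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}
```

With 32 address bits the result is 4∙2^30 bytes, and the highest address is that value minus one, 0xFFFFFFFF.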
Registers
Register file - CPU unit which contains (32-bit) registers.
• general purpose registers: EAX, EBX, ECX, EDX (Accumulator, Base, Counter, Data)
  - E stands for Extended: the low 16 bits of EAX form the 16-bit register AX, which splits into a high byte AH and a low byte AL (likewise for EBX, ECX, EDX)
• index registers: ESP, EBP, ESI, EDI (Stack pointer - contains the address of the last used dword in the stack, Base pointer, Source index, Destination index)
• flag register / status register: EFLAGS
• Instruction Pointer / Program Counter: EIP / EPC
  - contains the address (offset) of the next instruction that is going to be executed (at run time)
  - changed by unconditional jump, procedure call, and return instructions
Note that the list of registers above is partial.
Assembly Language Program
• consists of a series of processor instructions, meta-statements, comments, and data
• translated by an assembler into machine language instructions (binary code) that can be loaded into memory and executed
• NASM - Netwide Assembler - is an assembler for the x86 architecture
Example:
assembly code: MOV AL, 61h ; load AL with 97 decimal (61 hex)
binary code: 10110000 01100001
  1011 - binary code (opcode) of instruction 'MOV'
  0 - specifies if data is byte ('0') or full size 16/32 bits ('1')
  000 - binary identifier for register 'AL'
  01100001 - binary representation of 97 decimal (97/16 = 6 and 97%16 = 1, so 97d = 6*16 + 1 = 61h)
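The field layout of the example (opcode 1011, size bit, register identifier, then the immediate byte) can be reproduced in C. This is only a sketch of that single instruction form: the names reg8 and encode_mov_r8_imm8 are invented here, and only four of the 8-bit register codes are listed.

```c
#include <stdint.h>

/* 3-bit identifiers of some 8-bit registers (AL is 000, as on the slide). */
enum reg8 { REG_AL = 0, REG_CL = 1, REG_DL = 2, REG_BL = 3 };

/* Hypothetical helper: build the two bytes of "mov reg8, imm8".
   Byte 0 = 1011 (MOV opcode) | 0 (byte-sized data) | 3-bit register id.
   Byte 1 = the immediate value itself. */
void encode_mov_r8_imm8(enum reg8 reg, uint8_t imm, uint8_t out[2])
{
    out[0] = (uint8_t)(0xB0u | (0u << 3) | (unsigned)reg);
    out[1] = imm;
}
```

Encoding REG_AL with the value 97 gives the bytes 0xB0 0x61, that is 10110000 01100001.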
Basic Assembly Instruction Structure
label: (pseudo) instruction operands ; comment
• label and comment are optional fields; operands are either required or forbidden by an instruction
• each instruction has its offset (address) in RAM
• we mark an instruction with a label to refer to it in the code
  - (non-local) labels have to be unique
  - an instruction that follows a label can be on the same or the next line
  - the colon is optional
Examples:
mov ax, 2 ; moves constant 2 to the register ax
buffer: resb 64 ; reserves 64 bytes
Notes:
- backslash (\): if a line ends with a backslash, the next line is considered to be a part of the backslash-ended line
- no restrictions on white space within a line
A label stands for an address. If buffer (64 bytes) is located at address 2048:
mov buffer, 2 = mov 2048, 2 ; invalid: the destination cannot be an immediate
mov [buffer], 2 = mov [2048], 2 ; invalid: the operand size is not specified
mov byte [buffer], 2 = mov byte [2048], 2 ; valid: stores the byte 2 at address 2048
Instruction Arguments
A typical instruction has 2 operands:
- target operand (left)
- source operand (right)
3 kinds of operands exist:
- immediate: value
- register: AX, EBP, DL etc.
- memory location: variable or pointer
Examples:
mov ax, 2 ; target operand: register, source operand: immediate
mov [buffer], ax ; target operand: memory location, source operand: register
! Note that the x86 processor does not allow both operands to be memory locations:
mov [var1], [var2] ; invalid
MOV - Move Instruction - copies source to destination
mov reg8/mem8(16,32), reg8/imm8(16,32)
(copies content of register / immediate (source) to register / memory location (destination))
mov reg8(16,32), reg8/mem8(16,32)
(copies content of register / memory location (source) to register (destination))
operands have to be of the same size
Examples:
mov eax, 0x2334AAFF ; reg32, imm32
mov [buffer], ax ; mem16, reg16
mov word [var], 2 ; mem16, imm16
Note that NASM doesn't remember the types of variables you declare. It will deliberately remember nothing about the symbol var except where it begins, and so you must explicitly code mov word [var], 2.
Basic Arithmetical Instruction
<instruction> reg8/mem8(16,32), reg8/imm8(16,32)
(source - register / immediate, destination - register / memory location)
<instruction> reg8(16,32), reg8/mem8(16,32)
(source - register / memory location, destination - register)
ADD - add integers
Example: add AX, BX ; (AX gets a value of AX+BX)
SUB - subtract integers
Example: sub AX, BX ; (AX gets a value of AX-BX)
ADC - add integers with carry (value of Carry Flag)
Example: adc AX, BX ; (AX gets a value of AX+BX+CF)
SBB - subtract with borrow (value of Carry Flag)
Example: sbb AX, BX ; (AX gets a value of AX-BX-CF)
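A common use of ADC is adding numbers wider than one register: ADD the low parts, then ADC the high parts so the carry propagates. The C sketch below models this for a 32-bit sum built from 16-bit halves. The function name add32_via_16 is made up, and cf is an ordinary variable standing in for the Carry Flag.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit addition done with 16-bit ADD + ADC. */
uint32_t add32_via_16(uint32_t a, uint32_t b)
{
    uint32_t lo = (a & 0xFFFFu) + (b & 0xFFFFu); /* add low words (ADD)      */
    uint32_t cf = lo >> 16;                      /* carry out of the low add */
    uint32_t hi = (a >> 16) + (b >> 16) + cf;    /* add high words + CF (ADC) */
    return ((hi & 0xFFFFu) << 16) | (lo & 0xFFFFu);
}
```

For example, 0x0001FFFF + 1 carries out of the low word, so the high word becomes 2 and the result is 0x00020000.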
Basic Arithmetical Instruction
<instruction> reg8/mem8(16,32)
(source / destination - register / memory location)
INC - increment integer
Example: inc AX ; (AX gets a value of AX+1)
DEC - decrement integer
Example: dec byte [buffer] ; ([buffer] gets a value of [buffer]-1)
Basic Logical Instructions
<instruction> reg8/mem8(16,32)
(source / destination - register / memory location)
NOT - one's complement negation - inverts all the bits
Example:
mov al, 11111110b
not al ; (AL gets a value of 00000001b)
       ; (11111110b + 00000001b = 11111111b)
NEG - two's complement negation - inverts all the bits, and adds 1
Example:
mov al, 11111110b
neg al ; (AL gets a value of not(11111110b)+1 = 00000001b+1 = 00000010b)
       ; (11111110b + 00000010b = 100000000b = 0 in 8 bits)
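The two negations can be modelled on an 8-bit value in C. The names not8 and neg8 are invented for this sketch. The casts back to uint8_t matter, because C promotes the operand to int before applying ~.

```c
#include <stdint.h>

/* One's complement: invert all 8 bits (models NOT on an 8-bit register). */
uint8_t not8(uint8_t x)
{
    return (uint8_t)~x;
}

/* Two's complement: invert all bits, then add 1 (models NEG). */
uint8_t neg8(uint8_t x)
{
    return (uint8_t)(~x + 1u);
}
```

A value plus its one's complement is always 11111111b, and a value plus its two's complement wraps around to 0 in 8 bits.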
Basic Logical Instructions
<instruction> reg8/mem8(16,32), reg8/imm8(16,32)
(source - register / immediate, destination - register / memory location)
<instruction> reg8(16,32), reg8/mem8(16,32)
(source - register / memory location, destination - register)
OR - bitwise or - bit at index i of the destination gets '1' if bit at index i of source or destination is '1'; otherwise '0'
Example:
mov al, 11111100b
mov bl, 00000010b
or AL, BL ; (AL gets a value 11111110b)
AND - bitwise and - bit at index i of the destination gets '1' if bits at index i of both source and destination are '1'; otherwise '0'
Example:
and AL, BL ; (with same values of AL and BL as in previous example, AL gets a value 0)
CMP - Compare Instruction - compares integers
CMP performs a 'mental' subtraction - affects the flags as if the subtraction had taken place, but does not store the result of the subtraction.
cmp reg8/mem8(16,32), reg8/imm8(16,32)
(source - register / immediate, destination - register / memory location)
cmp reg8(16,32), reg8/mem8(16,32)
(source - register / memory location, destination - register)
Examples:
mov al, 11111100b
mov bl, 00000010b
cmp al, bl ; (ZF (zero flag) gets a value 0)
mov al, 11111100b
mov bl, 11111100b
cmp al, bl ; (ZF (zero flag) gets a value 1)
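The 'mental' subtraction can be sketched in C as a function that computes the flags of dst - src and discards the difference. The struct cmp_flags and the function cmp8 are hypothetical names, and only four flags (ZF, CF, SF, OF) are modelled, for 8-bit operands.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the flags this sketch models. */
struct cmp_flags { bool zf, cf, sf, of; };

/* Model of "cmp dst, src" on 8-bit operands: subtract, keep only flags. */
struct cmp_flags cmp8(uint8_t dst, uint8_t src)
{
    uint8_t res = (uint8_t)(dst - src);  /* the result is never stored */
    struct cmp_flags f;
    f.zf = (res == 0);                   /* operands are equal          */
    f.cf = (dst < src);                  /* unsigned borrow             */
    f.sf = (res & 0x80u) != 0;           /* sign bit of the result      */
    /* signed overflow: operand signs differ and the result sign
       differs from the sign of dst */
    f.of = (((dst ^ src) & (dst ^ res)) & 0x80u) != 0;
    return f;
}
```

With the slide's values, cmp8(0xFC, 0x02) leaves ZF = 0 and cmp8(0xFC, 0xFC) leaves ZF = 1.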
JMP - unconditional jump
jmp label
JMP tells the processor that the next instruction to be executed is located at the label that is given as part of the jmp instruction.
Example:
mov eax, 1
inc_again:
inc eax
jmp inc_again ; this is an infinite loop!
mov ebx, eax ; this instruction is never reached from this code
J<Condition> - conditional jump
j<cond> label
• execution is transferred to the target instruction only if the specified condition is satisfied
• usually, the condition being tested is the result of the last arithmetic or logic operation
Example:
mov eax, 1
inc_again:
inc eax
cmp eax, 10
jne inc_again ; if eax != 10, go back to loop
The same loop written with je and jmp:
mov eax, 1
inc_again:
inc eax
cmp eax, 10
je end_of_loop ; if eax == 10, jump to end_of_loop
jmp inc_again ; go back to loop
end_of_loop:
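For readers more familiar with C, the first loop maps almost line for line onto a goto-based sketch. The function name count_to_ten is made up, and the variable eax is an ordinary C variable named after the register.

```c
/* Hypothetical C mirror of the cmp / jne loop. */
unsigned count_to_ten(void)
{
    unsigned eax = 1;       /* mov eax, 1                       */
inc_again:
    eax++;                  /* inc eax                          */
    if (eax != 10)          /* cmp eax, 10                      */
        goto inc_again;     /* jne inc_again                    */
    return eax;             /* falls through once eax equals 10 */
}
```

The loop body runs nine times, taking eax from 1 to 10.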
Jcc: Conditional Branch
Instruction        Description                                               Flags
JO                 Jump if overflow                                          OF = 1
JNO                Jump if not overflow                                      OF = 0
JS                 Jump if sign                                              SF = 1
JNS                Jump if not sign                                          SF = 0
JE / JZ            Jump if equal / Jump if zero                              ZF = 1
JNE / JNZ          Jump if not equal / Jump if not zero                      ZF = 0
JB / JNAE / JC     Jump if below / not above or equal / carry                CF = 1
JNB / JAE / JNC    Jump if not below / above or equal / not carry            CF = 0
JBE / JNA          Jump if below or equal / Jump if not above                CF = 1 or ZF = 1
JA / JNBE          Jump if above / Jump if not below or equal                CF = 0 and ZF = 0
JL / JNGE          Jump if less / Jump if not greater or equal               SF <> OF
JGE / JNL          Jump if greater or equal / Jump if not less               SF = OF
JLE / JNG          Jump if less or equal / Jump if not greater               ZF = 1 or SF <> OF
JG / JNLE          Jump if greater / Jump if not less or equal               ZF = 0 and SF = OF
JP / JPE           Jump if parity / Jump if parity even                      PF = 1
JNP / JPO          Jump if not parity / Jump if parity odd                   PF = 0
JCXZ               Jump if CX register is 0                                  CX = 0
JECXZ              Jump if ECX register is 0                                 ECX = 0
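The Flags column can be read as a set of boolean predicates. The C sketch below writes out a few of them. The struct jcc_flags and the predicate names are hypothetical, chosen to match the mnemonics. The "below/above" family tests unsigned order, while "less/greater" tests signed order.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the flags tested by the jumps below. */
struct jcc_flags { bool cf, zf, sf, of; };

/* Unsigned comparisons (below / above) */
bool jb_taken(struct jcc_flags f)  { return f.cf; }
bool jbe_taken(struct jcc_flags f) { return f.cf || f.zf; }
bool ja_taken(struct jcc_flags f)  { return !f.cf && !f.zf; }

/* Signed comparisons (less / greater) */
bool jl_taken(struct jcc_flags f)  { return f.sf != f.of; }
bool jle_taken(struct jcc_flags f) { return f.zf || (f.sf != f.of); }
bool jg_taken(struct jcc_flags f)  { return !f.zf && (f.sf == f.of); }
```

After comparing the bytes 11111100b and 00000010b the flags are CF = 0, ZF = 0, SF = 1, OF = 0, so both JA and JL are taken: 252 is above 2 as an unsigned number, while -4 is less than 2 as a signed number.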
d<size> - declare initialized data
d<size> initial value
Pseudo-instruction   <size> field       <size> value
DB                   byte               1 byte
DW                   word               2 bytes
DD                   double word        4 bytes
DQ                   quadword           8 bytes
DT                   tenbyte            10 bytes
DDQ                  double quadword    16 bytes
DO                   octoword           16 bytes
Examples:
var: db 0x55 ; define a variable 'var' of size byte, initialized by 0x55
var: db 0x55,0x56,0x57 ; three bytes in succession
var: db 'a' ; character constant 0x61 (ascii code of 'a')
var: db 'hello',13,10,'$' ; string constant
var: dw 0x1234 ; 0x34 0x12
var: dw 'A' ; 0x41 0x00 - complete to word
var: dw 'AB' ; 0x41 0x42
var: dw 'ABC' ; 0x41 0x42 0x43 0x00 - complete to word
var: dd 0x12345678 ; 0x78 0x56 0x34 0x12
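The byte orders in the dw and dd comments follow from x86 being little-endian: the least significant byte is stored at the lowest address. A minimal C sketch of that layout, with invented helper names store_le16 and store_le32:

```c
#include <stdint.h>

/* Hypothetical helper: lay out a 16-bit value as "dw" would,
   least significant byte first. */
void store_le16(uint16_t value, uint8_t out[2])
{
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)(value >> 8);
}

/* Hypothetical helper: lay out a 32-bit value as "dd" would. */
void store_le32(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i)); /* byte i = bits 8i..8i+7 */
}
```

Storing 0x12345678 this way produces the sequence 0x78 0x56 0x34 0x12, matching the dd example.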
Assignment 0
You get a simple program that receives a string from the user. Then, it calls a function (that you'll implement in assembly) that receives one string as an argument and should do the following:
• Convert lower case to upper case.
• Convert '(' into '<'.
• Convert ')' into '>'.
• Count the number of the non-letter characters, that is ANY character that is not 'a'->'z' or 'A'->'Z', including the '\n' character.
The function shall return the number of the non-letter characters in the string. The character conversion should be done in-place.
Example:
> 42: heLL() WorLd!
> 42: HELL<> WORLD!
> 9
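Before writing the assembly version, it can help to pin down the expected behaviour with a C reference. The sketch below is one reading of the requirements, not the official solution. The name do_str_reference is made up, and it assumes an ASCII, null-terminated string.

```c
/* Hypothetical C reference for the behaviour do_Str should have:
   converts in place and returns the number of non-letter characters. */
int do_str_reference(char *s)
{
    int non_letters = 0;

    for (; *s != '\0'; s++) {
        if (*s >= 'a' && *s <= 'z')
            *s = (char)(*s - 'a' + 'A');   /* lower case -> upper case */
        else if (*s == '(')
            *s = '<';
        else if (*s == ')')
            *s = '>';

        /* after conversion, a letter is exactly 'A'..'Z' */
        if (!(*s >= 'A' && *s <= 'Z'))
            non_letters++;
    }
    return non_letters;
}
```

On the example input (which ends with the '\n' that fgets keeps), the nine non-letters are '4', '2', ':', two spaces, '(', ')', '!' and '\n'.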
main.c
#include <stdio.h>
#define MAX_LEN 100 /* Maximal line size */

extern int do_Str(char*);

int main(void) {
    char str_buf[MAX_LEN];
    int counter = 0;

    fgets(str_buf, MAX_LEN, stdin);  /* Read user's command line string */
    counter = do_Str(str_buf);       /* Your assembly code function */
    printf("%s%d\n", str_buf, counter);
    return 0;
}
myasm.s
section .data                    ; data section, read-write
        an: DD 0                 ; this is a temporary var

section .text                    ; our code is always in the .text section
        global do_Str            ; makes the function appear in global scope
        extern printf            ; tell linker that printf is defined elsewhere
                                 ; (not used in the program)

do_Str:                          ; functions are defined as labels
        push ebp                 ; save Base Pointer (bp) original value
        mov ebp, esp             ; use base pointer to access stack contents
        pushad                   ; push all variables onto stack
        mov ecx, dword [ebp+8]   ; get function argument

;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE STARTS HERE ;;;;;;;;;;;;;;;;
        mov dword [an], 0        ; initialize answer
label_here:
        ; Your code goes somewhere around here...
        inc ecx                  ; increment pointer
        cmp byte [ecx], 0        ; check if byte pointed to is zero
        jnz label_here           ; keep looping until it is null terminated
;;;;;;;;;;;;;;;; FUNCTION EFFECTIVE CODE ENDS HERE ;;;;;;;;;;;;;;;;

        popad                    ; restore all previously used registers
        mov eax, [an]            ; return an (returned values are in eax)
        mov esp, ebp
        pop ebp
        ret
Running NASM
To assemble a file, you issue a command of the form
> nasm -f <format> <filename> [-o <output>] [-l listing]
Example:
> nasm -f elf myasm.s -o myelf.o
It creates the file myelf.o in ELF format (executable and linkable format).
We use the main.c file (written in C) to start our program, and sometimes also for input / output from a user. So to compile main.c with our assembly file we should execute the following command:
> gcc -m32 main.c myelf.o -o myexe.out
The -m32 option is used to comply with a 32-bit environment. The command creates the executable file myexe.out. In order to run it you should write its name on the command line (prefixed with ./ if the current directory is not in your PATH):
> myexe.out
How to run Linux from Windows
• Go to http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
• Download and run the PuTTY executable
• Use "lvs.cs.bgu.ac.il" or "lace.cs.bgu.ac.il" as the host name and click 'Open'
• Use your Linux username and password to log in to the lace server
• Go to http://www.cs.bgu.ac.il/facilities/labs.html
• Choose any free Linux computer
• Connect to the chosen computer by using "ssh -X cs302six1-4" (you may be asked for your password again)
• cd (change directory) to your working directory
Ascii table