assembly language source code --> assembler --> "machine code"

These slides may be freely used, distributed, and incorporated into other works.

To execute a program:
  1. put the "machine code" into memory
  2. jal __start (the OS does this)

The assembler's task:
  1. assign addresses
  2. generate "machine code"
  3. (architecture dependent) further translation of assembly language source code

previous architectures:
    1 assembly language instruction --> 1 machine code instruction

MIPS architecture:
    1 assembly language instruction --> 1 or more machine code instructions

This further translation is also called synthesis.

MIPS example of synthesis:
    add $8, $9, -16
becomes
    addi $8, $9, -16

Two operands in the source code
    add $8, $9
are expanded back out to become
    add $8, $8, $9

Integer multiplication and division each produce two 32-bit results:
  - integer division produces a quotient and a remainder
  - integer multiplication of two 32-bit operands produces a 64-bit result

MIPS hardware implements 2 extra registers (called HI and LO) to hold these results.

Here are 4 more MIPS instructions:
    mflo R
    mtlo R
    mfhi R
    mthi R

Reading the mnemonics: m = move, f = from, t = to, lo = register LO, hi = register HI.

multiplication

    mul $8, $9, $10
becomes
    mult $9, $10     # 64-bit product of $9 X $10 goes to HI (upper half) and LO (lower half)
    mflo $8

division

    div $8, $9, $10
becomes
    div $9, $10
    mflo $8          # quotient in LO

    rem $12, $13, $14
becomes
    div $13, $14
    mfhi $12         # remainder in HI
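The HI/LO behavior described above can be mimicked in a few lines of Python. This is only a sketch: the helper names (to_signed32, mult, div) are made up for illustration, and it assumes signed operands with division truncating toward zero.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mult(rs, rt):
    """Signed 32 x 32 multiply; returns (HI, LO) as 32-bit patterns."""
    product = (to_signed32(rs) * to_signed32(rt)) & ((1 << 64) - 1)
    return (product >> 32) & MASK32, product & MASK32

def div(rs, rt):
    """Signed divide; returns (HI, LO) = (remainder, quotient)."""
    a, b = to_signed32(rs), to_signed32(rt)
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q                      # truncate toward zero (assumed)
    r = a - q * b
    return r & MASK32, q & MASK32

hi, lo = mult(0x10000, 0x10000)     # 2^16 * 2^16 = 2^32 does not fit in 32 bits
print(hex(hi), hex(lo))             # mflo alone would lose the upper half
hi, lo = div(17, 5)
print(hi, lo)                       # remainder in HI, quotient in LO
```

The first call shows why two registers are needed: the product 2^32 has a nonzero upper half, so LO alone is not the full result.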

puts, putc, getc, and done are not TAL!

I/O is accomplished by requesting service from the operating system (OS). All architectures do this with a single instruction. On MIPS, this instruction is

    syscall          (no operands)

To help the OS distinguish what service is required, $v0 ($2) is set (note that this is specific to our simulator):

    $v0    I/O operation
    ---    -------------
     11    putc
     12    getc
      4    puts
     10    done

synthesis of
    puts $8

first pass:
    li   $2, 4
    move $4, $8
    syscall

final synthesis:
    addi $2, $0, 4
    add  $4, $8, $0
    syscall

    lw $8, X
becomes
    la $8, X
    lw $8, 0($8)

Oops! la must be synthesized.

synthesis of
    la $8, my_label
requires the address assigned for my_label. Every address is assigned by the assembler.

A 32-bit address is split into two 16-bit halves:

    | MS part (16 bits) | LS part (16 bits) |

    la $8, my_label
becomes
    lui $8, 0xMSpart
    ori $8, $8, 0xLSpart

after
    lui $8, 0xMSpart
$8 contains
    | MS part | 000...0 |
this is then logically ORed with
    | 000...0 | LS part |
due to the instruction
    ori $8, $8, 0xLSpart
resulting contents of $8:
    | MS part | LS part |

Synthesize
    lw $8, X
Assume X is assigned address 0xaaee0018.

first try:
    la $8, X
    lw $8, 0($8)

with synthesis of the la instruction:
    lui $8, 0xaaee
    ori $8, $8, 0x0018
    lw  $8, 0($8)
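The MS part / LS part split used for la can be sketched in Python. The function names here are hypothetical, and the second function simply replays what lui followed by ori does to the register.

```python
def synthesize_la(reg, address):
    """Return the two TAL instructions (as text) that load a 32-bit address."""
    ms = (address >> 16) & 0xFFFF   # most significant 16 bits
    ls = address & 0xFFFF           # least significant 16 bits
    return [f"lui ${reg}, 0x{ms:04x}", f"ori ${reg}, ${reg}, 0x{ls:04x}"]

def simulate_lui_ori(address):
    """Replay lui then ori on a register value to confirm the address is rebuilt."""
    ms = (address >> 16) & 0xFFFF
    ls = address & 0xFFFF
    reg = ms << 16                  # lui: MS part in the upper half, zeros below
    reg |= ls                       # ori: OR in the zero-extended LS part
    return reg

for line in synthesize_la(8, 0xaaee0018):
    print(line)
print(hex(simulate_lui_ori(0xaaee0018)))
```

Because lui leaves the lower 16 bits zero and ori zero-extends its immediate, the OR cannot disturb the upper half.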

Synthesize
    sb $12, X
Assume X is assigned address 0x080001a0.

Generate machine code for
    addi $8, $20, 15

From the TAL table: addi Rt, Rs, I
    Rt is $8
    Rs is $20
    I  is 15, which is 0000 0000 0000 1111 in 16 bits

    0010 00ss ssst tttt iiii iiii iiii iiii
    (op code, sssss, ttttt, 16-bit I)

    sssss is 10100 (for $20)
    ttttt is 01000 (for $8)

    0010 0010 1000 1000 0000 0000 0000 1111
    in hex: 0x2288000f

Generate machine code for
    lw $8, 12($sp)

From the TAL table: lw Rt, I(Rb)
    Rt is $8
    Rb is $sp (which is $29)
    I  is 12

    1000 11bb bbbt tttt iiii iiii iiii iiii
    (op code, bbbbb, ttttt, 16-bit I)

    bbbbb is 11101
    ttttt is 01000
    I is the value 12, or 0000 0000 0000 1100

    1000 1111 1010 1000 0000 0000 0000 1100
    in hex: 0x8fa8000c
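Both encodings above follow the same layout (6-bit op code, two 5-bit register fields, 16-bit I), so they can be checked with one small helper. This is a sketch; encode_i_type is a made-up name, not part of any assembler.

```python
def encode_i_type(opcode, rs, rt, imm):
    """Pack op code (6 bits), Rs/Rb (5), Rt (5), and I (16) into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

ADDI = 0b001000
LW = 0b100011

print(f"0x{encode_i_type(ADDI, 20, 8, 15):08x}")   # addi $8, $20, 15
print(f"0x{encode_i_type(LW, 29, 8, 12):08x}")     # lw $8, 12($sp)
```

Masking the immediate with 0xFFFF also lets a negative I (such as -16) be stored as its 16-bit 2's complement pattern.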

assembly language source code --> assembler (assign addresses, produce machine code) --> memory image

Problem: forward references

    .text
                   beq $8, $11, later_in_code
                   . . .
    later_in_code: lw $20, X

    .data
    X: .word 16

Simple solution: 2-pass assembler
    first pass:  (MIPS-only) MAL --> TAL synthesis
                 assign all addresses
    second pass: produce all machine code

More complex and more efficient: 1-pass assembler
    Keep a list of instructions that cannot be completed due to yet-to-be-assigned addresses. As addresses are assigned, check the list and complete instructions.
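The 1-pass idea (keep a list of incomplete instructions, complete them as addresses are assigned) might look roughly like the following. The tuple format, the function name one_pass, and the starting address are assumptions made only for this sketch; no real machine code is produced.

```python
def one_pass(instructions, start=0x00800000):
    """instructions: list of (label or None, mnemonic, target label or None).

    Returns (out, pending), where out holds [address, mnemonic, target address]
    and pending maps still-unknown labels to the indices waiting on them.
    """
    symbols, pending, out = {}, {}, []
    addr = start
    for label, mnemonic, target in instructions:
        if label is not None:
            symbols[label] = addr
            # complete every earlier instruction that was waiting on this label
            for index in pending.pop(label, []):
                out[index][2] = addr
        resolved = symbols.get(target) if target is not None else None
        if target is not None and resolved is None:
            pending.setdefault(target, []).append(len(out))   # forward reference
        out.append([addr, mnemonic, resolved])
        addr += 4
    return out, pending

code = [
    (None, "beq", "later_in_code"),     # forward reference
    (None, "add", None),
    ("later_in_code", "lw", None),
]
out, pending = one_pass(code)
for addr, mnemonic, target in out:
    print(f"0x{addr:08x}", mnemonic, None if target is None else f"0x{target:08x}")
```

Anything left in pending at the end of the file is a reference to a label that was never defined.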

Assigning all addresses (and remembering them) implies the use of a table that holds the mapping between labels and addresses. This table is called a symbol table.

As the assembler works on the source code, it scans the characters in the file.

The scanner (a SW module) breaks a set of characters into significant groups known as tokens. Often, tokens are separated by white space or special punctuation.

    .data
    a1:   .word 3
    loop: lw $7, 4($6)
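A scanner along these lines can be sketched with a regular expression. The token classes chosen here (names with an optional trailing colon, numbers, and single punctuation characters) are an assumption for illustration, not the scanner of any particular assembler.

```python
import re

# name or directive or register (optionally ending in ':'), number, or punctuation
TOKEN = re.compile(r"[A-Za-z_.$][\w.$]*:?|-?(?:0x[0-9a-fA-F]+|\d+)|[(),:]")

def scan(line):
    """Break one source line into tokens, ignoring white space and # comments."""
    line = line.split("#", 1)[0]
    return TOKEN.findall(line)

print(scan("loop: lw $7, 4($6)"))
print(scan("a2: .word 16:4   # four copies of 16"))
```

White space never appears in the output because the pattern simply has no alternative that matches it.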

    .data
    a1:      .word 3
    a2:      .word 16:4
    a3:      .word 5

    .text
    __start: la $6, a2
    loop:    lw $7, 4($6)
             mult $9, $10
             b loop
             done

2 segments: code and data

The assembler places items into these 2 segments. So, it needs addresses. Use starting addresses of
    data  0x0040 0000
    code  0x0080 0000

The variable internal to the assembler that represents the next address to be assigned is the location counter.
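A first pass that only assigns addresses can be sketched as below, using the starting addresses given above. Several things are assumptions for the sake of the example: one location counter is kept per segment, la and done are each counted as 2 TAL instructions, every other instruction as 1, and the line format is simplified so that split() is enough.

```python
DATA_START, TEXT_START = 0x00400000, 0x00800000
TAL_WORDS = {"la": 2, "done": 2}    # pseudoinstructions that expand to 2 instructions here

def first_pass(lines):
    """Assign an address to every label; returns the symbol table as a dict."""
    symbols = {}
    counters = {".data": DATA_START, ".text": TEXT_START}   # location counters
    segment = ".text"
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] in counters:               # segment directive
            segment = fields[0]
            continue
        if fields[0].endswith(":"):             # label gets the current counter value
            symbols[fields[0][:-1]] = counters[segment]
            fields = fields[1:]
        if not fields:
            continue
        if fields[0] == ".word":                # ".word value" or ".word value:count"
            count = int(fields[1].split(":")[1]) if ":" in fields[1] else 1
            counters[segment] += 4 * count
        else:
            counters[segment] += 4 * TAL_WORDS.get(fields[0], 1)
    return symbols

program = [
    ".data",
    "a1: .word 3",
    "a2: .word 16:4",
    "a3: .word 5",
    ".text",
    "__start: la $6, a2",
    "loop: lw $7, 4($6)",
    "mult $9, $10",
    "b loop",
    "done",
]
for name, addr in first_pass(program).items():
    print(f"{name:8s} 0x{addr:08x}")
```

Note that loop lands at 0x00800008 rather than 0x00800004, because the la before it occupies two instruction words.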

TAL equivalent of code:

    .text
    __start: lui $6, 0x0040       # la $6, a2
             ori $6, $6, 0x0004
    loop:    lw $7, 4($6)
             mult $9, $10
             beq $0, $0, loop     # b loop
             ori $2, $0, 10       # done
             syscall

As a result of processing the entire .data section, the memory image will be

    address        contents
    0x0040 0000    0x0000 0003
    0x0040 0004    0x0000 0010
    0x0040 0008    0x0000 0010
    0x0040 000c    0x0000 0010
    0x0040 0010    0x0000 0010
    0x0040 0014    0x0000 0005

Machine code for
    la $6, a2

Synthesized (address from symbol table):
    (1) lui $6, 0x0040
    (2) ori $6, $6, 0x0004

(1) lui Rt, I
    Rt is $6

    0011 1100 000t tttt iiii iiii iiii iiii
    (lui op code, ttttt, 16-bit I)

    ttttt is 00110

    0011 1100 0000 0110 0000 0000 0100 0000
    in hex: 0x3c060040

Add to the memory image

    address        contents
    0x0080 0000    0x3c06 0040

Machine code for
    ori $6, $6, 0x0004

(2) ori Rt, Rs, I
    Rt is $6
    Rs is $6

    0011 01ss ssst tttt iiii iiii iiii iiii
    (ori op code, sssss, ttttt, 16-bit I)

    ttttt is 00110
    sssss is 00110

    0011 0100 1100 0110 0000 0000 0000 0100
    in hex: 0x34c60004

Add it to the memory image as well, updating the location counter

    address        contents
    0x0080 0000    0x3c06 0040
    0x0080 0004    0x34c6 0004

Scanning on, machine code for
    lw $7, 4($6)

lw Rt, I(Rb)
    Rt is $7
    Rb is $6
    I  is 4

    1000 11bb bbbt tttt iiii iiii iiii iiii
    (op code, bbbbb, ttttt, 16-bit I)

    1000 1100 1100 0111 0000 0000 0000 0100
    in hex: 0x8cc70004

Add it to the memory image as well, updating the location counter

    address        contents
    0x0080 0000    0x3c06 0040    (lui)
    0x0080 0004    0x34c6 0004    (ori)
    0x0080 0008    0x8cc7 0004    (lw)

next comes
    mult $9, $10

    Rs is 01001
    Rt is 01010
    Rd is unused (00000)

    0000 00ss ssst tttt 0000 0000 0001 1000
    (op code, sssss, ttttt, then fixed bits)

    0000 0001 0010 1010 0000 0000 0001 1000
    in hex: 0x012a0018

op code 000000 is used for any arithmetic or logical instruction with 3 register operands

    0000 00ss ssst tttt dddd d??? ???? ????

The bits marked ? select which operation.
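The op code 000000 layout can be checked with a small helper. The field names shamt and funct are the usual MIPS names for the last 11 bits (5 + 6) and are not taken from these slides; encode_r_type is a made-up name.

```python
def encode_r_type(rs, rt, rd, shamt, funct):
    """Pack an op code 000000 instruction: Rs, Rt, Rd (5 bits each), shamt (5), funct (6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

MULT_FUNCT = 0b011000   # selects mult
ADD_FUNCT = 0b100000    # selects add

print(f"0x{encode_r_type(9, 10, 0, 0, MULT_FUNCT):08x}")   # mult $9, $10
print(f"0x{encode_r_type(8, 9, 8, 0, ADD_FUNCT):08x}")     # add $8, $8, $9
```

Both instructions share the all-zero op code; only the low bits differ, which is how the hardware tells them apart.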

Add mult to the memory image as well, updating the location counter

    address        contents
    0x0080 0000    0x3c06 0040    (lui)
    0x0080 0004    0x34c6 0004    (ori)
    0x0080 0008    0x8cc7 0004    (lw)
    0x0080 000c    0x012a 0018    (mult)

b loop is a pseudoinstruction (must be synthesized)

Many translations are possible:
    beq  $0, $0, loop
    bgez $0, loop
    blez $0, loop
    j    loop

    beq $0, $0, loop

    Rs is 00000
    Rt is 00000

    0001 00ss ssst tttt iiii iiii iiii iiii
    (op code, sssss, ttttt, 16-bit I)

I is a derivation of an offset.

At run (execution) time, for a taken branch:

    I                           (from instruction)
    I || 00                     (concatenate)
    I || 00, sign extended      (sign extend to 32 bits)
    + PC

The I computed by the assembler relies on

    byte difference = target address - branch address

except, when the PC (the branch address!) is used (at execution time), the PC update step (of the fetch and execute cycle) has already been completed. So,

    byte difference = target address - (branch address + 4)

From the symbol table, the target is loop, at 0x0080 0008. The beq is at 0x0080 0010.

    byte offset = 0x00800008 - (0x00800010 + 4)

      0000 0000 1000 0000 0000 0000 0000 1000
    - 0000 0000 1000 0000 0000 0000 0001 0100

(can't do this in unsigned, so convert to 2's complement)

The additive inverse of 0000 0000 1000 0000 0000 0000 0001 0100 is

      1111 1111 0111 1111 1111 1111 1110 1100

      0000 0000 1000 0000 0000 0000 0000 1000
    + 1111 1111 0111 1111 1111 1111 1110 1100
    -----------------------------------------
      1111 1111 1111 1111 1111 1111 1111 0100

This represents -12, which is the byte offset to be added to the PC to form the new (correct) target PC.

Recall that, at run (execution) time, for a taken branch:

    I                           (from instruction)
    I || 00                     (concatenate)
    I || 00, sign extended      (sign extend to 32 bits)
    + PC

Instructions are all 32 bits = 4 bytes, so the addresses of all instructions are of the form
    xx...xx00
for example, 0, 4, 8, 12, 16, ...

So, remove the zeros at assembly time and put them back in at execution time: 18 bits of offset for 16 bits of instruction space.

Back to the beq instruction:

    -12 is 1111 1111 1111 1111 1111 1111 1111 0100

With the 2 least significant bits (00) eliminated, the 16-bit I field of the instruction is

    1111 1111 1111 1101

    0001 00ss ssst tttt iiii iiii iiii iiii
    0001 0000 0000 0000 1111 1111 1111 1101
    in hex: 0x1000fffd
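The assembler's side of this calculation can be sketched in Python. The function name branch_field and the range check are illustrative additions; the arithmetic is the byte difference from the branch address + 4, with the two low zero bits dropped.

```python
def branch_field(target, branch_addr):
    """Compute the 16-bit I field for a branch at branch_addr going to target."""
    byte_offset = target - (branch_addr + 4)     # PC has already been updated
    if byte_offset % 4 != 0:
        raise ValueError("instruction addresses must be multiples of 4")
    word_offset = byte_offset >> 2               # drop the two low zero bits
    if not -(1 << 15) <= word_offset < (1 << 15):
        raise ValueError("branch target out of range for a 16-bit field")
    return word_offset & 0xFFFF                  # 16-bit 2's complement pattern

BEQ = 0b000100
field = branch_field(0x00800008, 0x00800010)     # loop is at ...08, beq at ...10
word = (BEQ << 26) | (0 << 21) | (0 << 16) | field
print(f"I field: 0x{field:04x}")
print(f"beq $0, $0, loop: 0x{word:08x}")
```

The range check reflects the 18-bits-in-16 trade: a signed 16-bit word offset covers roughly 128K bytes on either side of the branch.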

Add beq to the memory image, updating the location counter

    address        contents
    0x0080 0000    0x3c06 0040    (lui)
    0x0080 0004    0x34c6 0004    (ori)
    0x0080 0008    0x8cc7 0004    (lw)
    0x0080 000c    0x012a 0018    (mult)
    0x0080 0010    0x1000 fffd    (beq)

If the I field is a 2's complement value, it is
    1111 1111 1111 1101
Its additive inverse is
    0000 0000 0000 0010 + 1 = 0000 0000 0000 0011   (+3)
So, -3 is represented.

The TAL code:
    loop: lw
          mult
          beq
          next instr

From next instr (where the PC points when the branch executes), loop is -3 instructions away.
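The execution-time side (interpret I as 2's complement, append 00, add to the updated PC) can be sketched the same way. The name branch_target is hypothetical, and the sketch assumes the PC has been advanced by 4 before the offset is added, as described above.

```python
def branch_target(instruction, branch_addr):
    """Recover the target address of a taken branch from its 16-bit I field."""
    imm = instruction & 0xFFFF
    if imm & 0x8000:                 # sign bit set: interpret as negative
        imm -= 0x10000
    byte_offset = imm << 2           # I || 00
    return (branch_addr + 4 + byte_offset) & 0xFFFFFFFF

target = branch_target(0x1000fffd, 0x00800010)
print(f"0x{target:08x}")
```

Running it on the beq word from the memory image gives back the address assigned to loop.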

done is a pseudoinstruction. In TAL, it becomes

    ori $2, $0, 10
    syscall

    ori $2, $0, 10

    Rt is 00010
    Rs is 00000
    I  is 10

    0011 01ss ssst tttt iiii iiii iiii iiii
    (op code, sssss, ttttt, 16-bit I)

    0011 0100 0000 0010 0000 0000 0000 1010
    in hex: 0x3402000a

Add ori to the memory image, updating the location counter

    address        contents
    0x0080 0000    0x3c06 0040    (lui)
    0x0080 0004    0x34c6 0004    (ori)
    0x0080 0008    0x8cc7 0004    (lw)
    0x0080 000c    0x012a 0018    (mult)
    0x0080 0010    0x1000 fffd    (beq)
    0x0080 0014    0x3402 000a    (ori)

syscall is the easiest instruction of all (all bits are defined)

    0000 0000 0000 0000 0000 0000 0000 1100
    in hex: 0x0000000c

The complete memory image (of the code)

    address        contents
    0x0080 0000    0x3c06 0040    (lui)
    0x0080 0004    0x34c6 0004    (ori)
    0x0080 0008    0x8cc7 0004    (lw)
    0x0080 000c    0x012a 0018    (mult)
    0x0080 0010    0x1000 fffd    (beq)
    0x0080 0014    0x3402 000a    (ori)
    0x0080 0018    0x0000 000c    (syscall)

         j L1
         . . .          # more code here
    L1:

machine code for j:

    | 0000 10 | most of an address |
      6 bits         26 bits

target address calculation at run time

    instruction:  | 0000 10 | most of an address (26 bits) |

    32-bit target address = PC[31..28] || 26 bits || 00

Think of memory as divided into 16ths. A jump can reach any instruction within the 1/16th of memory selected by the upper 4 bits of the PC.

Addresses to that 1/16th of memory are all of the form

    XXXX ??...?? 00
    (XXXX is fixed; ??...?? is the 26 bits)

How big is 1/16th of memory?

    2^28 bytes, or 2^26 words, which is 64M words

Is it big enough?

So, if we have address L1:

    | bits 31..28 | 26 bits of L1 | 00 |

If L1 is assigned address 0xa460005c, L1 in binary is

    1010 0100 0110 0000 0000 0000 0101 1100

Then the machine code for j L1 is

    000010 0100 0110 0000 0000 0000 0101 11
    in hex: 0x09180017
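Both directions of the jump encoding can be sketched in Python: the assembler keeps bits 27..2 of the address, and at run time the upper 4 bits of the PC and two zero bits are put back. The function names are made up, and the pc argument stands for whatever PC value supplies bits 31..28.

```python
J = 0b000010

def encode_j(target):
    """Keep bits 27..2 of the target address as the 26-bit field."""
    return (J << 26) | ((target >> 2) & 0x03FFFFFF)

def jump_target(instruction, pc):
    """PC[31..28] || 26 bits || 00"""
    return (pc & 0xF0000000) | ((instruction & 0x03FFFFFF) << 2)

word = encode_j(0xa460005c)
print(f"0x{word:08x}")
print(f"0x{jump_target(word, 0xa4600000):08x}")   # a PC in the same 1/16th of memory
```

If the PC were in a different 1/16th of memory (different upper 4 bits), the same instruction word would land on a different address, which is why the assembler needs the jump and its target in the same region.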

Summary, on the MIPS architecture:
  - branch instructions use an offset from the current PC at execution time to calculate the target address for a taken branch
  - jump instructions use part of an address together with implied other bits to form an address