CMPUT 229 - Computer Organization and Architecture I
Fall 2003, Topic 7: Floating Point
José Nelson Amaral

Reading Assignment

Representing Large and Small Numbers
How would you represent a number such as 6.023 × 10^23 in binary?
The range (10^23) of this number is greater than the range of the 32-bit representation that we have used for integers (2^31 ≈ 2.15 × 10^9).
However, the precision (6023) of this number is quite small, and can be expressed in a small number of bits.
The solution is to use a floating point representation. A floating point representation allocates some bits for the range of the value, some bits for the precision, and one bit for the sign.
From: Patt and Patel, pp. 32

Floating Point Representation
Most standard floating point representations use:
    1 bit for the sign (positive or negative)
    8 bits for the range (exponent field)
    23 bits for the precision (fraction field)

    bits:    1    8           23
    field:   S    exponent    fraction
From: Patt and Patel, pp. 33

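The 1/8/23 layout above can be pulled apart with shifts and masks. The following is a minimal sketch in C; the struct and function names are made up for illustration and do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 32-bit pattern. */
struct fp_fields {
    uint32_t sign;      /* 1 bit   */
    uint32_t exponent;  /* 8 bits  */
    uint32_t fraction;  /* 23 bits */
};

/* Split a 32-bit pattern into sign, exponent and fraction fields. */
struct fp_fields split_single(uint32_t bits)
{
    struct fp_fields f;
    f.sign     = bits >> 31;            /* top bit              */
    f.exponent = (bits >> 23) & 0xFFu;  /* next 8 bits          */
    f.fraction = bits & 0x7FFFFFu;      /* low 23 bits          */
    return f;
}
```

For instance, the pattern 1 10000001 1010100...0 is 0xC0D40000 in hexadecimal, and the sketch reports sign 1 and exponent field 129 for it.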
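Floating Point Representation (example)
    bits:    1    8           23
    field:   S    exponent    fraction

    1 10000001 10101000000000   (fraction shown truncated; the remaining bits are taken as 0)

The sign bit is 1, so the number is negative.
The exponent field is 10000001 = 129. Thus the exponent is given by: 129 - 127 = 2.
The value is therefore -1.10101 (binary) × 2^2 = -110.101 (binary) = -6.625.
From: Patt and Patel, pp. 34
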
Floating Point Representation (example)
What is the decimal value of the following floating point number?
    0 01111011 000000000000
exponent field = 64+32+16+8+2+1 = (128-8)+3 = 120+3 = 123
From: Patt and Patel, pp. 34

Floating Point Representation (example)
What is the decimal value of the following floating point number?
    0 10000011 00101000000000
exponent field = 128+2+1 = 131
From: Patt and Patel, pp. 35

Floating Point Representation (example)
What is the decimal value of the following floating point number?
    1 10000010 1000000000
exponent field = 128+2 = 130
From: Patt and Patel, pp. 35

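One way to check the three examples above is to apply the rule value = (-1)^S × 1.fraction × 2^(exponent field - 127) directly. The sketch below does this in C for normalized patterns only; the function name is an assumption, and the hexadecimal constants in the usage note are the slide patterns padded with zeros to 32 bits.

```c
#include <stdint.h>

/* Value of a normalized single-precision pattern (exponent field 1..254):
   (-1)^S * 1.fraction * 2^(exponent field - 127). */
double decode_normalized(uint32_t bits)
{
    uint32_t sign     = bits >> 31;
    int      exponent = (int)((bits >> 23) & 0xFFu) - 127;  /* remove the bias */
    uint32_t fraction = bits & 0x7FFFFFu;
    double   value;

    /* 1.fraction: the implicit leading 1 plus fraction / 2^23 */
    value = 1.0 + (double)fraction / 8388608.0;

    /* scale by 2^exponent with repeated doubling or halving */
    for (; exponent > 0; exponent--) value *= 2.0;
    for (; exponent < 0; exponent++) value /= 2.0;

    return sign ? -value : value;
}
```

With this sketch, 0x3D800000 (exponent field 123) decodes to 0.0625, 0x41940000 (exponent field 131) to 18.5, and 0xC1400000 (exponent field 130) to -12.0.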
Floating Point
What is the largest number that can be represented in 32-bit floating point using the IEEE 754 format above?
    0 11111110 11111111111111111111111
exponent field = 254
actual exponent = 254 - 127 = 127
The value is 1.11...1 (binary) × 2^127, which is just below 2^128 ≈ 3.4 × 10^38.
From: Patt and Patel, pp. 35

Floating Point
What is the smallest number (closest to zero) that can be represented in 32-bit floating point using the IEEE 754 format above?
    0 00000000 00000000000000000000001
exponent field = 0, which marks a denormalized number: the actual exponent is fixed at -126 and there is no implicit leading 1.
The value is 0.00...01 (binary) × 2^-126 = 2^-23 × 2^-126 = 2^-149 ≈ 1.4 × 10^-45.
From: Patt and Patel, pp. 35

Special Floating Point Representations
In the 8-bit field of the exponent we can represent numbers from 0 to 255. We studied how to read numbers with exponents from 0 to 254. What is the value represented when the exponent is 255 (i.e. 11111111 in binary)?
An exponent equal to 255 in a floating point representation indicates a special value.
When the exponent is equal to 255 and the fraction is 0, the value represented is infinity.
When the exponent is equal to 255 and the fraction is non-zero, the value represented is Not a Number (NaN).
Hen/Patt, pp. 301

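The rule for exponent 255 can be written as a small classifier. A minimal sketch follows; the enum and function names are hypothetical.

```c
#include <stdint.h>

enum fp_kind { KIND_FINITE, KIND_INFINITY, KIND_NAN };

/* Classify a single-precision pattern using the exponent-255 rule. */
enum fp_kind classify_single(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;

    if (exponent != 255u)
        return KIND_FINITE;                       /* exponent field 0..254 */
    return fraction == 0u ? KIND_INFINITY : KIND_NAN;
}
```

The sign bit does not affect the classification, so both 0x7F800000 and 0xFF800000 are reported as infinity.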
Double Precision
The 32-bit floating point representation is usually called single precision representation.
A double precision floating point representation requires 64 bits.
In double precision the following numbers of bits are used:
    1 sign bit
    11 bits for the exponent
    52 bits for the fraction (also called significand)

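The same shift-and-mask idea applies to the 1/11/52 layout. The sketch below uses made-up names and takes the 64-bit pattern as an integer.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 64-bit pattern. */
struct dp_fields {
    uint64_t sign;      /* 1 bit   */
    uint64_t exponent;  /* 11 bits */
    uint64_t fraction;  /* 52 bits */
};

/* Split a 64-bit pattern into sign, exponent and fraction fields. */
struct dp_fields split_double(uint64_t bits)
{
    struct dp_fields f;
    f.sign     = bits >> 63;
    f.exponent = (bits >> 52) & 0x7FFu;
    f.fraction = bits & 0x000FFFFFFFFFFFFFull;
    return f;
}
```

As a check, the double precision pattern for 1.0 is 0x3FF0000000000000, whose exponent field is 1023 and whose fraction is 0.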
Floating Point Addition (Decimal)
How do we perform the following addition?
    9.999 × 10^1 + 1.610 × 10^-1
Step 1: Align the decimal point of the number with the smaller exponent (notice the loss of precision):
    9.999 × 10^1 + 0.016 × 10^1
Step 2: Add the significands:
    9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
Step 3: Renormalize the result:
    10.015 × 10^1 = 1.0015 × 10^2
Step 4: Round off the result to the representation available:
    1.0015 × 10^2 = 1.002 × 10^2
Hen/Patt, pp. 281

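The four steps can be sketched in C with integer arithmetic. The format here is an assumption made for illustration: a four-digit significand d.ddd stored as an integer scaled by 1000, a power-of-ten exponent, and non-negative operands only.

```c
/* Hypothetical decimal format: sig holds d.ddd scaled by 1000 (1000..9999),
   exp is the power of ten. Only non-negative values are handled. */
struct dec_fp { int sig; int exp; };

struct dec_fp dec_add(struct dec_fp a, struct dec_fp b)
{
    struct dec_fp r;
    int shift;

    /* make a the operand with the larger exponent */
    if (a.exp < b.exp) { struct dec_fp t = a; a = b; b = t; }

    /* Step 1: align b to a's exponent; digits shifted out are lost */
    for (shift = a.exp - b.exp; shift > 0 && b.sig != 0; shift--)
        b.sig /= 10;

    /* Step 2: add the significands */
    r.sig = a.sig + b.sig;
    r.exp = a.exp;

    /* Steps 3 and 4: renormalize to d.ddd and round to four digits */
    if (r.sig >= 10000) {
        r.sig = (r.sig + 5) / 10;
        r.exp += 1;
    }
    return r;
}
```

Adding {9999, 1} and {1610, -1} follows the slide: the second operand becomes 0.016 × 10^1, the sum is 10.015 × 10^1, and the result is {1002, 2}, that is 1.002 × 10^2.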
Floating Point Addition (Example)
Convert the numbers 0.5 and -0.4375 (both decimal) to floating point binary representation, and then perform the binary floating point addition of these numbers.
Which number should have its significand adjusted?
Hen/Patt, pp. 283

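A possible way to work this exercise is sketched below in C, using a toy format chosen for illustration: a signed 4-bit significand 1.xxx stored as an integer scaled by 8, plus a power-of-two exponent. In that format 0.5 is 1.000 × 2^-1, written {8, -1}, and -0.4375 is -1.110 × 2^-2, written {-14, -2}. Rounding and guard bits are left out.

```c
#include <stdlib.h>

/* Hypothetical format: value = (sig / 8) * 2^exp, where sig is a signed
   significand scaled by 8 (1.000 -> 8, 1.110 -> 14). */
struct bin_fp { int sig; int exp; };

struct bin_fp bin_add(struct bin_fp a, struct bin_fp b)
{
    struct bin_fp r;
    int shift;

    /* make a the operand with the larger exponent */
    if (a.exp < b.exp) { struct bin_fp t = a; a = b; b = t; }

    /* shift the significand of the number with the smaller exponent right */
    for (shift = a.exp - b.exp; shift > 0; shift--)
        b.sig /= 2;                     /* truncates toward zero */

    /* add the significands */
    r.sig = a.sig + b.sig;
    r.exp = a.exp;

    /* normalize so that 8 <= |sig| < 16, i.e. the form 1.xxx */
    if (r.sig == 0) { r.exp = 0; return r; }
    while (abs(r.sig) >= 16) { r.sig /= 2; r.exp++; }
    while (abs(r.sig) < 8)   { r.sig *= 2; r.exp--; }
    return r;
}
```

Here the number with the smaller exponent, -0.4375, is the one whose significand is shifted: it becomes -0.111 × 2^-1, the sum is 0.001 × 2^-1, and normalization gives {8, -4}, that is 1.000 × 2^-4 = 0.0625.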
Floating Point Multiplication (Decimal)
Assume that we can only store four digits of the significand and two digits of the exponent in a decimal floating point representation.
How would you multiply 1.110 × 10^10 by 9.200 × 10^-5 in this representation?
Step 1: Add the exponents:
    new exponent = 10 + (-5) = 5
Step 2: Multiply the significands:
        1.110
      × 9.200
      -------
         0000
        0000
       2220
      9990
    ---------
    10.212000
Step 3: Normalize the product:
    10.212 × 10^5 = 1.0212 × 10^6
Step 4: Round off the product:
    1.0212 × 10^6 = 1.021 × 10^6
Hen/Patt, pp. 286

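The multiplication steps can be sketched the same way. The format is again an assumption for illustration: a four-digit significand scaled by 1000 and a power-of-ten exponent, with non-negative operands.

```c
/* Hypothetical decimal format: sig holds d.ddd scaled by 1000 (1000..9999),
   exp is the power of ten. Only non-negative values are handled. */
struct dec_fp { int sig; int exp; };

struct dec_fp dec_mul(struct dec_fp a, struct dec_fp b)
{
    struct dec_fp r;
    long product = (long)a.sig * (long)b.sig;   /* scaled by 1,000,000 */
    long divisor = 1000;

    /* Step 1: add the exponents */
    r.exp = a.exp + b.exp;

    /* Step 3: a product of 10.0 or more needs one more digit shifted out */
    if (product >= 10000000L) {
        divisor = 10000;
        r.exp += 1;
    }

    /* Step 4: round once to four significant digits */
    r.sig = (int)((product + divisor / 2) / divisor);
    if (r.sig == 10000) { r.sig = 1000; r.exp += 1; }   /* rounding carried out */
    return r;
}
```

Multiplying {1110, 10} by {9200, -5} gives the integer product 10212000 (Step 2), which normalizes and rounds to {1021, 6}, that is 1.021 × 10^6.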
MIPS Coprocessors
[Figure: MIPS coprocessors. Source: Hen/Patt, pp. A-50. Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.]

Floating Point in MIPS
MIPS supports the IEEE 754 single-precision and double-precision formats.
MIPS has a separate set of registers to store floating point operands: $f0, $f1, $f2, ...
In single precision, each individual register $f0, $f1, $f2, ... contains one single precision (32-bit) value.
In double precision, each pair of registers $f0-$f1, $f2-$f3, ... contains one double precision (64-bit) value.
Hen/Patt, pp. 288

Floating Point in MIPS
To load a value into a floating point register, MIPS offers the load word coprocessor instructions, lwcz. Because the floating point coprocessor is coprocessor number 1, the instruction is lwc1.
Similarly, to store the value of a floating point register into memory, MIPS offers the store word coprocessor instruction, swc1.
Hen/Patt, pp. 288

Floating Point Instruction in MIPS
What does the following assembly code do?
    lwc1  $f4, 4($sp)
    lwc1  $f6, 8($sp)
    add.s $f2, $f4, $f6
    swc1  $f2, 12($sp)
It reads two floating point values from the stack, performs their addition, and stores the result in the stack.
Hen/Patt, pp. 288

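For comparison, here is a rough C equivalent of the four instructions. It is only a sketch: sp is a hypothetical pointer to a stack area viewed as 32-bit floats, so word indices 1, 2 and 3 stand for byte offsets 4, 8 and 12.

```c
/* Rough C equivalent of the lwc1 / add.s / swc1 sequence.
   sp is a hypothetical pointer to the stack area, viewed as floats. */
void add_from_stack(float *sp)
{
    float f4 = sp[1];      /* lwc1  $f4, 4($sp)    */
    float f6 = sp[2];      /* lwc1  $f6, 8($sp)    */
    float f2 = f4 + f6;    /* add.s $f2, $f4, $f6  */
    sp[3] = f2;            /* swc1  $f2, 12($sp)   */
}
```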
Floating Point (example)
    void mm(double x[][32], double y[][32], double z[][32])
    {
        int i, j, k;
        for (i = 0; i != 32; i = i + 1)
            for (j = 0; j != 32; j = j + 1) {
                x[i][j] = 0.0;
                for (k = 0; k != 32; k = k + 1)
                    x[i][j] = x[i][j] + y[i][k] * z[k][j];
            }
    }

Parameter Passing Convention
    base of x[ ][ ]    $a0
    base of y[ ][ ]    $a1
    base of z[ ][ ]    $a2
Assumption
    i    $s0
    j    $s1
    k    $s2
Hen/Patt, pp. 294

Flowchart for mm (first version):
    i <- 0
    while i != 32:
        j <- 0
        while j != 32:
            x[i][j] <- 0.0
            k <- 0
            while k != 32:
                load x[i][j]
                load y[i][k]
                load z[k][j]
                d1 <- y[i][k] * z[k][j]
                d1 <- d1 + x[i][j]
                x[i][j] <- d1
                k <- k+1
            j <- j+1
        i <- i+1
    return
Do we need to load and store x[i][j] in every iteration of loop k?

Flowchart for mm (x[i][j] accumulated in d2):
    i <- 0
    while i != 32:
        j <- 0
        while j != 32:
            d2 <- 0.0
            k <- 0
            while k != 32:
                load y[i][k]
                load z[k][j]
                d1 <- y[i][k] * z[k][j]
                d2 <- d2 + d1
                k <- k+1
            x[i][j] <- d2
            j <- j+1
        i <- i+1
    return

Parameter Passing Convention
    base of x[ ][ ]    $a0
    base of y[ ][ ]    $a1
    base of z[ ][ ]    $a2
Assumption
    i    $s0
    j    $s1
    k    $s2

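The accumulator version of the flowchart corresponds to the following C sketch. The function name and the N macro are illustrative; d1 and d2 are the temporaries named in the flowchart.

```c
#define N 32

/* mm with the sum kept in a local accumulator: x[i][j] is written once
   per (i, j) instead of being loaded and stored in every k iteration. */
void mm_accumulate(double x[][N], double y[][N], double z[][N])
{
    int i, j, k;
    for (i = 0; i != N; i = i + 1)
        for (j = 0; j != N; j = j + 1) {
            double d2 = 0.0;                    /* d2 <- 0.0 */
            for (k = 0; k != N; k = k + 1) {
                double d1 = y[i][k] * z[k][j];  /* d1 <- y[i][k] * z[k][j] */
                d2 = d2 + d1;                   /* d2 <- d2 + d1 */
            }
            x[i][j] = d2;                       /* x[i][j] <- d2 */
        }
}
```

A quick sanity check is to pass the identity matrix as y, in which case x should come out equal to z.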
MIPS assembly (loop tests at the top of each loop):
          li    $t1, 32          # $t1 <- 32
          li    $s0, 0           # i <- 0
    L1:   beq   $s0, $t1, D1
          li    $s1, 0           # j <- 0
    L2:   beq   $s1, $t1, D2
          $f4 <- 0.0
          li    $s2, 0           # k <- 0
    L3:   beq   $s2, $t1, D3
          <loop body>
          addiu $s2, $s2, 1      # k <- k+1
          j     L3
    D3:   x[i][j] <- $f4
          addiu $s1, $s1, 1      # j <- j+1
          j     L2
    D2:   addiu $s0, $s0, 1      # i <- i+1
          j     L1
    D1:

Parameter Passing Convention
    base of x[ ][ ]    $a0
    base of y[ ][ ]    $a1
    base of z[ ][ ]    $a2
Assumption
    i    $s0
    j    $s1
    k    $s2

Flowchart for mm (loop tests moved to the bottom of each loop):
    i <- 0
    repeat
        j <- 0
        repeat
            d2 <- 0.0
            k <- 0
            repeat
                load y[i][k]
                load z[k][j]
                d1 <- y[i][k] * z[k][j]
                d2 <- d2 + d1
                k <- k+1
            until k = 32
            x[i][j] <- d2
            j <- j+1
        until j = 32
        i <- i+1
    until i = 32
    return

MIPS assembly (loop tests at the bottom of each loop):
          li    $t1, 32          # $t1 <- 32
          li    $s0, 0           # i <- 0
    L1:   li    $s1, 0           # j <- 0
    L2:   $f4 <- 0.0
          li    $s2, 0           # k <- 0
    L3:   <loop body>
          addiu $s2, $s2, 1      # k <- k+1
          bne   $s2, $t1, L3
          x[i][j] <- $f4
          addiu $s1, $s1, 1      # j <- j+1
          bne   $s1, $t1, L2
          addiu $s0, $s0, 1      # i <- i+1
          bne   $s0, $t1, L1

Parameter Passing Convention
    base of x[ ][ ]    $a0
    base of y[ ][ ]    $a1
    base of z[ ][ ]    $a2
Assumption
    i    $s0
    j    $s1
    k    $s2

The loop body
    load y[i][k]
    load z[k][j]
    d1 <- y[i][k] * z[k][j]
    d2 <- d2 + d1
How do we load y[i][k] into a floating point register?
First we have to consider how a 2-dimensional matrix of doubles is stored in memory:
    Address                        Element
    base of y[ ][ ]                y[0][0]
    base of y[ ][ ] + 8            y[0][1]
    ...                            y[0][2] ... y[0][31]
    base of y[ ][ ] + 8 × 32       y[1][0]
    ...                            y[1][1] ... y[1][31]
    ...                            y[31][0] ... y[31][31]
In general, the address of y[i][k] is given by:
    Addr(y[i][k]) = base of y[ ][ ] + (i × 32 + k) × 8

The loop body
In general, the address of y[i][k] is given by:
    Addr(y[i][k]) = base of y[ ][ ] + (i × 32 + k) × 8
MIPS assembly for load y[i][k]:
    L3:   sll   $t2, $s0, 5        # $t2 <- 32 × i
          addu  $t2, $t2, $s2      # $t2 <- 32 × i + k
          sll   $t2, $t2, 3        # $t2 <- (32 × i + k) × 8
          addu  $t2, $a1, $t2      # $t2 <- Addr(y[i][k])
          l.d   $f16, 0($t2)       # $f16 <- y[i][k]
Write the code to load z[k][j] in $f18.
MIPS assembly for load z[k][j]:
          sll   $t2, $s2, 5        # $t2 <- 32 × k
          addu  $t2, $t2, $s1      # $t2 <- 32 × k + j
          sll   $t2, $t2, 3        # $t2 <- (32 × k + j) × 8
          addu  $t2, $a2, $t2      # $t2 <- Addr(z[k][j])
          l.d   $f18, 0($t2)       # $f18 <- z[k][j]

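The address arithmetic can be cross-checked in C. This sketch assumes 8-byte doubles and a 32 × 32 row-major array; the function names are hypothetical. The shift version mirrors the sll/addu/sll sequence.

```c
#include <stddef.h>

/* Byte offset of element [row][col] in a 32x32 row-major array of
   8-byte doubles, written with multiplications. */
size_t offset_mul(size_t row, size_t col)
{
    return (row * 32 + col) * 8;
}

/* The same offset with shifts, as in the sll / addu / sll sequence:
   shifting left by 5 multiplies by 32, shifting left by 3 multiplies by 8. */
size_t offset_shift(size_t row, size_t col)
{
    return ((row << 5) + col) << 3;
}
```

Both functions give 256 for element [1][0], which matches the "base + 8 × 32" start of the second row.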
The loop body (cont.)
Once we have loaded y[i][k] into $f16 and z[k][j] into $f18, we can proceed to perform the multiply and the add.
MIPS assembly for multiply and add:
    mul.d $f16, $f18, $f16     # $f16 <- y[i][k] × z[k][j]
    add.d $f4, $f4, $f16       # $f4 <- $f4 + $f16

Initializing and Storing $f4
How can we initialize $f4?
MIPS assembly to initialize $f4:
    mtc1 $zero, $f4
    mtc1 $zero, $f5
Warning: In your textbook, page A-69, mtcz is specified as follows:
    Move to coprocessor z: mtcz rd, rt
    Move CPU register rt to coprocessor z's register rd.
The code on these slides writes the CPU register first and the coprocessor register second.

Initializing and Storing $f4
How can we initialize $f4?
MIPS assembly to initialize $f4:
    mtc1 $zero, $f4
    mtc1 $zero, $f5
How can we store $f4 in x[i][j]?
MIPS assembly to store $f4 in x[i][j]:
    sll   $t2, $s0, 5        # $t2 <- 32 × i
    addu  $t2, $t2, $s1      # $t2 <- 32 × i + j
    sll   $t2, $t2, 3        # $t2 <- (32 × i + j) × 8
    addu  $t2, $a0, $t2      # $t2 <- Addr(x[i][j])
    swc1  $f4, 0($t2)        # x[i][j] <- $f4 (first word)
    swc1  $f5, 4($t2)        # x[i][j] <- $f5 (second word)

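Two mtc1 and two swc1 instructions are needed because a double occupies two 32-bit words. The C sketch below illustrates this by splitting a double into two words and joining them again. It assumes 8-byte doubles, the function names are made up, and which word lands at index 0 depends on the byte order of the machine.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 8 bytes of a double into two 32-bit words. */
void split_double_words(double d, uint32_t words[2])
{
    memcpy(words, &d, sizeof d);    /* assumes sizeof(double) == 8 */
}

/* Rebuild the double from the two words, in the same order. */
double join_double_words(const uint32_t words[2])
{
    double d;
    memcpy(&d, words, sizeof d);
    return d;
}
```

For 0.0 both words are zero, which is why moving $zero into both $f4 and $f5 initializes the pair to 0.0.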
Parameter Passing Convention
    base of x[ ][ ]    $a0
    base of y[ ][ ]    $a1
    base of z[ ][ ]    $a2
Assumption
    i    $s0
    j    $s1
    k    $s2

MIPS assembly:
          li    $t1, 32            # $t1 <- 32
          li    $s0, 0             # i <- 0
    L1:   li    $s1, 0             # j <- 0
    L2:   mtc1  $zero, $f4         # $f4-$f5 <- 0.0
          mtc1  $zero, $f5
          li    $s2, 0             # k <- 0
    L3:   sll   $t2, $s0, 5        # $t2 <- 32 × i
          addu  $t2, $t2, $s2      # $t2 <- 32 × i + k
          sll   $t2, $t2, 3        # $t2 <- (32 × i + k) × 8
          addu  $t2, $a1, $t2      # $t2 <- Addr(y[i][k])
          l.d   $f16, 0($t2)       # $f16 <- y[i][k]
          sll   $t2, $s2, 5        # $t2 <- 32 × k
          addu  $t2, $t2, $s1      # $t2 <- 32 × k + j
          sll   $t2, $t2, 3        # $t2 <- (32 × k + j) × 8
          addu  $t2, $a2, $t2      # $t2 <- Addr(z[k][j])
          l.d   $f18, 0($t2)       # $f18 <- z[k][j]
          mul.d $f16, $f18, $f16   # $f16 <- y[i][k] × z[k][j]
          add.d $f4, $f4, $f16     # $f4 <- $f4 + $f16
          addiu $s2, $s2, 1        # k <- k+1
          bne   $s2, $t1, L3
          sll   $t2, $s0, 5        # $t2 <- 32 × i
          addu  $t2, $t2, $s1      # $t2 <- 32 × i + j
          sll   $t2, $t2, 3        # $t2 <- (32 × i + j) × 8
          addu  $t2, $a0, $t2      # $t2 <- Addr(x[i][j])
          swc1  $f4, 0($t2)        # x[i][j] <- $f4-$f5
          swc1  $f5, 4($t2)
          addiu $s1, $s1, 1        # j <- j+1
          bne   $s1, $t1, L2
          addiu $s0, $s0, 1        # i <- i+1
          bne   $s0, $t1, L1

The listing above has three marked parts: the block starting at L3 loads y[i][k] in $f16, the block after it loads z[k][j] in $f18, and the block after the inner bne stores $f4 in x[i][j].

Write the code to save/restore the registers that need to be saved in the stack.

Write the code to save/restore the registers that need to be saved in the stack.
MIPS foo stack saving assembly:
    addi  $sp, $sp, -36
    sw    $s0, 32($sp)
    sw    $s1, 28($sp)
    sw    $s2, 24($sp)
    swc1  $f4, 20($sp)
    swc1  $f5, 16($sp)
    swc1  $f16, 12($sp)
    swc1  $f17, 8($sp)
    swc1  $f18, 4($sp)
    swc1  $f19, 0($sp)
MIPS foo stack restoring assembly:
    lwc1  $f19, 0($sp)
    lwc1  $f18, 4($sp)
    lwc1  $f17, 8($sp)
    lwc1  $f16, 12($sp)
    lwc1  $f5, 16($sp)
    lwc1  $f4, 20($sp)
    lw    $s2, 24($sp)
    lw    $s1, 28($sp)
    lw    $s0, 32($sp)
    addi  $sp, $sp, 36

Suppose that we classify the instructions of this program into:
    integer logic and arithmetic
    32-bit load/stores
    conditional branches
    FP additions
    FP multiplications
    move to/from coprocessor
How many instructions of each class are executed?

First we will have to examine the pseudoinstructions.
For instance
    li    $t1, 32
is translated to
    ori   $t1, $zero, 32
And
    l.d   $f16, 0($t2)
is translated to
    lwc1  $f16, 0($t2)
    lwc1  $f17, 4($t2)

How many times is each loop executed?
    outside the loops = 1 time
    L1 = 32 times
    L2 = 32 × 32 times
    L3 = 32 × 32 × 32 times

    L1 = 32 times
    L2 = 32 × 32 times = 1024 times
    L3 = 32 × 32 × 32 times = 32768 times
Complete the table (not reproduced here) with the number of instructions of each type executed in each region of the program.

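One way to fill in such a table is sketched below in C. The per-region counts are not taken from the slides' table: they were read off the full listing by hand, after expanding each li into one ori and each l.d into two lwc1, so treat them as an assumption to verify. The names are illustrative.

```c
/* Instruction classes from the slides. */
enum {
    CLS_INT_ALU, CLS_LOAD_STORE, CLS_BRANCH,
    CLS_FP_ADD, CLS_FP_MUL, CLS_COPROC_MOVE, CLS_COUNT
};

/* Instructions per single execution of each region, counted by hand
   from the listing (li -> 1 ori, l.d -> 2 lwc1). */
static const long per_region[4][CLS_COUNT] = {
    /* outside the loops */ {2, 0, 0, 0, 0, 0},
    /* L1 only           */ {2, 0, 1, 0, 0, 0},
    /* L2 only           */ {6, 2, 1, 0, 0, 2},
    /* L3                */ {9, 4, 1, 1, 1, 0},
};

/* Number of times each region is executed. */
static const long times_executed[4] = {1, 32, 32L * 32, 32L * 32 * 32};

/* Total number of executed instructions of one class. */
long total_for_class(int cls)
{
    long total = 0;
    int region;
    for (region = 0; region < 4; region++)
        total += per_region[region][cls] * times_executed[region];
    return total;
}

/* Total number of executed instructions over all classes. */
long total_instructions(void)
{
    long total = 0;
    int cls;
    for (cls = 0; cls < CLS_COUNT; cls++)
        total += total_for_class(cls);
    return total;
}
```

Under these counts the program executes 301122 integer instructions, 133120 load/stores, 33824 branches, 32768 FP additions, 32768 FP multiplications and 2048 coprocessor moves, 535650 instructions in total.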
Computing CPI
Suppose you know that each of the following types of instructions takes the indicated number of clock cycles to execute. How would you compute the CPI for this machine?
(The table of cycle counts is not reproduced here.)

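The usual approach is a weighted average: CPI = (sum over classes of count × cycles) / (total instruction count). A minimal C sketch follows; the function name is hypothetical, and the cycle counts must come from the table, which is not available here.

```c
/* CPI as a weighted average:
   sum(count[i] * cycles[i]) / sum(count[i]). */
double cpi(const long count[], const int cycles[], int classes)
{
    double total_cycles = 0.0;
    double total_instr  = 0.0;
    int i;

    for (i = 0; i < classes; i++) {
        total_cycles += (double)count[i] * (double)cycles[i];
        total_instr  += (double)count[i];
    }
    /* avoid dividing by zero when no instructions were counted */
    return total_instr > 0.0 ? total_cycles / total_instr : 0.0;
}
```

As a made-up example, 3 instructions of 1 cycle and 1 instruction of 5 cycles give (3 + 5) / 4 = 2.0 cycles per instruction.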
Computing Execution Time
If the machine that we are using has a processor that operates at 1.3 GHz, how long does it take to execute foo( )?

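Execution time follows from the instruction count, the CPI and the clock rate: time = instructions × CPI / clock rate. The sketch below uses a hypothetical function name; the inputs in the usage note are made-up round numbers, not the answer for foo( ).

```c
/* Execution time in seconds:
   instruction count * cycles per instruction / clock rate in Hz. */
double execution_time_seconds(double instructions, double cpi, double clock_hz)
{
    return instructions * cpi / clock_hz;
}
```

For example, 1.3 × 10^9 instructions at a CPI of 2.0 on a 1.3 GHz clock would take 2.0 seconds.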
In preparation for the midterm...
Write a code segment that reads a byte B from the address 0x8400 0040 and:
    a) writes 0x0000 00FF in address 0x8400 0044 if bit 5 of B is 1;
    b) writes 0xFFFF FF00 in address 0x8400 0044 otherwise.

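The decision logic of this exercise can be sketched in C before writing the assembly. The sketch leaves out the memory accesses at the two fixed addresses and assumes that bit 0 is the least significant bit, so bit 5 corresponds to the mask 0x20. The function name is made up.

```c
#include <stdint.h>

/* Word to be written, given the byte B that was read.
   Assumes bit 0 is the least significant bit, so bit 5 is mask 0x20. */
uint32_t word_for_byte(uint8_t b)
{
    return (b & 0x20u) ? 0x000000FFu : 0xFFFFFF00u;
}
```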
In preparation for the midterm...
Write a minimum instruction sequence that inverts all the bits in the exponent field of the number stored in register $f2.

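For a single precision value, the exponent field occupies bits 30 to 23, so inverting it amounts to an exclusive-or with the mask 0x7F800000. The C sketch below shows only that bit operation on the 32-bit pattern; moving the value between $f2 and an integer register is left to the assembly solution, and the function name is an assumption.

```c
#include <stdint.h>

/* Invert the 8 exponent bits (bits 30..23) of a single-precision pattern,
   leaving the sign and fraction bits unchanged. */
uint32_t invert_exponent_bits(uint32_t bits)
{
    return bits ^ 0x7F800000u;
}
```

Applied to 0x3F800000 (exponent field 01111111) it yields 0x40000000 (exponent field 10000000), and applying it twice restores the original pattern.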