Computer Arithmetic Adder Performance Multiply Shift floating point

Computer Arithmetic: Adder Performance, Multiply, Shift & Floating Point (App. C.5, C.6, 4th ed.)
[Figure: ALU block with 32-bit inputs A and B, 4-bit control input m, 32-bit output S, and outputs c and ovf]
11/1/2020

1-bit adder Review (Appendix B.5, B.6)
[Figure: 1-bit full adder with inputs a, b, Cin and outputs Sum, Carry out]
• 1 unit of delay from Cin to Sum
• 2 units of delay from A/B to Sum
• Carry out: 2 gate delays
• Sum = a!bc! + ab!c! + a!b!c + abc = a XOR b XOR c (x! is the complement of x)
• Carryout = a!bc + ab!c + abc! + abc
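As a quick check of the two sum-of-products forms on this slide, the sketch below evaluates them for all eight input combinations and compares them with XOR and with ordinary addition. The function name `full_adder` is illustrative, not taken from the slides.

```python
from itertools import product

def full_adder(a, b, c):
    """Return (sum, carry_out) from the slide's sum-of-products equations."""
    na, nb, nc = 1 - a, 1 - b, 1 - c   # complements, written a!, b!, c! on the slide
    s = (na & b & nc) | (a & nb & nc) | (na & nb & c) | (a & b & c)
    cout = (na & b & c) | (a & nb & c) | (a & b & nc) | (a & b & c)
    return s, cout

# Exhaustive check over the 8 rows of the truth table.
for a, b, c in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, c)
    assert s == a ^ b ^ c               # Sum is a three-input XOR
    assert 2 * cout + s == a + b + c    # (cout, s) is the 2-bit binary sum
```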

1-bit ALU: AND, OR, a + b!
[Figure: (a) 1-bit ALU with inputs a, b, Less and CarryIn, control signals Binvert and Operation (mux inputs 0-3), outputs Result and CarryOut; (b) the most significant bit, which adds Set and Overflow outputs from overflow detection]
ALU delays:
• Result = 1 gate delay
• From a to Result = 2
• From b to Result = 2 (ignore b invert)

Final 32-bit ALU, including zero detect
[Figure: 32 chained 1-bit ALUs (ALU0 to ALU31) with inputs a0-a31 and b0-b31, shared Bnegate and Operation controls, CarryOut of each stage feeding CarryIn of the next, Less inputs, outputs Result0-Result31, and Set, Zero and Overflow outputs]

Behavioral Representation: Verilog, RTL (FYI)
    module ALU(A, B, m, S, c, ovf);
      input  [0:31] A, B;
      input  [0:3]  m;
      output [0:31] S;
      output c, ovf;
      reg [0:31] S;
      reg c, ovf;
      always @(A, B, m) begin
        case (m)
          0: S = A + B;
          . . .
    endmodule
• Code is written, simulated & verified
• Then translated into hardware (mapped)
• This is how complex digital design is done

Overflow? 4-bit example
Decimal  Binary     Decimal  2's Complement
 0       0000        0       0000
 1       0001       -1       1111
 2       0010       -2       1110
 3       0011       -3       1101
 4       0100       -4       1100
 5       0101       -5       1011
 6       0110       -6       1010
 7       0111       -7       1001
                    -8       1000
• Examples: 7 + 3 = 10 but ...
• -4 - 5 = -9 but ...
    0111 (7) + 0011 (3) = 1010 (-6)
    1100 (-4) + 1011 (-5) = 0111 (7), with a carry out of the MSB

Overflow Detection
• Overflow: arithmetic result too large (or too small) to represent properly
  – Example: -8 ≤ 4-bit binary number ≤ 7
• When adding operands with different signs, overflow cannot occur!
• Overflow occurs when adding:
  – 2 positive numbers and the sum is negative
  – 2 negative numbers and the sum is positive
• On your own: Prove you can detect overflow by:
  – Carry into MSB ≠ Carry out of MSB
    0111 (7) + 0011 (3) = 1010 (-6)
    1100 (-4) + 1011 (-5) = 0111 (7)

Overflow Detection Logic
• Carry into MSB ≠ Carry out of MSB
  – For an N-bit ALU: Overflow = CarryIn[N-1] XOR CarryOut[N-1]
[Figure: four chained 1-bit ALUs (inputs A0-A3, B0-B3, outputs Result0-Result3); CarryIn3 and CarryOut3 feed an XOR gate that produces Overflow]
X  Y  X XOR Y
0  0  0
0  1  1
1  0  1
1  1  0
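The rule Overflow = CarryIn[N-1] XOR CarryOut[N-1] can be checked exhaustively for the 4-bit case. The sketch below is one way to do that in software; `add_twos_complement` is an illustrative name, and the carry into the MSB is obtained by adding only the low N-1 bits.

```python
def add_twos_complement(a, b, n=4):
    """Add two n-bit two's-complement values; return (signed result, overflow flag)."""
    mask = (1 << n) - 1
    a, b = a & mask, b & mask                 # keep only the n-bit patterns
    low = (1 << (n - 1)) - 1                  # mask for the bits below the MSB
    carry_in_msb = ((a & low) + (b & low)) >> (n - 1)
    carry_out_msb = (a + b) >> n
    raw = (a + b) & mask
    value = raw - (1 << n) if raw >> (n - 1) else raw   # reinterpret as signed
    return value, carry_in_msb ^ carry_out_msb

# The XOR rule flags exactly the sums that fall outside -8..7.
for x in range(-8, 8):
    for y in range(-8, 8):
        _, overflow = add_twos_complement(x, y)
        assert bool(overflow) == (not -8 <= x + y <= 7)
```

With the slide's two examples, `add_twos_complement(7, 3)` gives -6 with the flag set, and `add_twos_complement(-4, -5)` gives 7 with the flag set.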

MIPS ALU requirements
• Add, AddU, SubU, AddIU
  – => 2's complement adder/subtractor with overflow detection
• And, Or, AndI, OrI, Xori, Nor
  – => Logical AND, logical OR, XOR, NOR
• SLTI, SLTIU (set less than)
  – => 2's complement adder with inverter, check sign bit of result
• ALU must support these ops

Ripple Adder Performance?
• Critical path of an n-bit ripple-carry adder is n*CP
[Figure: four chained 1-bit ALUs (inputs A0-A3, B0-B3, outputs Result0-Result3); CarryOut of each stage feeds CarryIn of the next]
• 1 unit of delay from Cin to sum
• 2 units of delay from A/B to sum
• Very slow: must improve
• Assume t = carry delay / bit
  – 32-bit ALU needs 32 * t units of delay
  – 64-bit ALU needs 64 * t units of delay
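A bit-serial model makes the linear delay visible: each stage must wait for the carry of the stage before it. The sketch below is a software illustration only; `ripple_add` and `ripple_delay` are hypothetical names, and the delay function simply restates the slide's n * t estimate.

```python
def ripple_add(a, b, n, cin=0):
    """Add two n-bit unsigned values one bit at a time; return (sum, carry_out)."""
    carry, total = cin, 0
    for i in range(n):                       # stage i waits for the carry of stage i-1
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai | bi))
    return total, carry

def ripple_delay(n, t=1):
    """Carry-chain delay estimate from the slide: t units per bit."""
    return n * t

print(ripple_add(0b0111, 0b0011, 4))   # (10, 0), i.e. 1010 with no carry out
print(ripple_delay(32), ripple_delay(64))
```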

Fast Add - Carry Select - review
• 4-bit Carry Select Adder
• Uses two 4-bit ripple adders
• One adder assumes Cin = 0
• 2nd adder assumes Cin = 1
• Cin selects Sum & Cout
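The selection idea can be sketched in a few lines: both candidate results are computed without knowing Cin, and Cin only drives the final choice. The helper names below are illustrative; in hardware the two ripple adders run in parallel and a multiplexer does the selection.

```python
def ripple_add(a, b, n, cin):
    """n-bit ripple-carry add; return (sum, carry_out)."""
    carry, total = cin, 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai | bi))
    return total, carry

def carry_select_add(a, b, cin, n=4):
    """Compute both candidate results up front, then let cin select one."""
    sum0, cout0 = ripple_add(a, b, n, 0)    # adder that assumes Cin = 0
    sum1, cout1 = ripple_add(a, b, n, 1)    # adder that assumes Cin = 1
    return (sum1, cout1) if cin else (sum0, cout0)

# Exhaustive check against ordinary addition for 4-bit operands.
for a in range(16):
    for b in range(16):
        for cin in (0, 1):
            s, cout = carry_select_add(a, b, cin)
            assert s + 16 * cout == a + b + cin
```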

Fast Addition: Carry Lookahead
• Carry inputs can be precomputed by logic
    p0 = a0 + b0, g0 = a0 b0                                            1 unit delay each p, g
    c1 = g0 + c0 p0 = a0 b0 + c0 (a0 + b0)
    p1 = a1 + b1, g1 = a1 b1                                            1 unit delay
    c2 = g1 + p1 c1 = a1 b1 + c1 (a1 + b1) = g1 + p1 g0 + p1 p0 c0      3 units of delay
    c3 = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0                            3 units of delay
    c4 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 + p3 p2 p1 p0 c0           3 units of delay
    c4 = func(a3, b3, a2, b2, a1, b1, a0, b0, c0)
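The flattened carry equations can be transcribed directly and compared with ordinary addition. The sketch below does so for every 4-bit input; `cla4_carries` and `cla4_add` are illustrative names, and each carry is written as a two-level AND-OR expression of the g, p signals and c0, as on the slide.

```python
def cla4_carries(a, b, c0):
    """Return [c0, c1, c2, c3, c4] from the lookahead equations."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # generate: a AND b
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]    # propagate: a OR b
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c0, c1, c2, c3, c4]

def cla4_add(a, b, c0=0):
    """4-bit add whose sum bits use the precomputed carries; return (sum, c4)."""
    c = cla4_carries(a, b, c0)
    s = 0
    for i in range(4):
        s |= (((a >> i) ^ (b >> i) ^ c[i]) & 1) << i
    return s, c[4]

for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            s, c4 = cla4_add(a, b, c0)
            assert s + 16 * c4 == a + b + c0
```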

Fast Addition: Carry Look Ahead – 4 bits
[Figure: 4-bit CLA; each bit cell takes ai, bi and produces gi, pi and a sum bit; C0 = Cin; cells labeled "kill", "propagate", "generate"]
• g = a and b, p = a or b: 1 delay
• c1 = g0 + c0 p0
• c2 = g1 + g0 p1 + c0 p0 p1
• c3 = g2 + g1 p2 + g0 p1 p2 + c0 p0 p1 p2
• C4 = ...
• G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0
• P0 = p3 p2 p1 p0
• 3 units of delay for c1, c2, c3, (c4)
• 4 units of delay for S1, S2, S3, S4
• 3 units of delay for G0

Carry Lookahead – 2nd level – 16 bits
• Add a 2nd level of abstraction for more practical 4-bit units
• Each Pi, Gi handles 4 bits at a time (0-3, 4-7, 8-11, ...)
    P0 = p3 p2 p1 p0;      G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0
    P1 = p7 p6 p5 p4;      G1 = g7 + p7 g6 + p7 p6 g5 + p7 p6 p5 g4
    P2 = p11 p10 p9 p8;    G2 = g11 + p11 g10 + p11 p10 g9 + p11 p10 p9 g8
    P3 = p15 p14 p13 p12;  G3 = …….
• 3 units of delay for G0, G1, G2, G3
• 2 units of delay for P0, P1, P2, P3

Fast Addition: Cascaded Carry Look-ahead (16-bit)
[Figure: four 4-bit adders, each supplying G and P to a shared CLA unit that takes C0 and returns c4, c8, c12 to the adders]
• c4 = G0 + C0 P0 (c4 has 4 units of delay)
• c8 = G1 + G0 P1 + C0 P0 P1
• c12 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
• c16 = ...
• 5 units of delay for c8, c12, c16
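The second-level equations can be exercised the same way as the bit-level ones. In the sketch below the C0 term of each boundary carry is written with every lower group propagate (P0 included), which is what makes the carries match real addition. The expansion used for c16 is an assumption that follows the same pattern, since the slide elides it, and the function names are illustrative.

```python
def group_pg(a, b):
    """Group propagate P and generate G for one 4-bit slice."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    P = p[3] & p[2] & p[1] & p[0]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G

def block_carries(a, b, c0=0):
    """Return [c0, c4, c8, c12, c16] for 16-bit operands."""
    pg = [group_pg((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF) for k in range(4)]
    P = [x[0] for x in pg]
    G = [x[1] for x in pg]
    c4 = G[0] | (P[0] & c0)
    c8 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c12 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    # Assumed expansion for c16 (elided on the slide), same pattern as above.
    c16 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
           | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))
    return [c0, c4, c8, c12, c16]

print(block_carries(0x00FF, 0x0001))   # [0, 1, 1, 0, 0]
```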

Carry Lookahead Homework
You are required to calculate the performance of a 16-bit carry lookahead adder similar to the one discussed in class. The design has 2 options:
1. Ripple carry is used inside each 4-bit cell
2. Carry lookahead is used inside each 4-bit cell
• Both cases use carry lookahead to predict the 4-bit boundary carries [c4, c8, c12]
• Draw a table showing the delay of each adder bit, i.e. Sum0 - Sum15, as well as the carry at each stage of the design, for the 2 designs

8-bit carry lookahead adder (4-bit block is also CLA)
[Figure: two 4-bit CLA blocks (inputs a0-a7, b0-b7, outputs S0-S7) supplying G0, P0 and G1, P1 to a 2nd-level carry lookahead unit, annotated with gate delays]
• c4 = G0 + c0 P0 (4 units of delay)
• c5 = g4 + c4 p4
• c6 = …, c7 = …

8-bit CLA – uses ripple carry inside 4-bit block
[Figure: two 4-bit ripple-carry blocks (inputs a0-a7, b0-b7) with c4 supplied by the 2nd-level carry lookahead, annotated with delays]
Bit  Carry-in delay                   Result delay
0    0                                2
1    2                                3
2    4                                5
3    6                                7
4    4 (c4, 2nd-level lookahead)      5
5    6                                7
6    8                                9
7    10                               11
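The delays annotated on this slide (Result 0-3 at 2, 3, 5, 7 and Result 4-7 at 5, 7, 9, 11) follow from a simple timing model, sketched below under two assumptions taken from the earlier adder slides: a sum bit is ready 2 units after A/B or 1 unit after its carry-in, whichever is later, and a carry is ready 2 units after A/B or 2 units after the previous carry. The names are illustrative, and the value 4 for c4 is the lookahead delay quoted on the cascaded CLA slide.

```python
def ripple_block_delays(carry_in_delay, width=4):
    """Return (carry-in delays, result delays) for one ripple-carry block."""
    carries, results = [carry_in_delay], []
    for _ in range(width):
        c = carries[-1]
        results.append(max(2, c + 1))    # sum: 2 from A/B, or 1 after the carry-in
        carries.append(max(2, c + 2))    # carry: 2 from A/B, or 2 after the carry-in
    return carries[:width], results

LOOKAHEAD_C4_DELAY = 4                   # c4 comes from the 2nd-level lookahead

low = ripple_block_delays(0)                      # bits 0-3, c0 available at time 0
high = ripple_block_delays(LOOKAHEAD_C4_DELAY)    # bits 4-7
print(low)    # ([0, 2, 4, 6], [2, 3, 5, 7])
print(high)   # ([4, 6, 8, 10], [5, 7, 9, 11])
```

The same function can be reused for the homework's option 1 by feeding it the lookahead delays of c8 and c12.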

Additional MIPS ALU requirements
• Mult, MultU, DivU => Need 32-bit multiply and divide, signed and unsigned
• Sll, Sra => Need left shift, right shift arithmetic by 0 to 31 bits

Multiply, Divide & Shift

MIPS arithmetic instructions
Instruction        | Example           | Meaning                        | Comments
add                | add $1, $2, $3    | $1 = $2 + $3                   | 3 operands; exception possible
subtract           | sub $1, $2, $3    | $1 = $2 – $3                   | 3 operands; exception possible
add immediate      | addi $1, $2, 100  | $1 = $2 + 100                  | + constant; exception possible
add unsigned       | addu $1, $2, $3   | $1 = $2 + $3                   | 3 operands; no exceptions
subtract unsigned  | subu $1, $2, $3   | $1 = $2 – $3                   | 3 operands; no exceptions
add imm. unsign.   | addiu $1, $2, 100 | $1 = $2 + 100                  | + constant; no exceptions
multiply           | mult $2, $3       | Hi, Lo = $2 x $3               | 64-bit signed product
multiply unsigned  | multu $2, $3      | Hi, Lo = $2 x $3               | 64-bit unsigned product
divide             | div $2, $3        | Lo = $2 ÷ $3, Hi = $2 mod $3   | Lo = quotient, Hi = remainder
divide unsigned    | divu $2, $3       | Lo = $2 ÷ $3, Hi = $2 mod $3   | Unsigned quotient & remainder
Move from Hi       | mfhi $1           | $1 = Hi                        | Used to get copy of Hi
Move from Lo       | mflo $1           | $1 = Lo                        | Used to get copy of Lo

MULTIPLY (unsigned)
• Paper and pencil example:
    Multiplicand      1000        A
    Multiplier      x 1001        B
                      1000        a3b0 a2b0 a1b0 a0b0
                     0000         a3b1 a2b1 a1b1 a0b1
                    0000          a3b2 a2b2 a1b2 a0b2
                   1000           a3b3 a2b3 a1b3 a0b3
    Product       01001000
• m bits x n bits = m+n bit product
• Binary makes it easy:
  – 0 => place 0 (0 x multiplicand)
  – 1 => place a copy (1 x multiplicand)
• 2 architectures
  – Fast Array MPY & Slow Shift & Add

Signed Multiply – 2's complement
    Multiplicand      1000      A = -8
    Multiplier        1001      B = -7
                     10000      1  ~a3b0  a2b0  a1b0  a0b0
                     1000       ~a3b1  a2b1  a1b1  a0b1
                    1000        ~a3b2  a2b2  a1b2  a0b2
                 11111          1  a3b3  ~a2b3  ~a1b3  ~a0b3
    Product       00111000 = +56
• ~ means the partial-product bit is complemented
• An extra 1 is added to compensate
• Complementing & addition of 1 are tricks that avoid a complete sign extension of every row
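The complement-and-add-1 trick can be checked in software. The sketch below reads "~aibj" as the complement of the partial-product bit (ai AND bj), complements exactly the bits that involve one sign bit, and adds the two extra 1s at bit positions n and 2n-1, which is where they sit in the 4-bit example above. That reading and the function name `signed_multiply` are assumptions made for illustration.

```python
def signed_multiply(a, b, n=4):
    """Multiply two n-bit two's-complement values using complemented partial products."""
    ua, ub = a & ((1 << n) - 1), b & ((1 << n) - 1)
    total = (1 << n) + (1 << (2 * n - 1))         # the two extra 1s
    for j in range(n):                            # row j uses multiplier bit bj
        for i in range(n):                        # column i uses multiplicand bit ai
            bit = ((ua >> i) & 1) & ((ub >> j) & 1)
            if (i == n - 1) != (j == n - 1):      # exactly one sign bit involved
                bit ^= 1                          # complement the partial-product bit
            total += bit << (i + j)
    total &= (1 << (2 * n)) - 1                   # keep 2n bits
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total

print(signed_multiply(-8, -7))   # 56, the slide's example

for x in range(-8, 8):
    for y in range(-8, 8):
        assert signed_multiply(x, y) == x * y
```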

Fast unsigned Multiply == Array Multiplier
[Figure: array of cells combining multiplicand bits A3-A0 (Aj) with multiplier bits Bi, with 0 inputs along the edge, producing product bits P7-P0]
• Cell delays?
• Can be adapted to accommodate signed MPY
• Q: How much hardware for a 32-bit multiplier? Critical path?

Multiplication, using shift & Add
[Figure: multiplier operation; shifted copies of multiplicand bits A3-A0 gated by multiplier bits B0-B3, producing product bits P7-P0]
• At each stage shift the multiplicand left (x 2)
• Multiplier bit Bi determines whether to add in the shifted multiplicand
• Accumulate the 2n-bit partial product at each stage
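The three bullets above translate almost line for line into a loop. This is a minimal software sketch with an illustrative function name; the hardware version would hold the multiplicand in a 2n-bit shift register.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Unsigned n-bit multiply: add the shifted multiplicand for every 1 bit of the multiplier."""
    product = 0
    for i in range(n):
        if (multiplier >> i) & 1:     # multiplier bit Bi decides whether to add
            product += multiplicand
        multiplicand <<= 1            # shift multiplicand left (x 2) each stage
    return product

print(bin(shift_add_multiply(0b1000, 0b1001)))   # 0b1001000
```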

Multiplication, using shift & Add
• Long-multiplication approach
    multiplicand      1000
    multiplier      × 1001
                      1000
                     0000
                    0000
                   1000
    product        1001000
• Length of product is the sum of operand lengths

Multiplication Hardware using shift & Add
[Figure: multiplier datapath; the product register is initially 0]

Optimized Multiplier using shift & Add
[Figure: optimized datapath with a 32-bit ALU and the multiplicand]
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
  – ok, if frequency of multiplications is low

Multiply Algorithm
Start
1. Test Product0
   – Product0 = 1: 1a. Add multiplicand to the left half of the product & place the result in the left half of the Product register
   – Product0 = 0: no add
2. Shift the Product register right 1 bit
3. 32nd repetition?
   – No (< 32 repetitions): go back to step 1
   – Yes (32 repetitions): Done
Example (Multiplicand = 0010), Product register contents: 0000 0011, 0010 0011, ..., 0001 1000, 0000 1100, 0000 0110
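The flowchart can be sketched as a loop over a single combined Product register whose right half starts out holding the multiplier. The version below uses 4-bit operands so that it reproduces the slide's 0010 x 0011 example; the function name and the trace list are illustrative, and Python integers absorb the carry out of the left-half add that real hardware would need an extra bit for.

```python
def multiply_trace(multiplicand, multiplier, n=4):
    """Return (product, list of Product register values) for n-bit unsigned operands."""
    product = multiplier                      # left half 0, right half = multiplier
    trace = [product]
    for _ in range(n):                        # n repetitions (32 on the slide)
        if product & 1:                       # 1. test Product0
            product += multiplicand << n      # 1a. add to the left half
            trace.append(product)
        product >>= 1                         # 2. shift Product register right
        trace.append(product)
    return product, trace

result, steps = multiply_trace(0b0010, 0b0011)
print(result)                                  # 6
print([format(s, "08b") for s in steps])
```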

MIPS logical instructions
Instruction          | Example          | Meaning            | Comment
and                  | and $1, $2, $3   | $1 = $2 & $3       | 3 reg. operands; Logical AND
or                   | or $1, $2, $3    | $1 = $2 | $3       | 3 reg. operands; Logical OR
xor                  | xor $1, $2, $3   | $1 = $2 ⊕ $3       | 3 reg. operands; Logical XOR
nor                  | nor $1, $2, $3   | $1 = ~($2 | $3)    | 3 reg. operands; Logical NOR
and immediate        | andi $1, $2, 10  | $1 = $2 & 10       | Logical AND reg, constant
or immediate         | ori $1, $2, 10   | $1 = $2 | 10       | Logical OR reg, constant
xor immediate        | xori $1, $2, 10  | $1 = $2 ⊕ 10       | Logical XOR reg, constant
shift left logical   | sll $1, $2, 10   | $1 = $2 << 10      | Shift left by constant
shift right logical  | srl $1, $2, 10   | $1 = $2 >> 10      | Shift right by constant
shift right arithm.  | sra $1, $2, 10   | $1 = $2 >> 10      | Shift right (sign extend)
shift left logical   | sllv $1, $2, $3  | $1 = $2 << $3      | Shift left by variable
shift right logical  | srlv $1, $2, $3  | $1 = $2 >> $3      | Shift right by variable
shift right arithm.  | srav $1, $2, $3  | $1 = $2 >> $3      | Shift right arith. by variable

How shift instructions are implemented
Two kinds:
• logical: value shifted in is always "0"
• arithmetic: on right shifts, sign extend
[Figure: msb/lsb diagrams showing "0" shifted in; bit patterns 1100 1011 (shift right logical by 2) and 1011 1110 (shift right arithmetic by 2)]
• An instruction can request 0 to 31 bits to be shifted!
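The difference between the two right shifts shows up only when the most significant bit is 1. The sketch below models 32-bit registers as Python integers, assuming a 5-bit shift amount (0 to 31); the function names mirror the MIPS mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF

def sll(x, shamt):
    """Shift left logical: zeros enter at the lsb."""
    return (x << (shamt & 31)) & MASK32

def srl(x, shamt):
    """Shift right logical: zeros enter at the msb."""
    return (x & MASK32) >> (shamt & 31)

def sra(x, shamt):
    """Shift right arithmetic: copies of the sign bit enter at the msb."""
    x &= MASK32
    if x >> 31:                  # negative in two's complement
        x -= 1 << 32             # reinterpret as a signed Python int
    return (x >> (shamt & 31)) & MASK32

print(hex(srl(0x80000000, 2)))   # 0x20000000
print(hex(sra(0x80000000, 2)))   # 0xe0000000
```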

ARM: Barrel Shifter
[Figure: Operand 1 goes straight to the ALU; Operand 2 passes through the Barrel Shifter first; the ALU produces Result]
• Shift value can be either:
  – a 5-bit unsigned integer
  – specified in the bottom byte of another register
• Example: ADD r0, r1, r2, LSL #7
• Semantics: r2 is shifted left by 7 & then added to r1; the result is written to r0

Barrel Shifter, used in ICs
• Shift right using one transistor per switch
[Figure: switch array with control lines SR0-SR3, inputs A0-A6 and outputs D0-D3]

Barrel Shifter, used in ICs
• Shift left & right
[Figure: switch array with control lines SR0-SR2 and SL1-SL3, inputs A0-A5 and outputs D0-D2]

Summary: Multiply & Shift
• Multiply: successive refinement to see final design
  – 32-bit Adder, 64-bit shift register, 32-bit Multiplicand Register
• Fast multiply: array multiplier
• Shifter: successive refinement from a 1-bit-at-a-time shift register to a barrel shifter

Floating Point Arithmetic
• How to represent
  – numbers with fractions, e.g., 3.1416
  – very small numbers, e.g., .00001
  – very large numbers, e.g., 3.15576 x 10^9
• Fixed point
• Floating point: a number system with a floating decimal point
• Normalized numbers: no leading 0's, single digit before the decimal point (examples on slide: 1.0 x …, 3.1557 x …, 35, 0.03)

Floating Point Notation – IEEE 754 FP
[Figure: 6.02 x 10^23 and 1.673 x 10^-24 annotated with mantissa (sign, magnitude), decimal point, radix (base) and exponent (sign, magnitude)]
• IEEE F.P.: ± 1.M x 2^(e - 127)
• Issues:
  – Arithmetic (+, -, *, /)
  – Representation, Normal form
  – Range and Precision, Single, Double
  – Rounding
  – Exceptions (e.g., divide by zero, overflow, underflow)

Floating-Point Arithmetic
Floating point numbers in IEEE 754 standard, single precision:
    S (sign, 1 bit) | E (exponent, 8 bits) | M (mantissa, 23 bits)
• Exponent: excess 127 binary integer; actual exponent is e = E - 127
• Mantissa: sign + magnitude, normalized binary significand w/ hidden integer bit: 1.M
• N = (-1)^S x 2^(E-127) x (1.M), 0 < E < 255
• 0 = 0 00000000 0...0
• -1.5 = 1 01111111 10...0
• Numbers that can be represented are in the range 2^-126 (1.0) to 2^127 (2 - 2^-23)
• Double Precision IEEE 754 [64 bits]: Exponent = 11 bits, Bias = 1023, Mantissa = 52 bits, Sign = 1 bit
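The formula N = (-1)^S x 2^(E-127) x (1.M) can be applied directly to a 32-bit pattern. The sketch below handles only normalized numbers (0 < E < 255), as the slide does; `decode_single` is an illustrative name, and Python's `struct` module is used only to cross-check the result against the machine's own float conversion.

```python
import struct

def decode_single(bits):
    """Decode a normalized IEEE 754 single-precision bit pattern by the slide's formula."""
    s = bits >> 31                  # 1 sign bit
    e = (bits >> 23) & 0xFF         # 8-bit excess-127 exponent
    m = bits & 0x7FFFFF             # 23-bit mantissa (fraction after the hidden 1)
    if not 0 < e < 255:
        raise ValueError("only normalized numbers are handled in this sketch")
    return (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)

def hardware_value(bits):
    """Reference value: let struct reinterpret the same 32 bits as a float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(decode_single(0xBFC00000))                            # -1.5
print(decode_single(0x00800000) == 2.0 ** -126)             # smallest normalized value
print(decode_single(0x7F7FFFFF) == 2.0 ** 127 * (2 - 2.0 ** -23))
```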

Exponent Bias used to simplify comparisons
• If we use 2's complement, not good for sorting and comparison
• With a biased exponent: 0000 = most negative exponent, 1111 = most positive exponent

Floating Point – Example review
• Represents (-1)^S x (1.M) x 2^(E - bias)
  – bias = 127 for a 32-bit word
  – S = 1: negative; S = 0: positive or zero
• Example (from fraction to floating point representation): -0.75

Floating-Point Example - review
• Represent –0.75
  – –0.75 = (–1)^1 × 1.1 (binary) × 2^–1
  – S = 1
  – Fraction = 1000…00 (binary)
  – Exponent = –1 + Bias
    • Single: –1 + 127 = 126 = 01111110 (binary)
    • Double: –1 + 1023 = 1022 = 01111111110 (binary)
• Single: 1011111101000…00
• Double: 1011111111101000…00
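The field values for –0.75 can be confirmed by letting the machine encode the number and then splitting the bit pattern. This sketch relies on Python's `struct` module using IEEE 754 for the `f` and `d` formats; the helper name `fields` is illustrative.

```python
import struct

def fields(bits, exp_bits, frac_bits):
    """Split a bit pattern into (sign, biased exponent, fraction)."""
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

single = struct.unpack(">I", struct.pack(">f", -0.75))[0]
double = struct.unpack(">Q", struct.pack(">d", -0.75))[0]

print(format(single, "032b"))      # 1 01111110 1000...0 without the spaces
print(fields(single, 8, 23))       # (1, 126, 4194304): fraction is 1000...0
print(fields(double, 11, 52))      # sign 1, exponent 1022, fraction 1000...0
```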

Addition – Multiply Algorithm issues
For addition (or subtraction):
(1) compute Ye - Xe (getting ready to align binary point)
(2) right shift Xm that many positions to form Xm x 2^(Xe-Ye)
(3) compute Xm x 2^(Xe-Ye) + Ym
(4) for multiply, the doubly biased exponent must be corrected:
    Xe = 7, Ye = -3, Excess 8
    Xe = 7 + 8 = 15 = 1111
    Ye = -3 + 8 = 5 = 0101
    Sum = 4 + 8 + 8 = 20 = 10100
    An extra subtraction step of the bias amount is needed
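The excess-8 example works out as follows when written as code. Adding two biased exponents counts the bias twice, so one bias must be subtracted to get a correctly biased result. `BIAS`, `biased` and `multiply_exponents` are illustrative names.

```python
BIAS = 8                               # excess-8 exponent encoding from the slide

def biased(e):
    """Encode a true exponent in excess-8 form."""
    return e + BIAS

def multiply_exponents(xe_biased, ye_biased):
    """Exponent of a product: add the biased exponents, then remove one bias."""
    return xe_biased + ye_biased - BIAS

xe, ye = biased(7), biased(-3)         # 15 (1111) and 5 (0101)
print(xe + ye)                         # 20 (10100): doubly biased, 4 + 8 + 8
print(multiply_exponents(xe, ye))      # 12: correctly biased, 4 + 8
```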

Floating Point Addition
• Step 1: align, round
• Step 2: add
• Step 3: normalize, check overflow or underflow
• Step 4: round
• Example:
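One way to make the steps concrete is to run them on single-precision bit fields. The sketch below is deliberately simplified and rests on stated assumptions: both inputs are nonzero normalized values exactly representable in single precision, rounding is plain truncation rather than IEEE round-to-nearest, and denormals, infinities and NaNs are not handled. All names are illustrative.

```python
import struct

def fields(x):
    """Split a float (as single precision) into (sign, biased exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp_add(x, y):
    sx, ex, mx = fields(x)
    sy, ey, my = fields(y)
    ax, ay = (1 << 23) | mx, (1 << 23) | my        # restore the hidden 1
    if ex < ey:                                    # make x the larger-exponent operand
        sx, ex, ax, sy, ey, ay = sy, ey, ay, sx, ex, ax
    ay >>= ex - ey                                 # align: shift the smaller number right
    total = (-ax if sx else ax) + (-ay if sy else ay)   # add signed significands
    if total == 0:
        return 0.0
    sign, mag, e = int(total < 0), abs(total), ex
    while mag >= 1 << 24:                          # normalize: significand too large
        mag >>= 1
        e += 1
    while mag < 1 << 23:                           # normalize: leading zeros
        mag <<= 1
        e -= 1
    if not 0 < e < 255:                            # check overflow or underflow
        raise OverflowError("exponent out of range")
    bits = (sign << 31) | (e << 23) | (mag & 0x7FFFFF)   # truncation stands in for rounding
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(fp_add(0.5, -0.4375))    # 0.0625
```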

Floating Point Multiplication
• Step 1: add exponents, subtract bias, multiply mantissas
• Step 2: normalize and check over/underflow
• Step 3: round
• Step 4: check sign
• Example:
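The four steps can be followed on single-precision bit fields. The sketch below is a simplified illustration under stated assumptions: inputs are nonzero normalized values exactly representable in single precision, rounding is truncation rather than IEEE round-to-nearest, and special values are not handled. The function names are illustrative.

```python
import struct

def fields(x):
    """Split a float (as single precision) into (sign, biased exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp_mul(x, y):
    sx, ex, mx = fields(x)
    sy, ey, my = fields(y)
    e = ex + ey - 127                              # step 1: add exponents, subtract bias
    sig = ((1 << 23) | mx) * ((1 << 23) | my)      # step 1: multiply 24-bit significands
    if sig >= 1 << 47:                             # step 2: product in [2, 4), renormalize
        sig >>= 1
        e += 1
    if not 0 < e < 255:                            # step 2: check over/underflow
        raise OverflowError("exponent out of range")
    frac = (sig >> 23) & 0x7FFFFF                  # step 3: truncation stands in for rounding
    sign = sx ^ sy                                 # step 4: sign of the product
    bits = (sign << 31) | (e << 23) | frac
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(fp_mul(0.5, -0.4375))    # -0.21875
print(fp_mul(1.5, -0.75))      # -1.125
```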

FP Adder Hardware
• More complex than an integer adder
• Doing it in one clock cycle takes too long
  – Much longer than integer operations
  – Slower clock would penalize all instructions
• FP adder usually takes several cycles
  – pipelined

FP Adder Hardware
[Figure: FP adder datapath marked with Steps 1-4; annotations: exponents compared, smaller number shifted right, result iterated until normalized]

Floating Point: Overflow & Underflow
• Overflow: exponent too large to be represented
• Underflow: negative exponent too small to fit in exponent field

Summary of Floating Point Arithmetic
• IEEE floating point standard, 32 bit and 64 bit
• Converting decimal numbers to floating point and vice versa
• Overflow and underflow
• Floating point add and multiply