REPRESENTATION OF REAL NUMBER Presented by Pawan yadav

REPRESENTATION OF REAL NUMBER Presented by: Pawan Yadav, Puneet Vinayak

CONTENTS: Floating Point Numbers; Decimal Binary Conversion; Floating Point Representation; Mantissa; Exponent; Normalization; IEEE Floating Point Representation; Floating Point Arithmetic; Errors in Floating Point Arithmetic

FLOATING POINT NUMBERS In computer science, a real number is usually called a floating point number. In the decimal system, a decimal point (radix point) separates the whole-number part from the fractional part. Examples: 37.25 (whole = 37, fraction = .25), 123.567, 10.12345678

FLOATING POINT NUMBERS For example, 37.25 can be analyzed as:
10^1   10^0   .   10^-1    10^-2
Tens   Units  .   Tenths   Hundredths
3      7      .   2        5
37.25 = 3 × 10 + 7 × 1 + 2 × 1/10 + 5 × 1/100

BINARY EQUIVALENT In the binary representation of a floating point number, the column values are as follows:
…  2^6  2^5  2^4  2^3  2^2  2^1  2^0  .  2^-1  2^-2  2^-3   2^-4    …
…  64   32   16   8    4    2    1    .  1/2   1/4   1/8    1/16    …
…  64   32   16   8    4    2    1    .  0.5   0.25  0.125  0.0625  …
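The column weights above can be turned into a small evaluator. This is only an illustrative sketch: the function name `binary_to_decimal` is not from the slides, and it assumes the input is a well-formed binary string with an optional point.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '101.011' using its column weights."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Columns left of the point weigh 2^0, 2^1, 2^2, ... from right to left.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Columns right of the point weigh 2^-1, 2^-2, ... from left to right.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("101.011"))  # 5.375
```

Here 101.011 evaluates to 4 + 1 + 1/4 + 1/8 = 5.375.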

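DECIMAL BINARY CONVERSION Repeatedly multiply the fraction by two until the fraction becomes zero. The whole-number part produced at each step is the next binary digit.
0.8125 × 2 = 1.625  (digit 1, fraction 0.625)
0.625 × 2 = 1.25  (digit 1, fraction 0.25)
0.25 × 2 = 0.5  (digit 0, fraction 0.5)
0.5 × 2 = 1.0  (digit 1, fraction 0)
Reading the digits from top to bottom: 0.8125 (decimal) = 0.1101 (binary)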
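The repeated-multiplication procedure can be sketched in a few lines. The name `fraction_to_binary` and the `max_bits` cut-off are assumptions added for illustration; the cut-off is needed because many decimal fractions (0.1, for example) never reach zero in binary.

```python
from fractions import Fraction


def fraction_to_binary(fraction: str, max_bits: int = 32) -> str:
    """Convert a decimal fraction in [0, 1), given as a string, to binary digits."""
    value = Fraction(fraction)       # exact arithmetic, no float rounding
    bits = []
    while value != 0 and len(bits) < max_bits:
        value *= 2
        bit = int(value)             # whole-number part is the next binary digit
        bits.append(str(bit))
        value -= bit                 # keep only the fractional part
    return "0." + ("".join(bits) or "0")


print(fraction_to_binary("0.8125"))  # 0.1101
```

With `max_bits=8`, the input "0.1" gives 0.00011001, which is only an approximation of one tenth.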

SCIENTIFIC NOTATION OF FLOATING NUMBERS
Decimal:
-123,000,000,000,000 = -1.23 × 10^14
0.000 000 000 000 000 123 = +1.23 × 10^-16
Binary:
110 1100 0000 0000 = 1.1011 × 2^14
-0.0000 0000 0000 0001 1101 = -1.1101 × 2^-16

FLOATING POINT NUMBER REPRESENTATION If x is a real number, then its normal form representation is: x = f × Base^E, where f is the mantissa and E is the exponent. Examples:
125.32 (base 10) = 0.12532 × 10^3
-125.32 (base 10) = -0.12532 × 10^3
0.0546 (base 10) = 0.546 × 10^-1
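As a rough illustration of the decimal normal form, the sketch below splits a number into a mantissa f with 0.1 <= |f| < 1 and an exponent E. The helper name `normal_form` is hypothetical, and the code assumes a finite decimal given as a string.

```python
from decimal import Decimal


def normal_form(text: str):
    """Return (f, E) such that x = f * 10**E and 0.1 <= |f| < 1 (or f = 0)."""
    sign, digits, exp = Decimal(text).as_tuple()
    if not any(digits):
        return Decimal(0), 0
    # Moving the point in front of the first significant digit
    # raises the exponent by the number of significant digits.
    exponent = len(digits) + exp
    mantissa = Decimal((sign, digits, -len(digits)))
    return mantissa, exponent


print(normal_form("125.32"))   # (Decimal('0.12532'), 3)
print(normal_form("0.0546"))   # (Decimal('0.546'), -1)
```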

NORMALIZED AND UNNORMALIZED

NORMALIZATION PROCESS

FLOATING POINT FORMAT FOR BINARY NUMBERS

IEEE FLOATING POINT REPRESENTATION More exponent bits give a greater range; more significant (mantissa) bits give greater accuracy.

IEEE FLOATING POINT REPRESENTATION The first, or leftmost, field of our floating point representation is the sign bit: 0 for a positive number, 1 for a negative number.

IEEE FLOATING POINT REPRESENTATION The second field of the floating point number is the exponent. Since we must be able to represent both positive and negative exponents, the exponent is stored with a bias of 127 added to it. An exponent of 5 is therefore stored as 127 + 5 = 132; an exponent of -5 is stored as 127 + (-5) = 122. The biased exponent, the value actually stored, ranges from 0 through 255, which is the range of values that can be represented by an 8-bit unsigned binary number.
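The biasing rule can be sketched as a pair of helper functions. The names `encode_exponent` and `decode_exponent` are illustrative only; the bias of 127 and the 8-bit field width are taken from the slide.

```python
BIAS = 127  # single-precision exponent bias, as stated on the slide


def encode_exponent(exponent: int) -> str:
    """Return the 8-bit stored (biased) exponent field for a true exponent."""
    stored = exponent + BIAS
    if not 0 <= stored <= 255:
        raise ValueError("exponent does not fit in an 8-bit field")
    return format(stored, "08b")


def decode_exponent(field: str) -> int:
    """Recover the true exponent from the stored 8-bit field."""
    return int(field, 2) - BIAS


print(encode_exponent(5))    # 10000100  (132)
print(encode_exponent(-5))   # 01111010  (122)
```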

IEEE FLOATING POINT REPRESENTATION The mantissa is the set of 0's and 1's to the right of the radix point of the normalized binary number (a binary number is normalized when the digit to the left of the radix point is 1). Example: in 1.00101 × 2^3 the mantissa is 00101. The mantissa is stored in a 23-bit field.

NORMALIZING NUMBERS Examples:
134.15 (base 10) = 0.13415 × 10^3
0.0021 (base 10) = 0.21 × 10^-2
101.11 (binary) = .10111 × 2^3 or 1.0111 × 2^2 (hidden 1)
0.011 (binary) = .11 × 2^-1 or 1.1 × 2^-2 (hidden 1)
AB.CD (hex) = .ABCD × 16^2
0.00AC (hex) = .AC × 16^-2
Note that the concept of a hidden 1 applies only to binary.

CONVERTING DECIMAL FLOATING POINT VALUES TO STORED IEEE STANDARD VALUES. Example: Find the IEEE FP representation of 40.15625. Step 1. Compute the binary equivalent of the whole part and the fractional part (convert 40 and .15625 to their binary equivalents): 40.15625 (base 10) = 101000.00101 (base 2)

CONVERTING DECIMAL FLOATING POINT VALUES TO STORED IEEE STANDARD VALUES. Step 2. Normalize the number by moving the binary point to the right of the leftmost one: 101000.00101 = 1.0100000101 × 2^5. Step 3. Convert the exponent to a biased exponent: 127 + 5 = 132, and 132 (base 10) = 10000100 (base 2)

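CONVERTING DECIMAL FLOATING POINT VALUES TO STORED IEEE STANDARD VALUES. Step 4. Store the results from above:
Sign: 0
Exponent (from Step 3): 10000100
Mantissa (from Step 2, digits after the binary point, padded with zeros to 23 bits): 01000001010000000000000
Stored value: 0 10000100 01000001010000000000000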
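The result of Steps 1 to 4 can be cross-checked with Python's standard `struct` module, which packs a value into the 32-bit single-precision layout. The function name `float_to_ieee_fields` is an illustrative choice, not part of the slides.

```python
import struct


def float_to_ieee_fields(value: float):
    """Split a value, rounded to single precision, into sign, exponent and mantissa bits."""
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float_to_ieee_fields(40.15625)
print(sign, exponent, mantissa)
# 0 10000100 01000001010000000000000
```

The same function can be applied to values such as 10.37 that have no exact binary form; in that case the 23-bit mantissa holds a rounded approximation.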

CONVERT 10.37 TO SINGLE PRECISION FLOATING POINT

Floating point arithmetic

FLOATING-POINT ADDITION Assume 4 decimal digits for the mantissa.

FLOATING POINT SUBTRACTION (USING 4 DIGIT MANTISSA) As with addition, the terms must be brought to the same scale first:
0.2361 × 10^6 - 0.1455 × 10^4
= 0.2361 × 10^6 - 0.001455 × 10^6  {both 10^6}
= (0.2361 - 0.001455) × 10^6
= 0.234645 × 10^6
≈ 0.2346 × 10^6  {4 digit mantissa}

REAL NUMBER MULTIPLICATION (USING 4 DIGIT MANTISSA) The multiplication itself is done on the mantissas:
(0.2361 × 10^2) × (0.1455 × 10^4)
= (0.2361 × 0.1455) × 10^(2+4)  {add indices}
= 0.03435255 × 10^6
= 0.3435255 × 10^5
≈ 0.3435 × 10^5  {4 digit mantissa}
Notice that multiplication must work from the largest digit downwards, since at some point the number is going to have to be truncated.

REAL NUMBER DIVISION (USING 4 DIGIT MANTISSA)
(0.2361 × 10^2) / (0.1455 × 10^4)
= (0.2361 / 0.1455) × 10^(2-4)  {subtract indices}
= 1.6226804 × 10^-2
= 0.16226804 × 10^-1
≈ 0.1623 × 10^-1  {4 digit mantissa}
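The 4-digit-mantissa examples for subtraction, multiplication and division can be reproduced with Python's `decimal` module. This is a sketch under one assumption: a `Context` with `prec=4` and half-up rounding stands in for the 4-digit mantissa. `decimal` computes the exact result and then rounds it once, which matches the worked steps here but is not a model of any particular hardware.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# A context that keeps 4 significant digits, mimicking a 4-digit mantissa.
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

a = Decimal("0.2361E6")
b = Decimal("0.1455E4")
print(ctx.subtract(a, b))   # 2.346E+5, i.e. 0.2346 x 10^6

c = Decimal("0.2361E2")
d = Decimal("0.1455E4")
print(ctx.multiply(c, d))   # 3.435E+4, i.e. 0.3435 x 10^5
print(ctx.divide(c, d))     # 0.01623, i.e. 0.1623 x 10^-1
```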

ERRORS IN FLOATING POINT ARITHMETIC
Round-off error. Examples: 5.6999 ≈ 5.7; 7.238 ≈ 7.24
Truncation error. Examples: 4.67444444 → 4.674; 5.45676767 → 5.4567
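The difference between rounding and truncation can be sketched with `decimal.quantize`. The helper names `round_off` and `truncate`, and the choice of half-up rounding, are assumptions made for this illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def round_off(text: str, places: int) -> Decimal:
    """Round to the given number of decimal places (half-up)."""
    step = Decimal(1).scaleb(-places)        # e.g. places=2 gives 0.01
    return Decimal(text).quantize(step, rounding=ROUND_HALF_UP)


def truncate(text: str, places: int) -> Decimal:
    """Drop every digit beyond the given number of decimal places."""
    step = Decimal(1).scaleb(-places)
    return Decimal(text).quantize(step, rounding=ROUND_DOWN)


print(round_off("5.6999", 1))      # 5.7
print(round_off("7.238", 2))       # 7.24
print(truncate("4.67444444", 3))   # 4.674
print(truncate("5.45676767", 4))   # 5.4567
```

Truncating 7.238 to two places would give 7.23 instead of 7.24, so the two methods disagree whenever the first dropped digit is 5 or more.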