Binary Real Numbers

Introduction
- Computers must be able to represent real numbers (numbers with fractional parts).
- Two different ways:
  - Fixed-point
  - Floating-point
- NOTE: Everything in binary uses powers of two.

Decimal Review
- Digits to the right of the decimal point correspond to negative powers of 10.

  10²     10¹     10⁰     .     10⁻¹    10⁻²
  100     10      1       .     0.1     0.01

Binary Fractions

  2⁻¹       2⁻²       2⁻³       2⁻⁴       2⁻⁵       2⁻⁶
  0.5       0.25      0.125     0.0625    0.03125   0.015625
  1/2       1/4       1/8       1/16      1/32      1/64
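The values in this table can be checked with a few lines of Python. This is only an illustrative sketch; the name `rows` is made up for the example and is not part of the slides.

```python
# Build (exponent, decimal value, denominator) for 2^-1 through 2^-6.
rows = [(k, 2 ** -k, 2 ** k) for k in range(1, 7)]

for k, value, denominator in rows:
    # Each negative power of two is exactly 1 divided by the matching positive power.
    print(f"2^-{k} = {value} = 1/{denominator}")
```

Every negative power of two is exactly representable in binary, which is why these decimal values terminate.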

Fixed Point Notation
1. Multiply each 1 by the corresponding power of 2
2. Add up the resulting powers of 2
Example:
  11.01₂ = 2 + 1 + ¼ = 3.25₁₀
  00111.010₂ = 4 + 2 + 1 + ¼ = 7.25₁₀
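The two steps above can be sketched in Python. The function name `fixed_point_to_decimal` is an assumption made for this illustration, and the sketch handles only unsigned bit strings written with a "." as the radix point.

```python
def fixed_point_to_decimal(bits: str) -> float:
    """Convert an unsigned binary fixed-point string such as '11.01' to decimal."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Digits left of the radix point carry 2^0, 2^1, ... counting leftward.
    for power, digit in enumerate(reversed(whole)):
        if digit == "1":
            value += 2 ** power
    # Digits right of the radix point carry 2^-1, 2^-2, ... counting rightward.
    for position, digit in enumerate(frac, start=1):
        if digit == "1":
            value += 2 ** -position
    return value


print(fixed_point_to_decimal("11.01"))      # 3.25
print(fixed_point_to_decimal("00111.010"))  # 7.25
```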

Floating-Point Notation
- Floating-point notation is essentially the computer’s way of storing a number that has been normalized.
- 3 different parts of any number:
  - Mantissa: the normalized number
  - Exponent: the power to which the base is raised
  - Sign: of both the mantissa and the exponent
- Decimal example: 12.5 = 0.125 x 10² (normalized!)

Normalization Steps
1. Begin with a fixed-point number
2. Normalize the number such that the radix point (decimal point) is all the way to the left (this produces the mantissa)
3. Multiply the resulting number by the base raised to an exponent

Floating-Point Example
What is 12.5 in floating-point representation?
1. Convert 12.5 to binary fixed point:
   12.5₁₀ = 1100.1₂
2. Normalize the number by moving the radix point, producing the mantissa:
   1100.1₂ = 0.11001 * 2⁴
3. Fill in the bits for each of the three parts of any real number:
   - Sign (2 bits: one for the mantissa, one for the exponent)
   - Mantissa (# bits varies)
   - Exponent (# bits varies)
NOTE: 2’s complement may be applied to the mantissa or the exponent if either is negative
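Steps 1 and 2 of this example can be sketched in Python as follows. The helper names `to_binary_fixed` and `normalize` and the `frac_bits` limit are assumptions made for illustration. The sketch expects a non-negative input and, for values below 1, simply leaves the exponent at 0.

```python
def to_binary_fixed(value: float, frac_bits: int = 8) -> str:
    """Return a binary fixed-point string for a non-negative value."""
    whole = int(value)
    frac = value - whole
    digits = []
    # Repeatedly doubling the fraction shifts the next binary digit
    # into the integer position, where it can be read off.
    for _ in range(frac_bits):
        if frac == 0:
            break
        frac *= 2
        bit = int(frac)
        digits.append(str(bit))
        frac -= bit
    return bin(whole)[2:] + "." + ("".join(digits) or "0")


def normalize(fixed: str) -> tuple[str, int]:
    """Move the radix point to the far left; return (mantissa digits, exponent)."""
    whole, _, frac = fixed.partition(".")
    whole = whole.lstrip("0")
    # Moving the point left by len(whole) places multiplies by 2^len(whole).
    return (whole + frac).rstrip("0"), len(whole)


fixed = to_binary_fixed(12.5)
mantissa, exponent = normalize(fixed)
print(fixed)                          # 1100.1
print(f"0.{mantissa} * 2^{exponent}")  # 0.11001 * 2^4
```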

Placing the Bits
- Assume you have the following:
  - 1 bit for the mantissa sign
  - 8 bits for the mantissa
  - 1 bit for the exponent sign
  - 6 bits for the exponent

  S M M M M M M M M S E E E E E E

- Example: 0.11001 * 2⁴

  0 00011001 0 000100
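A minimal sketch of this layout in Python, for a positive mantissa and a non-negative exponent only. The name `place_bits` and the two constants are made up for the example, and the mantissa digits are padded on the left because that is how the slide's example shows them.

```python
MANTISSA_BITS = 8  # field widths taken from the slide
EXPONENT_BITS = 6


def place_bits(mantissa: str, exponent: int) -> str:
    """Lay out sign, mantissa, sign, exponent for a positive number."""
    if len(mantissa) > MANTISSA_BITS:
        raise ValueError("mantissa does not fit in the mantissa field")
    if not 0 <= exponent < 2 ** EXPONENT_BITS:
        raise ValueError("exponent does not fit in the exponent field")
    # Pad the mantissa digits on the left, matching the slide's example.
    mantissa_field = mantissa.rjust(MANTISSA_BITS, "0")
    exponent_field = format(exponent, f"0{EXPONENT_BITS}b")
    # Both sign bits are 0 because this sketch covers only positive values.
    return f"0 {mantissa_field} 0 {exponent_field}"


print(place_bits("11001", 4))  # 0 00011001 0 000100
```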

Another Example
Convert -12.5₁₀ to binary:
1. Convert 12.5 to fixed point: 01100.1
2. Normalize: 0.11001 * 2⁴
3. Convert the exponent to binary: 4 → 0100
4. 2’s complement the mantissa by flipping the bits and adding 1: 011001 → 100111
5. Final number: 1 00111 0100
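Step 4 (flip the bits, then add 1) can be sketched in Python. The function name `twos_complement` is an assumption for this illustration; it works on a bit string and keeps the result at the same width.

```python
def twos_complement(bits: str) -> str:
    """Flip every bit, add 1, and keep the result at the original width."""
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    width = len(bits)
    # The modulo discards any carry out of the leftmost bit.
    return format((int(flipped, 2) + 1) % (2 ** width), f"0{width}b")


print(twos_complement("011001"))  # 100111
```

Applying the function twice returns the original bit string, which is a quick way to check a hand conversion.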

Upper & Lower Bounds
- Assume you have the following:
  - 1 bit for the mantissa sign
  - 8 bits for the mantissa
  - 1 bit for the exponent sign
  - 6 bits for the exponent
- What is the upper bound for the floating-point number?
- What is the lower bound for the floating-point number?
- What happens if we convert a floating-point number to an integer?

Integers vs. Floating-point
- Integers:
  - smaller range than floating-point
  - all numbers within the range are 100% accurate
- Floating-point:
  - large range of numbers
  - not all numbers within the range can be represented accurately
  - Example: 2.9999999 repeating
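The contrast can be observed directly in Python. Note the assumption: Python's `float` is a 64-bit IEEE 754 double, not the 16-bit format used on these slides, so this only illustrates the general point. The variable names are made up for the example.

```python
from decimal import Decimal

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is slightly off.
total = 0.1 + 0.2
print(total == 0.3)   # False

# Decimal(0.1) shows the value that is actually stored for the literal 0.1.
print(Decimal(0.1))

# Integers stay exact, but a float cannot hold every integer above 2**53.
big = 2 ** 53 + 1
survives_round_trip = int(float(big)) == big
print(survives_round_trip)  # False
```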

Possible Errors
- truncation error
  - round-off errors using floating-point numbers, because not all real numbers can be represented accurately
- overflow error
  - attempting to represent a number that is greater than the upper bound for the given number of bits
- underflow error
  - attempting to represent a number that is less than the lower bound for the given number of bits
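These errors can be provoked in Python, again assuming its `float` is a 64-bit IEEE 754 double rather than the slide format. In IEEE 754 terms, underflow usually refers to a magnitude too small to represent, which is what the last case shows. The variable names are illustrative.

```python
import math
import sys

# Round-off: three copies of 0.1 do not add up to exactly 0.3.
rounded = 0.1 + 0.1 + 0.1
print(rounded == 0.3)        # False

# Overflow: going past the largest finite float yields infinity.
overflowed = sys.float_info.max * 10
print(math.isinf(overflowed))  # True

# Underflow: halving the smallest positive float collapses to zero.
smallest = 5e-324
underflowed = smallest / 2
print(underflowed)           # 0.0
```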