CSE 541 Numerical Methods Machine Representation of Numbers

CSE 541 – Numerical Methods Machine Representation of Numbers

Blazing Fast • Computers are extremely fast at performing simple arithmetic. – Count from one to 1, 000, 000 by one - or – Execute on a Pentium-4 2 GHz machine: int i = 1; While (1) { printf( “%d/n”, i++ ); } – Which reaches one trillion faster? 10 September 2020 OSU/CIS 541 2

Blazing Fast • Assume a human count on average one number per second (bad assumption). • Implies 1 trillion seconds • Or about 32, 000 years!!!! 10 September 2020 OSU/CIS 541 3

Blazing Fast • Assume a human count on average one number per second (bad assumption). • Implies 1 trillion seconds • Or about 32, 000 years!!!! • Assume that a 2 GHz machine can increment by one each clock cycle. • Implies 500 seconds!!!! 10 September 2020 OSU/CIS 541 4

Machine Precision • What is our current National Debt? – (see http: //www. brillig. com/debt_clock/) • What is the annual revenue of a Fortune 100 company? • How many molecules are in one mole of gas (under ideal assumptions) • How many pixels are displayed in a two hour HDTV movie? • What is the current balance of Bill Gate’s checking account? • How many ounces of Pepsi are consumed per year? • How many bytes of storage does the Graphics Group own? 10 September 2020 OSU/CIS 541 5

Machine Precision • The point is that we live in a world with a wide range of scales. • Computers have limited capabilities. • Let me repeat that: Computers have limited capabilities. 10 September 2020 OSU/CIS 541 6

Machine Precision Computers have limited capabilities!!! 10 September 2020 OSU/CIS 541 7

Accuracy • Do we really need that much accuracy? – American Express thinks so – Do. D thinks so (see web article on Patriot missles) 10 September 2020 OSU/CIS 541 8

Accuracy and Range • 16 -bit integers (common about 5 years ago) 216 = 65, 536 • 32 -bit integers 32 2 = 4, 294, 967, 296 • Only 31 -bits if you need signed integers. • 64 -bit integers 264 1. 84 1019 • Still no Avagadro’s number: 6. 022*10 10 September 2020 OSU/CIS 541 23 9

Accuracy and Range • 16 -bit integers (common about 5 years ago) 216 = 65, 536 • 32 -bit integers << 1, 000, 000 • Only 31 -bits if you need signed integers. • 64 -bit integers 32 2 = 4, 294, 967, 296 264 = 1. 84 1019 • Still no Avagadro’s number 6. 022*10 10 September 2020 OSU/CIS 541 23 10

Blazing Fast • A 2 GHZ 32 -bit machine may be blazing fast, but it will never reach one trillion either, if care isn’t taken!!! • Forever >> 32, 000 years – Humans win!!! 10 September 2020 OSU/CIS 541 11

Accuracy and Standards • What is int i and what is its range? • ANSI C standard – int – machine dependent – short – machine dependent Not much of a standard, huh? 10 September 2020 OSU/CIS 541 12

Standard Range • Corba – long – 32 -bit signed – long – 64 bit signed • Java - standardized 10 September 2020 OSU/CIS 541 13

Advice • Never use int, long in C / C++ • Use the preprocessor to define your own types #define CISint \ I want this to be 32 -bit #define CISshort \ 16 -bit signed integer • Hence, if you port to another architecture and need to change it, only change it in one place. • Likewise, use each packages types: gl. Float, … 10 September 2020 OSU/CIS 541 14

Numbers • • Whole numbers: 0, 1, 2, … Signed integers: …, -1, 0, 1, 2, … Rational numbers: a/b, 1/3, etc. Irrational numbers: sqrt(2), e, pi, etc. • Is there a difference between 0. 1 and 1/3? 10 September 2020 OSU/CIS 541 15

Representing Real Numbers • In Base-10, we can express any number as a sum of weighted power’s of ten. • Hence: 10 September 2020 OSU/CIS 541 16

Representing Real Numbers • Using Base-2 or binary, what are our series? • What coefficients can we have? 10 September 2020 OSU/CIS 541 17

Limiting Precision • Truncate – Truncate to what? • Truncate to 8 significant digits. – Okay: 3. 1415926535897932384626433832795 3. 1415926 10 September 2020 OSU/CIS 541 18

Limiting Precision • Round to nearest value: 3. 1415926535897932384626433832795 3. 1415927 • Rules: – If value after last retained digit is > 5, then round-up. – If value after last retained digit is < 5, then truncate. – If value after last digit is =5, then: • If all remaining digits are not zero, round-up. • Else, randomly round-up or truncate. 10 September 2020 OSU/CIS 541 19

Error • We need to truncate rational and irrational numbers to make them more manageable. • Absolute Error: – Typically want the absolute value: • Relative Error: • Can I have a very small Absolute error, but a large relative error? 10 September 2020 OSU/CIS 541 20

Condition of a Problem • For a problem with input (data) x and output y, y=F(x). The problem is said to be wellconditioned if small changes in x, lead to small changes in y. • Otherwise, we say the problem is illconditioned. 10 September 2020 OSU/CIS 541 21

Stability of an Algorithm • Stability indicates the sensitivity of an algorithm for solving a problem. • An algorithm is said to be stable if small changes in the input x lead to small changes in the output y. • Otherwise, the algorithm is said to be unstable. 10 September 2020 OSU/CIS 541 22

Condition and Stability • Condition => data • Stability => algorithm • Ill-conditioned – very hard to get a good result with even the best algorithm. • Stable – given good data (not ill-conditioned), the algorithm will not yield drastically different results if round-off or small noise is added to the input data. 10 September 2020 OSU/CIS 541 23

Precision x =. 256834*105 • The digit 2 is the most significant digit, while 4 is the least significant digit. • Precision should not be confused with accuracy. Accuracy is how close your solution is to the actual solution. Missed the bulls-eye by 2 inches. • Precision is how good your estimate of the accuracy is, 2. 0”± 0. 003 10 September 2020 OSU/CIS 541 24

Precision • If you miss the bulls-eye, but repeatedly hit the same location, then it is precise, but inaccurate. • For instance, using Taylor’s series for the approximation of sin(x), I can obtain a precise number for values of x larger than 2, but the solution would be inaccurate. • In a computer, precision is usually how many digits of accuracy your machine representation has (number of bits in the mantissa). 10 September 2020 OSU/CIS 541 25

Loss of Significance • Consider the error for x-y using 5 decimal digits of precision: x =. 3721448693 y =. 3720214371 x’ =. 37214 y’ =. 37202 x’-y’ =. 00012 x-y =. 0001234322 10 September 2020 OSU/CIS 541 26

Loss of Significance • The relative error is: • However, the relative error of x’ and y’ is only 1. 3*10 -5. • We lost 3 significant digits. 10 September 2020 OSU/CIS 541 27

Loss of Precision Theorem • Examine the precision loss for x-y. • Let x and y be normalized floating-point numbers, with x > y > 0. If for some positive integers p and q, then at most p and at least q significant binary bits are lost in the subtraction. • Rule: The closer two numbers, the greater the loss of significance. 10 September 2020 OSU/CIS 541 28

Avoiding loss of Precision • Use double precision or higher • Modify the calculations to remove subtraction of numbers close together. • Consider: as x approaches 0. • Reorder to remove the subtraction: 10 September 2020 OSU/CIS 541 29

How do they do that? • Microsoft Windows ships with a simple calculator for its 32 -bit operating system. 96 • What would you expect, if you typed in 2 ? – or 96 – 2 = 79, 228, 162, 514, 264, 337, 593, 543, 950, 336 10 September 2020 OSU/CIS 541 30

Performance. vs. Accuracy • With a 2. 2 GHz computer, you can do a lot of number crunching (two ops per clock cycle). • Real-time applications – Audio must be processed at 4 KHz – One trillion time steps of crash simulation • Decide whether accuracy or performance is more important • User interactivity (e. g. , calculator): – No need to calculate things faster than the user’s fat fingers can type them. • Use slower, but more accurate algorithms. 10 September 2020 OSU/CIS 541 31

Integers • Exact to within machine’s range: – ± 2 N-1 -1 • See the /usr/include/limits. h file 10 September 2020 OSU/CIS 541 32

Fixed-Point Numbers • Used to store floating point numbers with a fixed range. • Say I need values for the surface temperatures of the earth. Restrict the values to the range (999, 999). • I could use signed integers, but would only have three significant digits. • I can also represent any temperature with fixed point numbers of the form xxxxx to allow for eight significant digits. 10 September 2020 OSU/CIS 541 33

Implementing Fixed-point Arithmetic • Fixed point is easy to implement in most cases, You simply ignore the decimal point for all operations, use integers for your calculations and then put the decimal point back in when its needed (e. g. , for I/O). 31. 45330 => 31045330 (need to remember to divide by 100, 000 later). 10 September 2020 OSU/CIS 541 34

Floating Point Numbers • Split into three parts: – Sign – Mantissa – Exponent • +6. 024*1023 10 September 2020 OSU/CIS 541 35

Floating-Point Numbers • CPU architects have choices on how many bits to allocate to each component. • Sign() – 1 -bit • Exponent – n-bits (Determines the range of numbers) • Mantissa – whatever is left (Determines the precision of the numbers) 10 September 2020 OSU/CIS 541 36

IEEE Floating Point • Single-precision: • Sign() – 1 -bit (s) • Exponent – 8 -bits (e) • Mantissa – 23 -bits (m) • Notes: • +0 and – 0 are represented differently • + and - (Na. N’s) have special representations • The mantissa is in normalized form, meaning the first digit is a one. As such, it is not stored and we have 24 -bits of precision. 10 September 2020 OSU/CIS 541 37

Limits for Floats • Look at the C include file float. h #define */ #define #define #define DBL_DIG DBL_EPSILON 15 /* # of decimal digits of precision */ 2. 2204460492503131 e-016 /* smallest such that 1. 0+DBL_EPSILON != 1. 0 DBL_MANT_DIG DBL_MAX_10_EXP DBL_MAX_EXP DBL_MIN_10_EXP DBL_MIN_EXP _DBL_RADIX _DBL_ROUNDS 53 1. 7976931348623158 e+308 1024 2. 2250738585072014 e-308 (-307) (-1021) 2 1 /* /* /* #define */ #define #define FLT_DIG FLT_EPSILON 6 1. 192092896 e-07 F /* # of decimal digits of precision */ /* smallest such that 1. 0+FLT_EPSILON != 1. 0 FLT_GUARD FLT_MANT_DIG FLT_MAX_10_EXP FLT_MAX_EXP FLT_MIN_10_EXP FLT_MIN_EXP 0 24 3. 402823466 e+38 F 38 128 1. 175494351 e-38 F (-37) (-125) /* /* 10 September 2020 OSU/CIS 541 # of bits in mantissa */ max value */ max decimal exponent */ max binary exponent */ min positive value */ min decimal exponent */ min binary exponent */ exponent radix */ addition rounding: near */ # of bits in mantissa */ max value */ max decimal exponent */ max binary exponent */ min positive value */ min decimal exponent */ min binary exponent */ 38

Divide by zero • What would happen when you ran this program? float x = 0. 230134789; float y = 0. 230134788; float inverse; inverse = 1. 0 / (x-y); return 0; 10 September 2020 OSU/CIS 541 39

Resolve and Real Numbers • One of the beautiful benefits of C++ is the ability to redefine or overload functions in a class hierarchy. • With the CSE department’s Resolve language, you used the type, Real. • This class (as well as Resolve) is used to ensure that you do not divide by zero. • Therefore, it overloads the operator”/” method, performs a quick (? ? ? ) check and then performs the divide (if safe). 10 September 2020 OSU/CIS 541 40

Designing An Extended Precision Class • C++ (and other object-oriented languages like Java), allow most of the basic language to be overloaded. • You can use this feature to define new arithmetic for: – limiting numbers: 0. 0 -> 1. 0, 32 -63, … – embed debugging information: “print if value=pi – Keep track of the error associated with the operation. This is called Interval Analysis. • Let’s look at designing an extended precision class. 10 September 2020 OSU/CIS 541 41

Extended Precision with C++ • Let’s say we want over 1000 -bits of precision for our application. • Furthermore, let’s assume our numbers encode temperatures and only range from -273° to 65, 000 ° C. • Key questions: – What should our basic number format be? – What operations are we supporting? – How do we implement those operations? 10 September 2020 OSU/CIS 541 42

Decimal Type • C# has a new decimal type: – 128 -bit format – Base-10 arithmetic! – 28 significant figures • That should handle Bill Gates check-book!!! – Fixed-point, floating point or integer? – Bit layout? 10 September 2020 OSU/CIS 541 43

Other C# features • Keywords checked or unchecked. – Controls whether to throw an Overflow. Exception or not. – For integer arithmetic only. – int wont. Throw = unchecked( System. Int 32. Max. Value + 1); – int will. Throw = checked( System. Int 32. Max. Value + 1); – Can also be used in blocks or as a compiler option. • Many exceptions can be thrown and caught. • Implicit type casting only happens when widening. 10 September 2020 OSU/CIS 541 44