Numerical Analysis
EE, NCKU
Tien-Hao Chang (Darby Chang)

In the previous slide
- Why numerical methods?
  - differences between human and computer
  - a very simple numerical method
- What is an algorithm?
  - definition and components
  - three problems and three algorithms
- Convergence
  - compare rate of convergence

In this slide
- Error (motivation)
- Floating point number system
  - difference from the real number system
  - problem of roundoff
- Introduced/propagated error
- Focus on numerical methods
  - three bugs

Let’s start from error
- Numerical methods are generally designed to determine approximate solutions
- 3 categories of error
  - modeling: made when you decide the algorithm
  - discretization/truncation: conversion from continuous to discrete and/or truncation of an infinite series
  - roundoff/data: not due to the formulation of a numerical method; caused by the data representation (in the computer)

Can be analyzed
- discretization/truncation error: conversion from continuous to discrete and/or truncation of an infinite series

Should be prevented
- roundoff/data error: not due to the formulation of a numerical method; caused by the data representation (in the computer)

1.3 Mathematics on the Computer: Floating Point Number Systems

Restriction: d_1 must not be zero (except when the number being represented is 0)
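The slide that defined the number system did not survive extraction. A form consistent with the notation F(β, k, m, M) used later in these slides is sketched below. It is stated here as an assumption, not as the original wording: β is the base, k the number of digits, and m and M the smallest and largest exponents.

```latex
F(\beta, k, m, M) = \{0\} \cup
\left\{ \pm (0.d_1 d_2 \ldots d_k)_\beta \times \beta^{e} \;:\;
d_i \in \{0, \ldots, \beta - 1\},\ d_1 \neq 0,\ m \le e \le M \right\}
```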

Floating point vs. real number
- Discrete vs. continuous
  - continuous means that between any two numbers, there are infinitely many other numbers
- Finite vs. infinite
  - number of elements and range of values
  - a floating point number system contains a smallest and a largest element
    - underflow/overflow

Any Questions?

Floating point vs. real number
- Nonuniform vs. uniform
  - real numbers are uniformly distributed
  - in a floating point number system, the elements near zero are more closely spaced
    - hint: think about the difference between two adjacent elements while the exponent changes

A floating point system is discrete, finite, and nonuniform
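The three properties can be checked by listing every positive element of a tiny system. The sketch below assumes the normalized form ±(0.d1...dk)_β × β^e with d1 ≠ 0 and m ≤ e ≤ M; the toy system F(2, 3, -1, 2) and the helper name `positive_elements` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

def positive_elements(beta, k, m, M):
    """Enumerate the positive elements (0.d1...dk)_beta * beta**e with d1 != 0."""
    elements = set()
    # A k-digit mantissa with d1 != 0 is an integer in [beta**(k-1), beta**k).
    for mantissa in range(beta ** (k - 1), beta ** k):
        for e in range(m, M + 1):
            elements.add(Fraction(mantissa, beta ** k) * Fraction(beta) ** e)
    return sorted(elements)

elems = positive_elements(2, 3, -1, 2)
gaps = [b - a for a, b in zip(elems, elems[1:])]

print("count   :", len(elems))                # finite
print("smallest:", elems[0], "largest:", elems[-1])
print("gaps    :", [str(g) for g in gaps])    # gaps grow with the exponent
```

The gap between adjacent elements doubles each time the exponent increases by one, which is why the elements are densest near zero.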

Roundoff error
- Occurs when the number is outside the system (not one of its elements)
- Select an element to represent the number
  - chop
  - round
- A number to its floating point equivalent
  - y → fl(y)
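Chopping and rounding can be tried out with Python's `decimal` module. This is a minimal sketch assuming a decimal system with k = 4 digits; the function name `fl` follows the slide's notation, while the `mode` parameter is an illustrative addition.

```python
from decimal import Context, ROUND_DOWN, ROUND_HALF_UP

def fl(y, k=4, mode="round"):
    """Map the decimal string y to a k-digit decimal floating point number."""
    rounding = ROUND_DOWN if mode == "chop" else ROUND_HALF_UP
    return Context(prec=k, rounding=rounding).create_decimal(y)

chopped = fl("3.14159265", mode="chop")    # drops everything after the 4th digit
rounded = fl("3.14159265", mode="round")   # rounds the 4th digit

print(chopped, rounded)
```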

Formal definition

An example

In the general case (chopped)

Machine precision/epsilon
- The error bound is independent of the number y
- It depends on
  - the base (β)
  - the number of digits (k)
- The bound is a function of the hardware implementation
- Cause of roundoff error
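The slides that derived the bound did not survive extraction. A standard derivation for chopping is sketched below. It assumes the normalized form y = ±(0.d_1 d_2 ...)_β × β^e with d_1 ≠ 0, and the symbol u for the bound is introduced here for illustration.

```latex
\begin{aligned}
y &= \pm (0.d_1 d_2 \ldots d_k d_{k+1} \ldots)_\beta \times \beta^{e}, \qquad
fl_{\text{chop}}(y) = \pm (0.d_1 d_2 \ldots d_k)_\beta \times \beta^{e} \\
|y - fl_{\text{chop}}(y)| &= (0.d_{k+1} d_{k+2} \ldots)_\beta \times \beta^{e-k} \le \beta^{e-k} \\
|y| &\ge (0.1)_\beta \times \beta^{e} = \beta^{e-1} \\
\frac{|y - fl_{\text{chop}}(y)|}{|y|} &\le \frac{\beta^{e-k}}{\beta^{e-1}} = \beta^{1-k} = u_{\text{chop}},
\qquad u_{\text{round}} = \tfrac{1}{2}\,\beta^{1-k}
\end{aligned}
```

The exponent e cancels, so the bound depends only on β and k. For β = 2 and k = 24 with rounding, it is 2^-24 ≈ 5.96 × 10^-8.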

Formal definition

Another term about precision

So far, we have talked about floating point number systems in the abstract

Then, what systems are we likely to encounter in practice?

Real floating point system
- 1970s
  - work began on a standard for binary floating point numbers to eliminate inconsistencies
- 1985
  - IEEE Standard 754 for Binary Floating-Point Arithmetic
- The IEEE Standard
  - F(2, 24, -125, 128), single precision
  - F(2, 53, -1021, 1024), double precision

IEEE standard single precision
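The layout figure for this slide is missing from the extracted text. The sketch below assumes the standard single precision layout (1 sign bit, 8 exponent bits with bias 127, 23 stored fraction bits) and uses Python's `struct` module to pull the fields apart. The function names are illustrative.

```python
import struct

def decode_single(x):
    """Split x, stored as IEEE 754 single precision, into its three bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, biased_exponent, fraction

def value_of_normal(sign, biased_exponent, fraction):
    """Rebuild the value of a normal number (biased exponent 1..254)."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (biased_exponent - 127)

fields = decode_single(-6.5)       # -6.5 = -1.625 * 2**2
print(fields, value_of_normal(*fields))
```

The hidden leading 1 plus the 23 stored bits give the 24 digits of F(2, 24, -125, 128). Writing 1.f × 2^p as 0.1f × 2^(p+1) shifts the exponent range -126..127 to -125..128.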

1.4 Mathematics on the Computer: Floating Point Arithmetic

Motivation
- Floating point arithmetic stands for the mathematics on the computer, but why should we know that?
- The IEEE Standard
  - 5.96 × 10^-8 (that is, 2^-24, the rounding bound in single precision)
  - seems pretty accurate
- However, numerical methods perform a sequence of calculations on the computer, and each operation introduces some roundoff error
- These errors can blow up when they are accumulated

Typical arithmetic
- Three steps
  - operand → its floating point equivalent
  - the exact arithmetic
  - result → its floating point equivalent
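The three steps can be modeled directly. This is a sketch assuming a 4-digit decimal system with rounding, simulated with Python's `decimal` module; the helper names `fl`, `fl_add` and `fl_mul` are illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# Assumed system: 4 decimal digits with rounding, exponent range ignored.
CTX = Context(prec=4, rounding=ROUND_HALF_UP)

def fl(y):
    """Replace y by its 4-digit floating point equivalent."""
    return CTX.create_decimal(y)

def fl_add(x, y):
    exact = fl(x) + fl(y)      # steps 1 and 2: round the operands, add exactly
    return fl(exact)           # step 3: round the result

def fl_mul(x, y):
    exact = fl(x) * fl(y)      # steps 1 and 2: round the operands, multiply exactly
    return fl(exact)           # step 3: round the result

print(fl_add("0.1329", "1.543"))   # exact sum 1.6759
print(fl_mul("1.234", "5.678"))    # exact product 7.006652
```

The unrounded `+` and `*` use Python's default 28-digit context, which is exact for operands this short.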


Not associative
- (0.1329 + 1.543) + 23.21 = 1.676 + 23.21 = 24.89
- 0.1329 + (1.543 + 23.21) = 0.1329 + 24.75 = 24.88
- All intermediate results have been rounded
- We should perform the additions in ascending order of magnitude to obtain the most accurate result

Any Questions?
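The two groupings can be reproduced in a simulated 4-digit decimal system. This is a sketch assuming rounding (not chopping); `add4` is an illustrative helper name.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

def add4(x, y):
    """Add two numbers and round the sum to 4 significant digits."""
    return CTX.add(Decimal(x), Decimal(y))

small_first = add4(add4("0.1329", "1.543"), "23.21")   # ascending order
large_first = add4("0.1329", add4("1.543", "23.21"))   # largest pair first
exact = Decimal("0.1329") + Decimal("1.543") + Decimal("23.21")

print(small_first, large_first, exact)
```

The exact sum is 24.8859, whose 4-digit equivalent is 24.89, so adding the small terms first gives the better answer here.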

In FP arithmetic, always pay attention to the number of significant digits and the least significant bits

Not distributive
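The example from this slide is missing from the extracted text. As a stand-in, the sketch below uses hypothetical values a = 1.5 and b = c = 1.001 in a simulated 4-digit decimal system with rounding, and compares a × (b + c) with a × b + a × c.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

# Hypothetical operands, chosen for illustration only.
a, b, c = Decimal("1.5"), Decimal("1.001"), Decimal("1.001")

factored = CTX.multiply(a, CTX.add(b, c))                   # 1.5 * 2.002
expanded = CTX.add(CTX.multiply(a, b), CTX.multiply(a, c))  # 1.5015 rounds to 1.502, twice

print(factored, expanded)
```

The factored form performs one rounding and lands on the exact value 3.003. The expanded form rounds each product upward before adding, so the two results differ in the last digit.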

Accumulation of roundoff error
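A small illustration in IEEE double precision, offered as a sketch and not as the slide's own example: 0.1 is not exactly representable in binary, and every addition rounds again, so ten additions already miss 1.0. The explicit loop is deliberate, because the built-in `sum` may use compensated summation in recent Python versions.

```python
import math

def naive_sum(values):
    """Left-to-right summation; each addition rounds to the nearest double."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1] * 10
naive = naive_sum(values)
accurate = math.fsum(values)   # tracks the lost low-order bits

print(naive, accurate)
```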


Introduced/propagated error

Propagated error can be large even if the introduced error is small

A notation in the analysis

In multiplication

In division

- The relative error propagates slowly
- The absolute error can grow rapidly when multiplying by a large number or dividing by a small number
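The formulas for these slides did not survive extraction. A first-order sketch of the argument follows. It assumes perturbed operands with relative errors ε_x and ε_y, and the hat notation is introduced here for illustration.

```latex
\begin{aligned}
\hat{x} &= x(1+\epsilon_x), \qquad \hat{y} = y(1+\epsilon_y) \\
\hat{x}\hat{y} &= xy\,(1+\epsilon_x)(1+\epsilon_y) \approx xy\,(1+\epsilon_x+\epsilon_y) \\
\frac{\hat{x}}{\hat{y}} &= \frac{x}{y}\cdot\frac{1+\epsilon_x}{1+\epsilon_y} \approx \frac{x}{y}\,(1+\epsilon_x-\epsilon_y) \\
|\hat{x}\hat{y} - xy| &\approx |xy|\,|\epsilon_x+\epsilon_y|
\end{aligned}
```

The relative errors only add or subtract, while the absolute error is scaled by |xy| (or |x/y| for division), which is large when multiplying by a large number or dividing by a small one.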

Propagated error in addition and subtraction

In addition and subtraction

Absolute vs. relative error
- Multiplication and division may result in large absolute error
- Addition and subtraction may result in large relative error
  - more crucial
  - cancellation error
    - two nearly equal numbers are subtracted
- Algorithms should avoid the subtraction of nearly equal numbers
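Cancellation can be seen in a simulated 4-digit decimal system. The operands 0.54617 and 0.54601 below are hypothetical values chosen for illustration, and rounding is assumed.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

x, y = Decimal("0.54617"), Decimal("0.54601")   # hypothetical, nearly equal
fx, fy = CTX.create_decimal(x), CTX.create_decimal(y)

computed = CTX.subtract(fx, fy)    # leading digits cancel, one digit survives
exact = x - y

input_rel_err = abs(fx - x) / x
result_rel_err = abs(computed - exact) / exact

print(computed, exact, float(input_rel_err), float(result_rel_err))
```

Each operand carries a relative error below 10^-4, yet the difference is off by 25 percent because the subtraction leaves only the uncertain trailing digits.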

Recall that roundoff/data error should be prevented
- it is not due to the formulation of a numerical method
- it is caused by the data representation (in the computer)

To prevent it, we need to know the floating point system

Bug 1

± be careful

In action

Analysis
- The larger root
  - 239.4 (actual root: 239.4246996)
  - is the floating point equivalent of the actual root
- The smaller root
  - 0.15 (actual root: 0.1253003555)
  - nearly 20% relative error
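The computed roots can be reproduced in a simulated 4-digit decimal system. The equation itself is not in the surviving slide text; the coefficients 0.2, -47.91 and 6 used below are an assumption inferred from the two actual roots quoted above (their sum is 239.55 and their product is 30).

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

# Assumed equation: 0.2 x^2 - 47.91 x + 6 = 0 (inferred, not from the slides).
a, b, c = Decimal("0.2"), Decimal("-47.91"), Decimal("6")

b_squared = CTX.multiply(b, b)                             # 2295.3681 -> 2295
four_ac = CTX.multiply(CTX.multiply(Decimal(4), a), c)     # 4.8
disc = CTX.subtract(b_squared, four_ac)                    # 2290.2 -> 2290
root = CTX.sqrt(disc)                                      # 47.85
two_a = CTX.multiply(Decimal(2), a)                        # 0.4

larger = CTX.divide(CTX.add(CTX.minus(b), root), two_a)        # no cancellation
smaller = CTX.divide(CTX.subtract(CTX.minus(b), root), two_a)  # 47.91 - 47.85 cancels

rel_err_smaller = abs(smaller - Decimal("0.1253003555")) / Decimal("0.1253003555")
print(larger, smaller, float(rel_err_smaller))
```

The subtraction 47.91 - 47.85 leaves a single significant digit, which is where the error in the smaller root comes from.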

Any Questions?

An intuitive question
- How to solve the quadratic formula problem?
- Hint: reformulate the calculation of the smaller root
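The slides with the answer are missing from the extracted text. One common reformulation, offered as a sketch, multiplies numerator and denominator by the conjugate so that the smaller root becomes 2c / (-b + sqrt(b^2 - 4ac)) when b < 0, which involves no subtraction of nearly equal numbers. The coefficients 0.2, -47.91 and 6 are an assumption inferred from the roots quoted earlier.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

# Assumed equation: 0.2 x^2 - 47.91 x + 6 = 0 (inferred, not from the slides).
a, b, c = Decimal("0.2"), Decimal("-47.91"), Decimal("6")

four_ac = CTX.multiply(CTX.multiply(Decimal(4), a), c)
disc = CTX.subtract(CTX.multiply(b, b), four_ac)
root = CTX.sqrt(disc)

denominator = CTX.add(CTX.minus(b), root)                  # 47.91 + 47.85, an addition
smaller = CTX.divide(CTX.multiply(Decimal(2), c), denominator)

print(smaller)   # compare with the actual root 0.1253003555
```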


Bug 2

Multiplier: -1/6
The world is cruel :p You got -1.667

After one pass of Gaussian elimination

The next multiplier: fl(-3.333/0.0001) = -33330

Cascade of effects
- Cancellation error led to a small pivot element
- A small pivot led to a large multiplier
- A large multiplier then led to loss of significant digits

4.167 disappeared
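The system used on the slides is missing from the extracted text, so the sketch below swaps in a hypothetical 2 × 2 system with a tiny pivot to show the same effect in a simulated 4-digit decimal system: the large multiplier swamps the other entries, which then disappear on rounding.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

CTX = Context(prec=4, rounding=ROUND_HALF_UP)   # assumed: 4 digits, rounding

# Hypothetical system (not the one on the slides):
#   0.00001 x + y = 1
#         x   + y = 2
a11, a12, b1 = Decimal("0.00001"), Decimal("1"), Decimal("1")
a21, a22, b2 = Decimal("1"), Decimal("1"), Decimal("2")

m = CTX.divide(a21, a11)                              # small pivot, multiplier 1E+5
a22_new = CTX.subtract(a22, CTX.multiply(m, a12))     # 1 - 100000: the 1 disappears
b2_new = CTX.subtract(b2, CTX.multiply(m, b1))        # 2 - 100000: the 2 disappears

y = CTX.divide(b2_new, a22_new)
x = CTX.divide(CTX.subtract(b1, CTX.multiply(a12, y)), a11)

print(m, a22_new, b2_new, x, y)
```

The exact solution is x ≈ 1.00001 and y ≈ 0.99999, so the computed y is fine while the computed x has lost every significant digit.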

Bug 3

Values of a function
- Even evaluating a function can prove difficult
- f(x) = e^x - cos x - x, where x → 0
  - e^x → 1
  - cos x → 1
  - so e^x - cos x subtracts two nearly equal numbers

How to reformulate? When seeing cos x, sin x, and e^x, think of Taylor series
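The expansion itself is missing from the extracted text. For f(x) = e^x - cos x - x, the standard Maclaurin series give the following sketch, in which the constant, linear and x^4 terms cancel exactly:

```latex
\begin{aligned}
e^x &= 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + \cdots \\
\cos x &= 1 - \frac{x^2}{2} + \frac{x^4}{24} - \cdots \\
f(x) = e^x - \cos x - x &= x^2 + \frac{x^3}{6} + \frac{x^5}{120} + \cdots
\end{aligned}
```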

Reformulating with Taylor series
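The effect also shows up in IEEE double precision, not only in a 4-digit system. The sketch below compares the direct formula with the truncated series x^2 + x^3/6 + x^5/120; the test point x = 1e-9 and the function names are illustrative choices.

```python
import math

def f_naive(x):
    """Direct evaluation: subtracts nearly equal numbers when x is near 0."""
    return math.exp(x) - math.cos(x) - x

def f_taylor(x):
    """Leading Taylor terms x^2 + x^3/6 + x^5/120, intended for small |x|."""
    return x * x * (1 + x / 6 + x ** 3 / 120)

x = 1e-9
print(f_naive(x), f_taylor(x))   # the true value is very close to 1e-18
```

At this x the direct formula returns rounding noise on the order of 1e-16, while the series keeps full relative accuracy. The truncated series is only a substitute near zero; for larger |x| the direct formula is the appropriate one.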


More precision
- These bugs are under F(10, 4, -, -)
- Just add more precision
  - FORTRAN: REAL*8 → REAL*16
  - C/C++: float → double
- Does not always work
  - an example introduced by Rump and reconsidered by Aberth, Precise Numerical Methods Using C++, 1998

Need at least 37 digits
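The slide showing the expression is missing from the extracted text. The sketch below assumes it is the commonly cited form of Rump's example, 333.75 b^6 + a^2 (11 a^2 b^2 - b^6 - 121 b^4 - 2) + 5.5 b^8 + a/(2b) at a = 77617, b = 33096, and evaluates it once with exact rationals and once in double precision. The function and parameter names are illustrative.

```python
from fractions import Fraction

def rump(a, b, c6, c8):
    """Assumed form of Rump's expression; c6 and c8 stand for 333.75 and 5.5."""
    return (c6 * b ** 6
            + a ** 2 * (11 * a ** 2 * b ** 2 - b ** 6 - 121 * b ** 4 - 2)
            + c8 * b ** 8
            + a / (2 * b))

# Exact rational arithmetic: no roundoff at all.
exact = rump(Fraction(77617), Fraction(33096), Fraction(1335, 4), Fraction(11, 2))

# The same expression in IEEE double precision.
double = rump(77617.0, 33096.0, 333.75, 5.5)

print(float(exact), double)
```

The intermediate terms are of order 10^36 and cancel almost completely, so roughly 16 digits of double precision are not enough to keep any correct digit of the result.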

Any Questions?

Good, that means we would like to have exercises

Exercise
Due 2010/3/25, 9:00 am. Email to darby@ee.ncku.edu.tw or hand in during class. You may arbitrarily pick one problem among the first three, which means this exercise contains only five problems.
