Systems Architecture I CS 281 001 Lecture 13

Systems Architecture I (CS 281-001)
Lecture 13: Floating Point Arithmetic*
Jeremy R. Johnson
Wed. May 17, 2000

*This lecture was derived from material in the text (sec. 4.8). All figures from Computer Organization and Design: The Hardware/Software Interface, Second Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).

Introduction
• Objective: To provide hardware support for floating point arithmetic. To understand how to represent floating point numbers in the computer and how to perform arithmetic with them. Also to learn how to use floating point arithmetic in MIPS.
• Approximate arithmetic
  – Finite Range
  – Limited Precision
• Topics
  – IEEE format for single and double precision floating point numbers
  – Floating point addition and multiplication
  – Support for floating point computation in MIPS
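The two limits named under "Approximate arithmetic" can be observed directly. The following is a small sketch in Python, whose `float` is an IEEE double on common platforms; the variable names are illustrative and not from the lecture.

```python
import math
import sys

# Limited precision: 0.1, 0.2 and 0.3 have no exact binary representation,
# so the rounded sum of the first two differs from the rounded 0.3.
sum_is_exact = (0.1 + 0.2 == 0.3)

# Finite range: doubling the largest finite double leaves the range
# and the result becomes infinity.
largest = sys.float_info.max
overflowed = largest * 2.0

print(sum_is_exact)            # False
print(math.isinf(overflowed))  # True
```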

Distribution of Floating Point Numbers
• 3 bit mantissa
• exponent {-1, 0, 1}
[Figure: number line marked 0, 1, 2, 3 showing the representable values]
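To see how such a small format is distributed, the values can simply be enumerated. This sketch assumes the 3-bit mantissa is normalized as 1.b1b2 (a leading 1 followed by two fraction bits), which the slide does not state; `toy_floats` is a hypothetical helper name.

```python
from fractions import Fraction

def toy_floats(mantissa_bits=3, exponents=(-1, 0, 1)):
    """List the positive values of a toy format (assumed form 1.b1b2 * 2^e)."""
    frac_bits = mantissa_bits - 1          # bits after the leading 1
    values = []
    for e in exponents:
        for f in range(2 ** frac_bits):
            significand = 1 + Fraction(f, 2 ** frac_bits)
            values.append(significand * Fraction(2) ** e)
    return sorted(values)

values = toy_floats()
gaps = [b - a for a, b in zip(values, values[1:])]
print([str(v) for v in values])
print([str(g) for g in gaps])
```

Under this assumption there are 12 positive values between 1/2 and 7/2, and the spacing doubles each time the exponent increases: the numbers are dense near zero and sparse farther out.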

Representation of Floating Point Numbers
• IEEE 754 single precision
  – Bit 31: Sign
  – Bits 30 to 23: Biased exponent
  – Bits 22 to 0: Normalized Mantissa (implicit 24th bit)
  – Value: $(-1)^s \times F \times 2^{E-127}$
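The single precision layout can be checked by pulling the three fields out of a real bit pattern. A minimal sketch using Python's standard `struct` module follows; `decode_single` and `value_from_fields` are hypothetical helper names, and the reconstruction covers normalized numbers only (biased exponent 1 to 254).

```python
import struct

def decode_single(x):
    """Return (sign, biased exponent, fraction) of x as an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                  # bit 31
    biased_exp = (bits >> 23) & 0xFF   # bits 30 to 23
    fraction = bits & 0x7FFFFF         # bits 22 to 0
    return sign, biased_exp, fraction

def value_from_fields(sign, biased_exp, fraction):
    """Rebuild (-1)^s * F * 2^(E-127) for a normalized number."""
    significand = 1 + fraction / 2 ** 23   # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (biased_exp - 127)

fields = decode_single(-0.75)   # -0.75 = -1.5 * 2^-1
print(fields)
print(value_from_fields(*fields))
```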

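Representation of Floating Point Numbers
• IEEE 754 double precision (first 32-bit word shown; the remaining 32 mantissa bits occupy a second word)
  – Bit 31: Sign
  – Bits 30 to 20: Biased exponent
  – Bits 19 to 0: Normalized Mantissa, upper 20 bits (implicit 53rd bit)
  – Value: $(-1)^s \times F \times 2^{E-1023}$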
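A double spans two 32-bit words, with the sign, the 11-bit exponent and the top 20 mantissa bits in the first word. The sketch below splits a double that way using the standard `struct` module; `decode_double` and `double_from_fields` are hypothetical names, and only normalized numbers are handled.

```python
import struct

def decode_double(x):
    """Return (sign, biased exponent, 52-bit fraction) of an IEEE 754 double."""
    hi, lo = struct.unpack(">II", struct.pack(">d", x))  # two 32-bit words
    sign = hi >> 31                      # bit 31 of the first word
    biased_exp = (hi >> 20) & 0x7FF      # bits 30 to 20 of the first word
    fraction = ((hi & 0xFFFFF) << 32) | lo   # 20 bits + 32 bits = 52 bits
    return sign, biased_exp, fraction

def double_from_fields(sign, biased_exp, fraction):
    """Rebuild (-1)^s * F * 2^(E-1023) for a normalized number."""
    significand = 1 + fraction / 2 ** 52     # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (biased_exp - 1023)

fields = decode_double(-0.75)
print(fields)
print(double_from_fields(*fields))
```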

Floating Point Arithmetic
• fl(x) = nearest floating point number to x
• Relative error (precision = s digits)
  – $|x - \mathrm{fl}(x)| / |x| \le u = \tfrac{1}{2}\beta^{1-s}$; for $\beta = 2$, $u = 2^{-s}$
• Arithmetic, for $|\varepsilon| < u$
  – $x \oplus y = \mathrm{fl}(x + y) = (x + y)(1 + \varepsilon)$
  – $x \otimes y = \mathrm{fl}(x \times y) = (x \times y)(1 + \varepsilon)$
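These bounds can be checked numerically with exact rational arithmetic. The sketch below assumes Python floats are IEEE doubles rounded to nearest, so s = 53 and u = 2^-53, and that no overflow or underflow occurs; `relative_error` is a hypothetical helper.

```python
from fractions import Fraction

S = 53                     # precision of a double, counting the implicit bit
U = Fraction(1, 2 ** S)    # unit roundoff u = 2^-s for beta = 2

def relative_error(exact, computed):
    """Exact relative error |exact - computed| / |exact| as a Fraction."""
    return abs(exact - Fraction(computed)) / abs(exact)

# Representation error: fl(1/10) versus the true 1/10.
err_repr = relative_error(Fraction(1, 10), 0.1)

# Arithmetic error: machine sum and product versus the exact results
# computed from the two stored operands.
x, y = 0.1, 0.2
err_add = relative_error(Fraction(x) + Fraction(y), x + y)
err_mul = relative_error(Fraction(x) * Fraction(y), x * y)

print(err_repr < U, err_add < U, err_mul < U)
```

`Fraction(x)` converts a float to its exact rational value, so the comparison itself introduces no rounding.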

Floating Point Addition Algorithm
[Figure from the text]
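The usual textbook steps for addition are: align the exponents, add the significands, normalize, then round. The following is a minimal sketch of those steps, not a transcription of the figure. It assumes positive, normalized operands stored as hypothetical (significand, exponent) pairs meaning significand × 2^exponent, and it aligns with an exact left shift instead of the right shift with guard bits that hardware uses. All function names are illustrative.

```python
import math

def normalize_and_round(sig, exp, p):
    """Fit sig * 2**exp into a p-bit significand, rounding to nearest even."""
    extra = sig.bit_length() - p          # bits beyond the p we can keep
    if extra <= 0:
        return sig << -extra, exp + extra # exact: pad with zeros on the right
    kept = sig >> extra
    rem = sig & ((1 << extra) - 1)        # discarded low-order bits
    half = 1 << (extra - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1                         # round up (ties go to even)
    exp += extra
    if kept.bit_length() > p:             # rounding overflowed: renormalize
        kept >>= 1
        exp += 1
    return kept, exp

def fp_add(a, b, p=53):
    (sig_a, exp_a), (sig_b, exp_b) = a, b
    if exp_a < exp_b:                     # make a the larger-exponent operand
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_a <<= exp_a - exp_b               # step 1: align exponents
    total = sig_a + sig_b                 # step 2: add significands
    return normalize_and_round(total, exp_b, p)  # steps 3 and 4

def to_pair(x):
    """Split a positive double into a 53-bit integer significand and exponent."""
    m, e = math.frexp(x)                  # x = m * 2**e with 0.5 <= m < 1
    return int(m * 2 ** 53), e - 53

def from_pair(pair):
    return math.ldexp(pair[0], pair[1])

result = from_pair(fp_add(to_pair(0.1), to_pair(0.2)))
print(result == 0.1 + 0.2)
```

With p = 53 the result can be compared against the machine's own double addition, which gives a quick check on the rounding logic.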

Floating Point Addition Hardware
[Figure from the text]

Floating Point Multiplication Algorithm
[Figure from the text]
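Multiplication typically proceeds by adding the exponents, multiplying the significands, normalizing, rounding, and setting the sign. The sketch below follows those steps under stated assumptions and is not a transcription of the figure: operands are nonzero, normalized, and held as hypothetical (sign, significand, exponent) triples with unbiased exponents, so the bias correction needed with stored IEEE exponents does not appear. All names are illustrative.

```python
import math

def round_sig(sig, exp, p):
    """Fit sig * 2**exp into a p-bit significand, rounding to nearest even."""
    extra = sig.bit_length() - p
    if extra <= 0:
        return sig << -extra, exp + extra
    kept = sig >> extra
    rem = sig & ((1 << extra) - 1)        # discarded low-order bits
    half = 1 << (extra - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1
    exp += extra
    if kept.bit_length() > p:             # rounding overflowed: renormalize
        kept >>= 1
        exp += 1
    return kept, exp

def fp_mul(a, b, p=53):
    sign_a, sig_a, exp_a = a
    sign_b, sig_b, exp_b = b
    exp = exp_a + exp_b                   # step 1: add exponents
    sig = sig_a * sig_b                   # step 2: multiply significands
    sig, exp = round_sig(sig, exp, p)     # steps 3 and 4: normalize, round
    sign = sign_a ^ sign_b                # step 5: signs differ -> negative
    return sign, sig, exp

def to_triple(x):
    """Split a nonzero double into sign, 53-bit significand and exponent."""
    m, e = math.frexp(abs(x))             # |x| = m * 2**e with 0.5 <= m < 1
    return (1 if x < 0 else 0), int(m * 2 ** 53), e - 53

def from_triple(t):
    sign, sig, exp = t
    return (-1) ** sign * math.ldexp(sig, exp)

product = from_triple(fp_mul(to_triple(-0.1), to_triple(0.3)))
print(product == -0.1 * 0.3)
```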