
FIXED POINT VERSUS FLOATING POINT

Outline
Introduction. Fixed point processors. Q Format. Floating point processors. Overflow and scaling. Scaling. Comparison. Conclusion.
Digital Signal Processing - 1

Introduction
Digital signal processing can be divided into two categories: fixed point and floating point. These terms refer to the format used to store and manipulate numbers within the devices.

Fixed point processors
Represent each number with a minimum of 16 bits. Numbers are represented and manipulated in integer format. There are four common ways that these 2^16 = 65,536 possible bit patterns can represent a number:
◦ unsigned integer
◦ signed integer
◦ unsigned fraction notation
◦ signed fraction format
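As a rough illustration of how one 16-bit pattern can be read in several ways, the sketch below decodes the same bits under four interpretations. It assumes the fourth interpretation is the two's complement signed integer, and it assumes the fractions are obtained by dividing by 65536 and 32768; the function name `interpretations` is hypothetical.

```python
def interpretations(pattern: int) -> dict:
    """Decode one 16-bit pattern under four common conventions (assumed divisors)."""
    pattern &= 0xFFFF  # keep only the low 16 bits
    # Two's complement: patterns with the top bit set are negative.
    signed = pattern - 0x10000 if pattern & 0x8000 else pattern
    return {
        "unsigned integer": pattern,           # 0 .. 65535
        "signed integer": signed,              # -32768 .. 32767
        "unsigned fraction": pattern / 65536,  # 0 .. just under 1
        "signed fraction": signed / 32768,     # -1 .. just under 1
    }


views = interpretations(0xC000)
for name, value in views.items():
    print(f"{name}: {value}")
```

The bits never change; only the convention used to read them does, which is why the programmer has to keep track of the format in use.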

Fixed point processor usage
There is simply no need for floating-point processing in mobile TVs. Floating-point computations would indeed produce a more precise discrete cosine transform (DCT). However, the DCTs in video codecs are designed to be performed on a fixed-point processor and are bit-exact.

Q Format (Fractional Representation)
In a 16-bit system, it is not possible to represent numbers larger than 32767 or smaller than -32768. To cope with this limitation, numbers are often normalized to the range -1 to 1. This is achieved by moving the implied binary point.
[Figure: Number representations]
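Moving the implied binary point can be sketched as a multiplication by 2^15 when converting a fraction to a 16-bit word, and a division by 2^15 when converting back. The helper names `float_to_q15` and `q15_to_float` are hypothetical, and clamping out-of-range inputs is an assumption made for this sketch rather than something the slides specify.

```python
Q15_SCALE = 1 << 15  # 32768: the binary point sits 15 places from the right


def float_to_q15(x: float) -> int:
    """Convert a value in [-1, 1) to a 16-bit Q15 integer (clamped, by assumption)."""
    n = int(round(x * Q15_SCALE))
    return max(-32768, min(32767, n))


def q15_to_float(n: int) -> float:
    """Convert a Q15 integer back to its fractional value."""
    return n / Q15_SCALE


half = float_to_q15(0.5)
print(half, q15_to_float(half))
print(float_to_q15(-1.0), float_to_q15(1.0))
```

Note that +1 itself has no exact Q15 code: the largest representable value is 32767/32768.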

Examples
Example (1)
Consider two Q15 format numbers that are multiplied. What is the result, and how will it be stored in 16-bit memory?
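The worked answer from the original slide is not preserved in this text, so the following is only a sketch of the usual approach: the product of two Q15 numbers has 30 fractional bits (Q30), and shifting it right by 15 bits returns it to Q15 so it fits a 16-bit word. The function name `q15_mul` is hypothetical, and the corner case -1 × -1 is deliberately not handled.

```python
def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 integers and return a Q15 result (truncating)."""
    product_q30 = a * b       # full-precision product with 30 fractional bits
    return product_q30 >> 15  # drop the 15 lowest fractional bits to get Q15


x = 16384  # 0.5 in Q15
y = 8192   # 0.25 in Q15
p = q15_mul(x, y)
print(p, p / 32768)
```

Here 0.5 × 0.25 yields the Q15 code 4096, which reads back as 0.125.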

Example (2)
In a 4-bit two's complement system, any multiplication or addition resulting in a number larger than 7 or smaller than -8 will cause overflow. When 6 is multiplied by 2, we get 12. The result will be wrapped around the circle to 1100, which is -4. How can this problem be solved? The third example answers this question.
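The wraparound in this example can be reproduced with a short sketch, assuming a 4-bit two's complement word; the helper name `wrap_4bit` is hypothetical.

```python
def wrap_4bit(value: int) -> int:
    """Reduce an integer to a 4-bit two's complement value (-8 .. 7)."""
    value &= 0xF  # keep only the low 4 bits
    return value - 16 if value & 0x8 else value


product = 6 * 2                    # 12, which exceeds the maximum of 7
bits = format(product & 0xF, "04b")
result = wrap_4bit(product)
print(bits, result)
```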

Example (3)

But...
It should be realized that some precision is lost as a result of discarding the smaller fractional bits. To address this problem, the scaling approach will be used (discussed later).
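The precision loss can be made concrete with a small sketch. The content of Example (3) is not preserved in this text, so the operands here are an assumption: the bit patterns 0110 and 0010 from Example (2) are reinterpreted as 4-bit Q3 fractions (0.75 and 0.25). The function name `q3_mul` is hypothetical.

```python
def q3_mul(a: int, b: int) -> int:
    """Multiply two 4-bit Q3 integers and truncate the Q6 product back to Q3."""
    return (a * b) >> 3  # discard the three smallest fractional bits


a = 0b0110  # 6/8 = 0.75 in Q3 (assumed operand)
b = 0b0010  # 2/8 = 0.25 in Q3 (assumed operand)

stored = q3_mul(a, b)
exact = (a / 8) * (b / 8)  # true product of the two fractions
approx = stored / 8        # value actually kept in 4 bits
print(exact, approx)
```

The product no longer overflows, but the stored value 0.125 differs from the exact value 0.1875 because the low-order bits were dropped.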

Floating point number representation
◦ Uses a minimum of 32 bits to store each value.
◦ The represented numbers are not uniformly spaced.
◦ Each number is composed of a mantissa and an exponent.
◦ A floating-point processor can also support integer representation and calculations.
There are two floating-point data representations on the C67x processor: single precision (SP) and double precision (DP).
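The non-uniform spacing can be observed directly. The sketch below assumes IEEE 754 single precision as packed by Python's `struct` module and measures the gap between a value and the next representable one; the function name `float32_gap` is hypothetical, and it is only meant for positive finite inputs that are exactly representable in 32 bits.

```python
import struct


def float32_gap(x: float) -> float:
    """Distance from x to the next larger 32-bit float (x positive and finite)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Adding 1 to the bit pattern steps to the next representable value.
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - x


gap_near_1 = float32_gap(1.0)
gap_near_1024 = float32_gap(1024.0)
print(gap_near_1, gap_near_1024)
```

The gap near 1024 is 1024 times larger than the gap near 1, whereas a fixed-point format has the same step size everywhere.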

Single precision and double precision
[Figure: C67x floating-point data representation]
[Figure: C67x double precision floating-point representation]
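Since the figures are not preserved here, the following sketch may help show the field layout. It assumes the IEEE 754 layouts: 1 sign bit, 8 exponent bits, and 23 mantissa bits for single precision, and 1, 11, and 52 bits for double precision. The function names `sp_fields` and `dp_fields` are hypothetical.

```python
import struct


def sp_fields(x: float) -> tuple:
    """Split a single precision value into (sign, exponent, mantissa) fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def dp_fields(x: float) -> tuple:
    """Split a double precision value into (sign, exponent, mantissa) fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


sp = sp_fields(-1.5)
dp = dp_fields(-1.5)
print(sp, dp)
```

For -1.5 the sign bit is 1, the stored exponent equals the bias (127 or 1023), and only the top mantissa bit is set, representing the fraction 0.5.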

Floating point processors
All steps needed to perform floating-point arithmetic are done by the floating-point hardware. It is inefficient to perform floating-point arithmetic on fixed-point processors, since all the operations involved must be done in software.

Floating point processor usage
◦ In military radar, the floating-point processor is frequently used because its performance is essential.
◦ Floating-point processing is well suited to large FFTs, so FIR filters can be implemented in the frequency domain.
◦ It is appropriate in systems where gain coefficients change with time or where coefficients have large dynamic ranges.

Overflow and scaling
When multiplying two Q15 numbers, which are in the range -1 to 1, the product will be in the same range. However, when two Q15 numbers are added, the sum may fall outside this range, leading to an overflow. Overflows can cause major problems by generating erroneous results. The simplest correction method for overflow is scaling.
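A Q15 addition that overflows can be sketched as follows, assuming the 16-bit result simply wraps around (no saturation); the helper name `wrap16` is hypothetical.

```python
def wrap16(n: int) -> int:
    """Reduce an integer to a 16-bit two's complement value."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n


a = 24576  # 0.75 in Q15
b = 16384  # 0.5 in Q15

total = wrap16(a + b)  # true sum is 1.25, which is outside the Q15 range
print(total, total / 32768)
```

Instead of 1.25, the wrapped result reads back as -0.75, which is the kind of erroneous output the slide warns about.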

Scaling
The idea of scaling is to scale down the system input before performing any processing, and then to scale up the resulting output to the original size. Scaling can be applied to most filtering and transform operations. An easy way to achieve scaling is by shifting. Since a right shift by 1 is equivalent to a division by 2, we can scale the input repeatedly by 0.5 until all overflows disappear. The output can then be rescaled by the total scaling amount.
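The shifting approach can be sketched for a single Q15 addition. The names `wrap16` and `scaled_add` are hypothetical, and the final rescaling is done in floating point here on the assumption that the restored value is consumed somewhere with a wider range than Q15.

```python
def wrap16(n: int) -> int:
    """Reduce an integer to a 16-bit two's complement value."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n


def scaled_add(a: int, b: int, shift: int = 1) -> int:
    """Scale both Q15 inputs down by 2**shift, then add in 16 bits."""
    return wrap16((a >> shift) + (b >> shift))


a = 24576  # 0.75 in Q15
b = 16384  # 0.5 in Q15

scaled = scaled_add(a, b)         # inputs halved, so the sum stays in range
restored = (scaled / 32768) * 2   # undo the total scaling of 2**1
print(scaled, restored)
```

With one right shift the addition no longer overflows, and multiplying by the total scaling factor recovers the expected 1.25. The price is one bit of input precision per shift.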

Comparison

Conclusion
DSP processors are designed as fixed point and floating point.
Fixed-point
◦ Partition a binary word into integer and fractional parts
◦ Radix point is in a fixed position
Floating-point
◦ Large dynamic range
◦ Composed of a mantissa and exponent
Scaling solves the problem of overflow.
Comparison between fixed point and floating point