ENCM 515: Comparison of Integer and Floating Point DSP Processors
M. Smith, Electrical and Computer Engineering, University of Calgary, Canada
smithmr@ucalgary.ca
11/27/2020

Requirements for “perfect” DSP architecture
- Fast instruction cycle -- not clock speed
- Fast hardware multiplier
- Floating point for easier design -- avoids scaling and overflow
- High precision
- Wide buses for registers, memory, and processing units
- Fast loop operation

11/27/2020 ENCM 515 -- Comparing Floating Point and Integer Processors. Copyright smithmr@ucalgary.ca

“Perfect” DSP architecture -- II
- Several data buses available to reduce memory bus conflict/transfer overhead
- Harvard architecture and/or instruction caches to avoid instruction and data-fetch clashes
- Duplicate resources for parallel computation
- Dedicated address calculation hardware
- Extensive temporary registers to avoid unnecessary fetches of continually used data
- Architecture allows easy parallel operation in multiprocessor systems -- NEW
- Cycle time adjustable by instruction -- UNCOMMON
- Duplicate resources for parallel computation of real and imaginary components -- UNCOMMON -- SIMD?

Integer DSP processors remain popular
- Around a long time, so much code already developed
- Many designs available
- Some complications
  - Overflow with addition
  - Multiplication operations -- 16 bit x 16 bit means 32 bit result where only certain portions are useful
  - Overcome with fractional format
  - Overcome with special architecture features

Consider 12 bit A/D
- Double-sided -- -15 V to nearly +15 V
  - 0x800 -- -15 V -- negative full scale
  - 0xA00 -- -11.25 V -- three quarter negative scale
  - 0xC00 -- -7.5 V -- half negative full scale
  - 0xE00 -- -3.75 V -- quarter negative full scale
  - 0x000 -- 0 V
  - 0x200 -- 3.75 V -- quarter positive full scale
  - 0x400 -- 7.5 V -- half positive full scale
- Connect so that negative sign (bit 11) on A/D matches negative sign (bit 31) on 21061

Consider 12 bit A/D connected to 32 bit 21061
- Double-sided -- -15 V to nearly +15 V
  - 0x80000000 -- -15 V -- negative full scale
  - 0xA0000000 -- -11.25 V -- three quarters negative
  - 0xC0000000 -- -7.5 V -- half negative full scale
  - 0xE0000000 -- -3.75 V -- quarter negative
  - 0x00000000 -- 0 V
  - 0x20000000 -- 3.75 V -- quarter positive
  - 0x40000000 -- 7.5 V -- half positive full scale
- Connected so that negative sign (bit 11) on A/D matches negative sign (bit 31) on 21061
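
The sign-bit alignment described on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names `adc12_to_word32` and `word32_to_volts` are made up here, and the 15 V full scale is taken from the slide.

```python
def adc12_to_word32(code12: int) -> int:
    """Left-justify a 12-bit two's-complement A/D code in a 32-bit word,
    so that A/D bit 11 (sign) lands on bit 31 of the word."""
    return (code12 & 0xFFF) << 20


def word32_to_volts(word32: int, full_scale: float = 15.0) -> float:
    """Read a 32-bit word as a signed fraction of full scale."""
    signed = word32 - (1 << 32) if word32 & 0x80000000 else word32
    return full_scale * signed / (1 << 31)


for code in (0x800, 0xA00, 0xC00, 0xE00, 0x000, 0x200, 0x400):
    word = adc12_to_word32(code)
    print(f"0x{code:03X} -> 0x{word:08X} -> {word32_to_volts(word):+.2f} V")
```

Because the code is left-justified, the same "fraction of full scale" reading applies to the 12-bit value and to the 32-bit word.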

Examples of integer problems
- SIMPLE SMOOTHING
  - Let’s sum up a couple of values around -7.5 V and calculate an average
  - 0xA1000000 + 0xA1000002 + ... -- Overflow
- VERY SIMPLE FIR FILTER
  - Result = V1 * H1 + V2 * H2
  - Let V1 = 0xA1000000 (32 bits)
  - Let H1 = 0x8 (multiplying by 2^3 adds 3 bits)
  - Need 35 bits to keep result
  - What do the 35 bits mean? -- need to scale
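
Both problems on this slide can be reproduced with a minimal Python sketch. Python integers do not overflow, so the hypothetical helper `wrap32` is used to imitate a 32-bit register; the two sample values are the ones from the slide.

```python
def wrap32(x: int) -> int:
    """Reduce x to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


v1 = wrap32(0xA1000000)          # a large negative sample
v2 = wrap32(0xA1000002)          # a neighbouring sample

true_sum = v1 + v2               # what the sum should be
wrapped_sum = wrap32(true_sum)   # what a 32-bit register keeps
print(true_sum, hex(wrapped_sum & 0xFFFFFFFF))

# FIR term V1 * H1 with H1 = 0x8: count the bits needed for the signed result
product = v1 * 0x8
signed_bits = product.bit_length() + 1   # magnitude bits plus a sign bit
print(signed_bits)
```

The wrapped sum comes out positive even though both inputs are negative, which is the overflow the slide warns about.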

Fractional values -- automatic handling of multiplication shifts
- Normally 0xF0000000 * 0xF0000000 would result in a 64 bit value which would then need scaling
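
A rough sketch of what the fractional multiply does for you, assuming the usual 1.31 signed fractional interpretation (sign bit followed by 31 fraction bits). The function names are illustrative only and the result is truncated rather than rounded.

```python
def to_signed32(word: int) -> int:
    """Interpret a 32-bit pattern as a signed integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word


def q31_to_float(word: int) -> float:
    """Value of a 32-bit pattern read as a 1.31 signed fraction."""
    return to_signed32(word) / (1 << 31)


def q31_mul(a_word: int, b_word: int) -> int:
    """Fractional multiply: form the wide product, then apply the
    shift that puts the result back into 1.31 format (truncating)."""
    full = to_signed32(a_word) * to_signed32(b_word)   # 2.62 format
    return (full >> 31) & 0xFFFFFFFF


r = q31_mul(0xF0000000, 0xF0000000)
print(hex(r), q31_to_float(0xF0000000), q31_to_float(r))
```

Here 0xF0000000 reads as -0.125, and the product comes back as 0x02000000, which reads as 0.015625, with no manual rescaling step.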

Fractional values -- Not all problems removed -- Overflow
- Understand “fractional” as “fractional full scale”
- Okay when multiply (R7) but look at R6 = -1 + -1

This is the standard overflow
- -1 = 0x80000000 (32 bits)
- -1 + -1 = 0x80000000 + 0x80000000 = 0x100000000 (33 bits)
- Can expect to overflow in the middle of integer FIR filter, although final result should be in range -1.0 to +1.0 if filter gain is less than 1.
- Must handle intermediate results overflowing
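
The -1 + -1 case can be checked with a small sketch. `add_wrap` imitates a plain 32-bit register. `add_sat` shows saturation, which is one common way of handling the overflow; it is included here as an assumption and is not something the slide prescribes.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def add_wrap(a: int, b: int) -> int:
    """32-bit add that silently drops the carry out of bit 31."""
    s = (a + b) & 0xFFFFFFFF
    return s - (1 << 32) if s & 0x80000000 else s


def add_sat(a: int, b: int) -> int:
    """32-bit add that clips to the representable range instead."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


minus_one = INT32_MIN                       # 0x80000000 read as -1.0
print(add_wrap(minus_one, minus_one))       # the 33rd bit is lost
print(add_sat(minus_one, minus_one))        # clipped at -1.0
```

With wrap-around the register ends up holding 0, so -1 + -1 reads as 0.0; with saturation it stays at -1.0, which is still wrong but at least has the right sign.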

MR registers -- 80 bits wide
- Layout: 79 | MR2 | 63 | MR1 | 31 | MR0 | 0
- MR1 -- acts just the same as an R register in fractional mode
- MR2 -- OVERFLOW -- looks after the problems of -1 * -1
- MR0 -- UNDERFLOW -- looks after problems of -1 / 65000
- Works till have to get values out of MR -- okay in FIR (important stuff in MR1)

Set the MR register to 0

Now subtract (-1 * -1) from MR
- MR2 -- extra sign bits?
- MR1 -- looks like R0

Subtract another (-1 * -1) (get -2 as 80 bits fractional)
- MR2 -- extra sign bits?
- MR1 -- looks like R6
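
The register walk-through on these slides can be imitated with a plain Python integer standing in for the 80-bit accumulator. This is a sketch under stated assumptions: MR2 is taken as bits 79..64, MR1 as bits 63..32, MR0 as bits 31..0, and the fractional product is aligned so that 1.0 sits at bit 63. The names `split_mr` and `frac_product` are made up for this example.

```python
MASK80 = (1 << 80) - 1


def split_mr(acc: int):
    """Split a signed accumulator value into (MR2, MR1, MR0) bit fields."""
    bits = acc & MASK80                        # 80-bit two's complement
    return (bits >> 64) & 0xFFFF, (bits >> 32) & 0xFFFFFFFF, bits & 0xFFFFFFFF


def frac_product(a: int, b: int) -> int:
    """Product of two signed 1.31 values, aligned so that 1.0 == 1 << 63."""
    return (a * b) << 1


MINUS_ONE = -(1 << 31)                         # 0x80000000 read as -1.0

acc = 0                                        # set MR to 0
acc -= frac_product(MINUS_ONE, MINUS_ONE)      # MR = 0 - (-1 * -1) = -1.0
step1 = split_mr(acc)
acc -= frac_product(MINUS_ONE, MINUS_ONE)      # MR = -2.0
step2 = split_mr(acc)

for mr2, mr1, mr0 in (step1, step2):
    print(f"MR2=0x{mr2:04X} MR1=0x{mr1:08X} MR0=0x{mr0:08X}")
```

After the first step MR1 holds 0x80000000 (the same pattern as -1 in an R register) and MR2 is all sign bits. After the second step MR1 holds 0, the same wrong answer a 32-bit register gives for -1 + -1, but MR2 still carries the information that the true value is -2.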

Need to look at a variety of processors
- TI 32010 -- Very early integer DSP
- TI 32C240 -- Later integer DSP
- Motorola 56000 -- Popular integer DSP
- AMD 29050 series (RISC with some DSP)
- Analog Devices SHARC 2106X
- TI C6701 -- VLIW
- Analog Devices TigerSHARC -- VLIW

TI 32010 Block Diagram

TI 32010 -- Details

More advanced TMS320C4X

TI 240 -- Block Diagram

TI C2XXX Block Diagram 1

TI C2XXX -- Block Diagram 2

Motorola 56000 Core

Motorola 56300 Integer Processor

Problems with Integer Implementations
- Use 8-bit examples for simplicity
  - 16 bit will have same problem
- 8 bit A/D for real time operations
- 8 bit processor
- Average 4 values
  - 1, 2, 3, 4 -- sum = 10, average = 2 (true value 2.5) -- answer will be correct
  - 127, 2, 3, 4 -- sum = 136 wraps to -120 in 8 bits, average = -30 -- answer incorrect
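
A minimal sketch of the 8-bit averaging problem, with the hypothetical helper `wrap8` standing in for an 8-bit signed register and `>> 2` standing in for an arithmetic shift right by 2.

```python
def wrap8(x: int) -> int:
    """Reduce x to a signed 8-bit two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def average4_8bit(values):
    """Sum four values in an 8-bit register, then divide by 4 (ASR 2)."""
    total = 0
    for v in values:
        total = wrap8(total + v)     # every add is limited to 8 bits
    return total >> 2


print(average4_8bit([1, 2, 3, 4]))      # small values: no overflow
print(average4_8bit([127, 2, 3, 4]))    # running sum passes +127 and wraps
```

The first call returns 2 (the true average 2.5, truncated). The second returns -30 instead of 34, because the running sum wrapped to -120.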

Solution -- Scaling
- Must prescale all incoming numbers by a value that guarantees that no overflow occurs. Do process, then rescale
- Add 2 numbers -- ASR 1 -- scale by 2
- Add 4 numbers -- ASR 2 -- scale by 4
- Average 4 values
  - 1, 2, 3, 4 -- scaled by 4 -- 0, 0, 0, 1
    - average = 1 (true value 2.5) -- the 2 bits shifted out are lost
  - 127, 2, 3, 4 -- scaled -- 31, 0, 0, 1
    - answer = 32 (true value 34) -- the 2 bits shifted out are lost
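
The prescaling idea can be sketched as follows. Each input is shifted right by 2 before it is added, so the sum of four values cannot leave the 8-bit range and the sum already is the average. The helper names are illustrative.

```python
def wrap8(x: int) -> int:
    """Reduce x to a signed 8-bit two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def prescaled_average4(values):
    """Prescale each input by 4 (ASR 2), then sum in an 8-bit register."""
    total = 0
    for v in values:
        total = wrap8(total + (v >> 2))   # the low 2 bits are discarded here
    return total


print(prescaled_average4([1, 2, 3, 4]))
print(prescaled_average4([127, 2, 3, 4]))
```

No overflow occurs now, but the results (1 and 32) differ from the true averages (2.5 and 34) because the two low bits of every input were thrown away before the sum.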

Guard Bits -- above and below
- Need to do 8 bit algorithm in 16 bit processor
- Use 4 guard bits below and 4 above
- Still need to prescale, but not by as much
- Example
  - Adding 4 numbers -- no prescale
  - Adding 16 numbers -- no prescale?
  - Adding 32 numbers -- prescale

Example of Guard Bits
- Store with guard bits
  - 127 -- 0x7F -- stored as 0x07F0
  - 2 -- 0x02 -- stored as 0x0020
  - 3 -- 0x03 -- stored as 0x0030
  - 4 -- 0x04 -- stored as 0x0040
  - Sum stored as 0x0880
  - Average stored as 0x0220 = 34
- FIR type sum may involve 128 terms
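
The guard-bit example can be reproduced with a short sketch that assumes a 16-bit register with 4 guard bits below the 8-bit data, as on the slide. The function names are made up for illustration.

```python
GUARD = 4   # guard bits below the data; 4 more remain above it in 16 bits


def wrap16(x: int) -> int:
    """Reduce x to a signed 16-bit two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def average4_guarded(values):
    """Return (sum word, average word, average as a number)."""
    total = 0
    for v in values:
        total = wrap16(total + (v << GUARD))   # store with guard bits below
    avg = total >> 2                           # divide by 4 (ASR 2)
    return total, avg, avg / (1 << GUARD)      # remove the guard-bit scaling


total, avg, value = average4_guarded([127, 2, 3, 4])
print(f"sum=0x{total:04X} avg=0x{avg:04X} value={value}")
print(average4_guarded([1, 2, 3, 4])[2])
```

The upper guard bits absorb the overflow (sum 0x0880, average 0x0220 = 34), and the lower guard bits keep the fraction, so the average of 1, 2, 3, 4 comes out as 2.5.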

Reference Source
- Following diagrams from Digital Signal Processing: Principles, Algorithms and Applications, 2nd edition, Proakis and Manolakis, Macmillan Publishing, 1992

Quantization Error
- Suppose you want to develop band-pass or low-pass IIR filter

Two pole IIR filter

Allowable pole-positions
- OK band-pass
- BAD low-pass

Coupled form IIR filter

Allowed pole positions
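
The diagrams for these slides are not reproduced here, but the effect they show can be sketched numerically. The sketch rests on several assumptions that are not in the slides: the direct form is taken as z^2 + a1*z + a2 with a1 = -2*r*cos(theta) and a2 = r^2, the coupled form stores r*cos(theta) and r*sin(theta) directly, and all stored coefficients are quantized to steps of 1/8 (4 bits including sign). Low-pass designs need poles at small angles near z = 1, so the smallest reachable pole angle is used as the figure of merit.

```python
import math


def direct_form_poles(bits: int):
    """Upper-half-plane poles reachable when a1 and a2 are quantized."""
    q = 1 << (bits - 1)                     # bits=4 gives steps of 1/8
    poles = []
    for k in range(1, q):                   # a2 = r^2 = k/q
        for m in range(-q + 1, q):          # real part = -a1/2 = m/q
            re, a2 = m / q, k / q
            if a2 > re * re:                # keep complex-conjugate pairs
                poles.append(complex(re, math.sqrt(a2 - re * re)))
    return poles


def coupled_form_poles(bits: int):
    """Upper-half-plane poles when real and imaginary parts are quantized."""
    q = 1 << (bits - 1)
    return [complex(m / q, n / q)
            for m in range(-q + 1, q)
            for n in range(1, q)
            if m * m + n * n < q * q]       # inside the unit circle


def smallest_angle_deg(poles):
    """Smallest pole angle, i.e. how close the grid gets to z = 1."""
    return min(math.degrees(math.atan2(p.imag, p.real)) for p in poles)


print(smallest_angle_deg(direct_form_poles(4)))
print(smallest_angle_deg(coupled_form_poles(4)))
```

Under these assumptions the direct form cannot place a complex pole pair below roughly 18 degrees, while the coupled form reaches roughly 8 degrees, which is consistent with the "OK band-pass, BAD low-pass" remark for the two pole filter.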

Floating Point Chips
- Only scale as necessary
- Scale automatically
- Many other advantages
- Many formats of floats
  - Some are high precision and slow
  - Some are low precision and fast
  - Some are as high precision as possible given the speed
  - Round up, round down etc.

Floating point formats on 21K
- Three kinds available
- IEEE Single Precision -- normal operations -- 32-bit format
- Also extended 40-bit format
- Short Word Floating Point -- special 21K feature -- 16-bit format
  - Used to create IIR delay lines as they use less memory
  - Special memory location for storage
  - Special instructions -- Fpack and Funpack

What are the allowed numbers?
- 32 bit integer
  - Minimum value is -2^31
  - Maximum value is +2^31 - 1
  - Smallest value is 1
  - Granularity of 1
- 32 bit floating point
  - Maximum value is just under +2^+128 (largest exponent is +127)
  - Minimum value is just above -2^+128
  - Smallest normalized value is 2^-126
  - Granularity -- changes -- fine for small numbers, coarse for large
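
The changing granularity of a 32-bit float can be seen by stepping to the next representable value. This sketch uses Python's standard `struct` module to reach the IEEE single-precision bit pattern; the helper names are made up, and the inputs are assumed to be positive, finite, and exactly representable in 32 bits.

```python
import struct


def next_float32(x: float) -> float:
    """Next representable 32-bit float above a positive finite x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


def spacing32(x: float) -> float:
    """Gap between x and the next 32-bit float above it."""
    return next_float32(x) - x


for x in (1.0, 1024.0, 2.0 ** 24, 2.0 ** 31):
    print(x, spacing32(x))
```

Near 1.0 the step is 2^-23, but at 2^24 the step is already 2, so above that point a 32-bit float is coarser than a 32-bit integer, whose granularity is always 1.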

Normal 21k FP example
- Ordinary decimal: 178.125
- Best integer approximation: 178
- Scientific decimal: 1.78125 * 10^2
- Scientific binary: 1.0110010001 * 2^111 (exponent written in binary, 111 = 7)
- 1 bit -- sign
- 8 bits -- for unsigned magnitude biased exponent
- 24 bits -- for fractional part
- Total 33 bits of storage

Normal 21k FP Example continued
- Scientific binary: 1.0110010001 * 2^111
- 1 bit -- sign
- 8 bits -- for unsigned magnitude biased exponent (+127)
- 24 bits -- for fractional part -- the 1. is “James Bonded” -- “remembered not stored” -- need 23 bits
- Total drops from 33 bits to 32 bits of storage
- Biased exponent: (1).0110010001 * 2^10000110
  - sign = 0
  - biased exponent = 10000110
  - fractional part = 0110010001
  - hidden 1. for normalized numbers
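
The fields worked out on this slide can be checked against the IEEE single-precision encoding using Python's standard `struct` module. The function name `float32_fields` is made up for this sketch.

```python
import struct


def float32_fields(x: float):
    """Return (sign, biased exponent, 23-bit fraction) of x as a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


sign, exponent, fraction = float32_fields(178.125)
print(sign, f"{exponent:08b}", f"{fraction:023b}")
print("unbiased exponent:", exponent - 127)
```

The output shows sign 0, biased exponent 10000110 (134 = 7 + 127), and a fraction field that starts with 0110010001 followed by zeros; the leading 1. does not appear anywhere in the stored word.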

Packed 21k Float
- See Appendix C
- Short Float type supports gradual underflow
- Sacrifices precision for dynamic range
  - Largest number 2^135
  - Smallest “accurate” number 2^120
  - Smallest “non-zero” number 2^110
- Must scale numbers appropriately
- 1 bit -- sign
- 4 bit -- (binary exponent - 120)
- 11 bit -- rounded upper 11 bits of source, OR 11 bit represents non-normalized form of the source when exponent stored as 0

Addition in Floating point
- 10 + 11 -- stored as (1).frac * 2^N
  - (1).010 * 2^3 + (1).011 * 2^3
  - = 10.101 * 2^3
  - = (1).0101 * 2^4 -- must renormalize
- 10 + 20
  - (1).010 * 2^3 + (1).010 * 2^4
  - = 0.1010 * 2^4 + 1.010 * 2^4 -- denormalize
  - = (1).1110 * 2^4
- Remember that (1) is “magic” or remembered and is not stored
- Can all be done using integer instructions -- around 280 instructions per FOP
- Problems with co-processor -- data moves
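
The denormalize, add, renormalize sequence can be sketched with integer operations only. This is a toy model for positive numbers, not the 21K or IEEE algorithm: a number is held as a hypothetical pair (mantissa, exponent) with 4 fraction bits, the leading 1 is kept explicitly in the mantissa, and bits shifted out during alignment are simply truncated.

```python
FRAC_BITS = 4   # toy precision: 1.ffff


def fadd(a, b):
    """Add two positive toy floats given as (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    # Denormalize: shift the smaller operand right until exponents match
    if ea < eb:
        ma >>= eb - ea
        ea = eb
    else:
        mb >>= ea - eb
    m, e = ma + mb, ea
    # Renormalize: bring the leading 1 back to bit FRAC_BITS
    while m >> (FRAC_BITS + 1):
        m >>= 1
        e += 1
    return m, e


def value(x):
    """Numeric value of a toy float."""
    m, e = x
    return m * 2.0 ** (e - FRAC_BITS)


ten = (0b10100, 3)       # 1.0100 * 2^3
eleven = (0b10110, 3)    # 1.0110 * 2^3
twenty = (0b10100, 4)    # 1.0100 * 2^4

print(value(fadd(ten, eleven)))   # needs renormalizing after the add
print(value(fadd(ten, twenty)))   # needs denormalizing before the add
```

Even this stripped-down version needs compares, shifts, and a loop around a single add, which gives a feel for why a full software floating point operation costs so many integer instructions.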

AMD 29050 FP pipeline

AMD 29050 FP pipeline latency

FP pipeline latency -- software solution

FP pipeline latency -- Hardware Solution

21K -- Computational Unit

29K and 21K Comparison
- 29K is “general” not DSP
- 29K and 21K are both super-scalar structurally
  - 21K is super-scalar instructionally
- 29K has two important “superscalar features” in terms of instructions
  - FMAC which is 2 instructions on 21K (1 in integer)
  - 192 registers on 29K -- no need to do dm( ) and pm( ) access since already in registers!
- FMAC gives 29K tremendous speed advantage

29K and 21K Comparison
- Both 29K and 21K can complete new FADD every cycle -- BUT
- 29K FADD 7-stage pipeline at 50 MHz is
  - FETCH, DECODE, Denormalize, Add, Perhaps Renormalize, Round, WRITEBACK
- 21K FADD 3-stage pipeline at 40 MHz
  - FETCH, DECODE, EXECUTE/WRITEBACK
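
The "BUT" can be put into numbers with a back-of-envelope sketch. It assumes each pipeline stage takes exactly one clock cycle, which the slide does not state, and the function name is made up.

```python
def fadd_latency_ns(stages: int, clock_mhz: float) -> float:
    """Time for one FADD to travel the whole pipeline, one stage per cycle."""
    return stages * 1000 / clock_mhz


latency_29k = fadd_latency_ns(7, 50)   # 7 stages at 50 MHz
latency_21k = fadd_latency_ns(3, 40)   # 3 stages at 40 MHz
print(latency_29k, latency_21k)
```

Under that assumption the 29K starts FADDs at a higher rate (50 million per second against 40 million), but each individual result takes 140 ns to appear, against 75 ns on the 21K. That matters whenever the next instruction needs the result straight away.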

FP versus Int processors
- Trade algorithm stability and speed/ease of development with cost
  - Cost is rapidly changing
  - FP has “less baggage” in terms of legacy code
  - Now VLIW (true) on DSP and VLIW (effective) on standard Intel and AMD stuff

New Trends in DSP -- VLIW

More comments on TI C67XX VLIW

TigerSHARC -- Comparison

TigerSHARC -- Block

TigerSHARC comments

TigerSHARC comments -- 2

Looked at a variety of processors
- TI 32010 -- Very early integer DSP
- TI 32C240 -- Later integer DSP
- Motorola 56000 -- Popular integer DSP
- AMD 29050 series (RISC with some DSP)
- Analog Devices SHARC 2106X
- TI C6701 -- VLIW
- Analog Devices TigerSHARC -- VLIW