- Slides: 139
Integer Multipliers
Multipliers
• A must-have circuit in most DSP applications
• A variety of multipliers exist and can be chosen based on their performance
• Serial, Serial/Parallel, Shift and Add, Array, Booth, Wallace Tree, ...
[Figure: block diagram of a 16 x 16 multiplier with registers RA, RB, and RC, converters, and en/reset signals]
Multiplication Algorithm
X = Xn-1 Xn-2 ... X0 (multiplicand)
Y = Yn-1 Yn-2 ... Y0 (multiplier)
Row 0:    Yn-1 X0    Yn-2 X0    Yn-3 X0    ...  Y1 X0    Y0 X0
Row 1:    Yn-1 X1    Yn-2 X1    Yn-3 X1    ...  Y1 X1    Y0 X1
Row 2:    Yn-1 X2    Yn-2 X2    Yn-3 X2    ...  Y1 X2    Y0 X2
...
Row n-2:  Yn-1 Xn-2  Yn-2 Xn-2  Yn-3 Xn-2  ...  Y1 Xn-2  Y0 Xn-2
Row n-1:  Yn-1 Xn-1  Yn-2 Xn-1  Yn-3 Xn-1  ...  Y1 Xn-1  Y0 Xn-1
Row i is shifted i positions to the left (weight 2^i) before the columns are added.
Product:  P2n-1 P2n-2 P2n-3 ... P2 P1 P0
1. Multiplication Algorithms: The implementation of binary multiplication boils down to how the additions are done. Consider two 8-bit numbers A and B that generate the 16-bit product P: first generate the 64 partial products, then add them up.
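As a minimal sketch of the two steps named above (generate the partial products, then add them), the following Python models the 8-bit case in software. The function names `partial_products` and `sum_partial_products` are illustrative and do not come from the slides; the multiplier styles in this deck differ mainly in how the additions are organised.

```python
def partial_products(a, b, n=8):
    """Return an n x n grid; entry [i][j] is (bit j of a) AND (bit i of b)."""
    return [[((a >> j) & 1) & ((b >> i) & 1) for j in range(n)]
            for i in range(n)]


def sum_partial_products(grid):
    """Add every partial-product bit with its positional weight 2^(i+j)."""
    total = 0
    for i, row in enumerate(grid):
        for j, bit in enumerate(row):
            total += bit << (i + j)
    return total


grid = partial_products(0x89, 0xAB)
bit_count = sum(len(row) for row in grid)  # 8 * 8 partial-product bits
product = sum_partial_products(grid)       # always fits in 16 bits
```

The operands 0x89 and 0xAB are the ones used in the simulated example later in the deck.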
Multiplier Design [Figure: block diagram with Storage, input register (REG IN), MU (Multiplier Unit), output register (REG OUT), and Control Unit]
Serial Multiplier [Figure sequence: the same diagram is built up step by step over the following slides]
X: x3 x2 x1 x0
Y: y3 y2 y1 y0
Input sequence for G1: 00 x3 x2 x1 x0 0 x3 x2 x1 x0 0 x3 x2 x1 x0 00 y3 y3 0 y2 y2 0 y1 y1 0 y0 y0
Reset: 01000010000
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i
Serial / Parallel Multiplier [Figure sequence: the diagram is built up step by step over eight slides]
Si: the ith bit of the final result
Ci: the only carry from column i
Sij: the jth partial sum for column i
Cij: the jth partial carry from column i
Shift and Add Multiplier [Figure: datapath with inputs Ain (7 downto 0) and Bin (7 downto 0), registers REGA, REGB, and REGC, a MUX with a 0 input, an 8-bit adder, and CLOCK; REGC holds Result (15 downto 8) and REGB holds Result (7 downto 0)]
Synchronous Shift and Add Multiplier Controller Design
Multiplication process, 5 states: Idle, Init, Test, Add, and Shift&Count.
§ Idle: the process starts on receiving the Start signal.
§ Init: the multiplicand and the multiplier are loaded into a load register and a shift register, respectively.
§ Test: the LSB of the shift register that contains the multiplier is tested to decide the next state.
§ Add: if the LSB is '1', the new partial product is added to the accumulated result, and the state machine moves to the Shift&Count state.
§ Shift&Count: if the LSB is '0', or once the Add state has finished, the two shift registers shift their contents one bit to the right and the counter counts up by one. The state machine then returns to the Test state.
§ When the counter reaches N, a Stop signal is asserted and the state machine goes to the Idle state.
§ Idle: in the Idle state, a Done signal is asserted to indicate the end of the multiplication.
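The controller described above can be sketched as a small software state machine. This is an illustrative model only, not the VHDL behind the slides: the register names `rega`, `regb`, and `regc` follow the datapath figure, while the `carry` flag, the function name, and the `trace` list are assumptions added for the example.

```python
def shift_add_fsm(multiplicand, multiplier, n=8):
    """Simulate the Idle/Init/Test/Add/Shift&Count controller for n-bit operands."""
    mask = (1 << n) - 1
    state, start, done = "Idle", True, False
    rega = regb = regc = carry = count = 0
    trace = []
    while not done:
        trace.append(state)
        if state == "Idle":
            if start:                       # Start signal received
                start, state = False, "Init"
        elif state == "Init":
            rega = multiplicand & mask      # load register (multiplicand)
            regb = multiplier & mask        # shift register (multiplier)
            regc = carry = count = 0
            state = "Test"
        elif state == "Test":
            state = "Add" if regb & 1 else "Shift&Count"
        elif state == "Add":
            total = regc + rega
            regc, carry = total & mask, total >> n
            state = "Shift&Count"
        elif state == "Shift&Count":
            regb = (regb >> 1) | ((regc & 1) << (n - 1))
            regc = (regc >> 1) | (carry << (n - 1))
            carry = 0
            count += 1
            if count == n:                  # Stop: counter reached N
                done = True
            else:
                state = "Test"
    trace.append("Idle")                    # Done is asserted back in Idle
    return (regc << n) | regb, trace


product, trace = shift_add_fsm(0x89, 0xAB)
```

With the operands of the simulated example in this deck (89 and AB in hexadecimal), the model visits Shift&Count eight times and Add once per '1' bit of the multiplier.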
n-bit Multiplier:
• Q0 = 1: the multiplicand is added to register A and the result is stored in register A; then registers C, A, Q are shifted one bit to the right
• Q0 = 0: registers C, A, Q are shifted one bit to the right
Example: 4-bit Multiplier [Figure sequence showing the register contents at each step]
• Initial Values
• First Cycle: Add, then Shift
• Second Cycle: Shift
• Third Cycle: Add, then Shift
• Fourth Cycle: Add, then Shift
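The register-level behaviour of the C, A, Q scheme can be traced in a few lines of Python. The cycle sequence on the slides (add, shift, shift, add, shift, add, shift) implies the multiplier 1101; the multiplicand 1011 used below is an assumption, since the operand values are only visible in the figures, and `trace_multiply` is a hypothetical helper name.

```python
def trace_multiply(m, q, n=4):
    """Shift-and-add with carry C, accumulator A, and multiplier register Q."""
    mask = (1 << n) - 1
    c, a = 0, 0
    steps = []
    for cycle in range(1, n + 1):
        if q & 1:                                # Q0 = 1: A <- A + M, carry into C
            total = a + m
            c, a = total >> n, total & mask
            steps.append((cycle, "add", c, a, q))
        # Shift C, A, Q one bit to the right as a single long register.
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
        steps.append((cycle, "shift", c, a, q))
    return (a << n) | q, steps


product, steps = trace_multiply(0b1011, 0b1101)
operations = [op for _, op, _, _, _ in steps]
```

After the fourth shift, A and Q together hold the 8-bit product (A is the upper half, Q the lower half).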
4*4 Synchronous Shift and Add Multiplier Design [Figure: layout design and floor plan of the 4*4 synchronous shift and add multiplier]
Comparison between Synchronous and Asynchronous Approaches
Example (simulated by Ovais Ahmed): Multiplicand = 10001001 (binary) = 89 (hex); Multiplier = 10101011 (binary) = AB (hex); Expected result = 101101110000011 (binary) = 5B83 (hex)
Array Multiplier
· Regular structure based on the add-and-shift algorithm.
· Addition is mainly done with the carry-save algorithm.
· Sign-bit extension results in a higher capacitive load and slows down the circuit.
Addition with CLA
Array Multiplier with CSA
Critical Path with Array Multipliers [Figure: array of FA and HA cells showing two of the possible critical paths for the ripple-carry based 4*4 multiplier]
Area = (N*N) AND gates + (N-1)N full adders
Delay = τ_HA + (2N-1) τ_FA
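The area expression can be checked with a bit-level model of a ripple-carry array multiplier. The sketch below is an assumption-laden software model, not the circuit in the figure: it treats every adder position as a full adder (a half adder is a full adder with one input tied to 0), which reproduces the (N-1)N adder count, and the names `full_adder` and `array_multiply` are illustrative.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def array_multiply(x, y, n):
    """Multiply two unsigned n-bit numbers row by row, counting the cells used."""
    and_gates = adders = 0
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            row.append(((x >> j) & 1) & ((y >> i) & 1))
            and_gates += 1
        rows.append(row)

    sum_bits, carry_top = rows[0], 0
    product_bits = []
    for i in range(1, n):
        product_bits.append(sum_bits[0])         # lowest bit is final
        operand = sum_bits[1:] + [carry_top]     # running sum shifted right by one
        carry, new_bits = 0, []
        for j in range(n):                       # carry ripples along the row
            s, carry = full_adder(operand[j], rows[i][j], carry)
            adders += 1
            new_bits.append(s)
        sum_bits, carry_top = new_bits, carry
    product_bits += sum_bits + [carry_top]
    product = sum(bit << k for k, bit in enumerate(product_bits))
    return product, and_gates, adders


product, and_gates, adders = array_multiply(13, 11, 4)
```

For N = 4 the model uses 16 AND gates and 12 adder cells, matching N*N and (N-1)N.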
Wallace Tree
Array Multiplier + Wallace Tree
Baugh-Wooley Algorithm (Concordia VLSI Lab, 2/23/2021)
• Convert negative partial products to positive representation
• No sign-extension required
Examples of 5-by-5 Baugh-Wooley
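A hedged sketch of the idea: in the commonly used modified Baugh-Wooley form, the partial products that involve exactly one sign bit are complemented and two constant 1s are added, so every term is positive and no sign extension is needed. The slides do not spell out which variant they use, so the formulation and the name `baugh_wooley_multiply` below are assumptions.

```python
def baugh_wooley_multiply(a, b, n):
    """Signed n-bit multiply using only positive partial-product bits."""
    width = 2 * n
    a_bits = [(a >> i) & 1 for i in range(n)]   # two's complement bits of a
    b_bits = [(b >> i) & 1 for i in range(n)]

    total = 0
    # Terms without any sign bit keep their ordinary AND form.
    for i in range(n - 1):
        for j in range(n - 1):
            total += (a_bits[i] & b_bits[j]) << (i + j)
    # The product of the two sign bits is positive.
    total += (a_bits[n - 1] & b_bits[n - 1]) << (2 * n - 2)
    # Terms with exactly one sign bit are negative: add their complements.
    for i in range(n - 1):
        total += (1 - (a_bits[i] & b_bits[n - 1])) << (i + n - 1)
        total += (1 - (a_bits[n - 1] & b_bits[i])) << (i + n - 1)
    # Correction constants: a 1 in column n and a 1 in column 2n-1.
    total += (1 << n) + (1 << (2 * n - 1))

    total &= (1 << width) - 1                   # keep 2n bits
    return total - (1 << width) if total >> (width - 1) else total


result = baugh_wooley_multiply(-13, 11, 5)
```

The 5-bit width matches the 5-by-5 examples on the slide; the check below runs over every pair of 5-bit signed operands.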
Squarer using Baugh-Wooley Algorithm
      a7 a6 a5 a4 a3 a2 a1 a0
    * a7 a6 a5 a4 a3 a2 a1 a0
Partial products (each row is shifted one position to the left of the row above):
a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 a0*a0
a7*a1 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1 a0*a1
a7*a2 a6*a2 a5*a2 a4*a2 a3*a2 a2*a2 a1*a2 a0*a2
a7*a3 a6*a3 a5*a3 a4*a3 a3*a3 a2*a3 a1*a3 a0*a3
a7*a4 a6*a4 a5*a4 a4*a4 a3*a4 a2*a4 a1*a4 a0*a4
a7*a5 a6*a5 a5*a5 a4*a5 a3*a5 a2*a5 a1*a5 a0*a5
a7*a6 a6*a6 a5*a6 a4*a6 a3*a6 a2*a6 a1*a6 a0*a6
a7*a7 a6*a7 a5*a7 a4*a7 a3*a7 a2*a7 a1*a7 a0*a7
Example of an 8-bit squarer
Array Multiplier: 32-bit by 32-bit multiplier
Booth (Radix-4) Multiplier
· Radix-4 (3-bit recoding) halves the number of partial products to be added.
· Great saving in area and increased speed.
$A = -a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + a_{n-3} 2^{n-3} + \dots + a_1 2 + a_0$
$B = -b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + b_{n-3} 2^{n-3} + \dots + b_1 2 + b_0$
· The base-4 redundant signed-digit representation of B is
$B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i$
· $K_i$ is calculated by the following equation:
$K_i = -2 b_{2i+1} + b_{2i} + b_{2i-1}, \quad i = 0, 1, 2, \dots, (n-2)/2$
· Three bits of the multiplier B, $b_{2i+1}$, $b_{2i}$, $b_{2i-1}$, are examined and the corresponding $K_i$ is calculated.
· B is always appended on the right with a zero ($b_{-1} = 0$), and n is always even (B is sign-extended if needed).
· The product AB is then obtained by adding n/2 partial products:
$P = A B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i A$
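The recoding can be sketched in Python as follows. The standard radix-4 digit rule K_i = -2·b(2i+1) + b(2i) + b(2i-1) is used, with b(-1) = 0; the function name `booth_radix4_digits` is illustrative and not from the slides.

```python
def booth_radix4_digits(b, n):
    """Return the digits K_0 .. K_(n/2 - 1) of an n-bit two's complement value b."""
    if n % 2:
        raise ValueError("n must be even (sign-extend B if needed)")
    word = b & ((1 << n) - 1)

    def bit(k):
        # b(-1) is the zero appended on the right of the multiplier.
        return 0 if k < 0 else (word >> k) & 1

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(n // 2)]


def digits_to_value(digits):
    """Evaluate the sum of 4^i * K_i."""
    return sum(k * 4 ** i for i, k in enumerate(digits))


digits_29 = booth_radix4_digits(0b011101, 6)    # multiplier 011101 = 29
digits_m35 = booth_radix4_digits(-35, 8)        # 8-bit multiplier -35
```

Each digit lies in the set {-2, -1, 0, +1, +2}, so every partial product is 0, ±A, or ±2A, and only n/2 of them have to be added.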
Booth Algorithm: decoding of the multiplier to generate signals for hardware use [Table: inputs Xi+1, Xi, Xi-1; outputs OP, NEG, ZERO, TWO]
Booth Algorithm
A Booth-recoded multiplier examines three bits of the multiplier at a time. It determines whether to add zero, 1, -1, 2, or -2 times the multiplicand at that rank. The operation to be performed is based on the current two bits of the multiplier and the previous bit.
Xi+1  Xi  Xi-1   Zi/2
0     0   0       0
0     0   1       1
0     1   0       1
0     1   1       2
1     0   0      -2
1     0   1      -1
1     1   0      -1
1     1   1       0
Bit weights 2^1, 2^0, 2^-1 (columns Xi, Xi+1, Xi+2), the operation, and the factor by which M is multiplied:
0 0 0   add zero (no string)                                 +0
0 0 1   add the multiplicand (end of string)                 +X
0 1 0   add the multiplicand (a string)                      +X
0 1 1   add twice the multiplicand (end of string)           +2X
1 0 0   subtract twice the multiplicand (beginning of string) -2X
1 0 1   subtract the multiplicand (-2X and +X)               -X
1 1 0   subtract the multiplicand (beginning of string)      -X
1 1 1   subtract zero (center of string)                     -0
Booth Algorithm, dot notation [Figure: dot diagram with multiplicand A, multiplier B taken two bits at a time, partial product bits (B1 B0)_2 · A · 4^0, partial product bits (B3 B2)_2 · A · 4^1, and product P]
Example
The following example shows that the calculation is carried out correctly.
Multiplicand X = 000011
Multiplier Y = 011101, with a 0 added to the right of the multiplier: 0 1 1 1 0 1 0
After Booth decoding, Y is decoded as +2, -1, +1: X is multiplied by each digit separately, each partial product is shifted two bits relative to the previous one, and the partial products are added together.
X * +1:  0000011
X * -1:  111101    (shifted left by 2 bits)
X * +2:  00000110  (shifted left by 4 bits)
----------------------
Sum:     000001010111
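The worked example can be reproduced with a short Python sketch that forms each partial product as K_i · X shifted left by 2i bits and keeps it as a 2n-bit two's complement word. The helper names and the choice of a 2n-bit accumulator are assumptions for illustration; the digit rule is the standard radix-4 Booth recoding.

```python
def booth_digits(y, n):
    """Radix-4 Booth digits of the n-bit two's complement multiplier y (n even)."""
    word = y & ((1 << n) - 1)
    bit = lambda k: 0 if k < 0 else (word >> k) & 1
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1) for i in range(n // 2)]


def booth_multiply(x, y, n):
    """Return (signed product, list of sign-extended partial-product words)."""
    width = 2 * n
    mask = (1 << width) - 1
    rows = []
    for i, k in enumerate(booth_digits(y, n)):
        # Masking a negative Python int yields its sign-extended bit pattern.
        rows.append(((k * x) << (2 * i)) & mask)
    total = sum(rows) & mask
    signed = total - (1 << width) if total >> (width - 1) else total
    return signed, rows


product, rows = booth_multiply(0b000011, 0b011101, 6)
row_strings = [format(r, "012b") for r in rows]
```

For X = 000011 and Y = 011101 the three 12-bit rows are 000000000011, 111111110100, and 000001100000, and their sum is 000001010111 (87 = 3 × 29).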
Sign Extension
Sign extension
§ Traditional sign-extension scheme
• Segment the input operands based on the size of the embedded blocks
• Multiply the segmented inputs and extend the sign bit of each partial product
• Sum all partial products
[Figure: segmented input operands, sign-extended partial products, and final result]
Booth Algorithm, Example 1
Booth Algorithm, Example 2 (notice the sign extensions)
Booth Algorithm, Example 3 (notice the sign extensions)
Comparison of Booth and parallel shift-and-add multipliers
Template to reduce sign extensions for Booth Algorithm: Please note that each operand is 17 bits, i.e. the 17th bit is the sign bit. Also, negative numbers are entered in 1's complement, which is why the S has to be added on the right-hand side of the diagram. If 2's complement is used, the S's on the right side of the diagram can be removed.
Comparison of the template and the sign extension
[Figure: partial product matrix generated for a 16 * 16 bit multiplication, using Booth and the template given in the previous slide; rows of partial-product bits A0 through A8 with sign bits S and constant 1s]
Example of using the template: 25 * -35, with -35 as the multiplier, using 8-bit representation.
[Figure: worked template for 25 * -35. The multiplicand is 00011001 with its sign bit; the partial products correspond to the recoded multiplier digits * 1, * -1, * 2, * -1 and are annotated "add SS", "add inverted S", "add inverted sign and add 1", "add inverted sign bit", and "no sign bit".]
The sum is a negative number. Converting it gives 512 + 256 + 64 + 32 + 8 + 2 + 1 = 875, so the product is -875.
Booth Multiplier Components [Figure: Multiplier, Booth Encoder, Multiplicand, PPU (partial products unit), PPA (partial products adding unit), Product]
Wallace Tree and Ripple Carry Adder Structure of an 8*8 Multiplier with Pipeline
Hardware implementation of Booth with shift and add
Simulation Plan
Testing the Design
Simulation for Parallel Multipliers [Figures: signed numbers and unsigned numbers]
Simulation for Signed S/P Multipliers: there is a 340 ns delay between the operands and the result because of the D flip-flop delay.
FPGA after implementation, with the programmed areas shown clearly
Another implementation of the above after pipelining; place and route has placed the design in different locations.
Spartacus FPGA board
Testing the multiplication system
Comparison of Multipliers
Table 7. Performance comparison for two's complement multipliers (by Chen Yaoquan, M.Eng. 2005)
Columns, left to right: Array | Modified Booth | Wallace Tree | Modified Booth-Wallace Tree | Twin Pipe Serial-Parallel | Behavioral
Area, total CLBs (#): 3076.50 | 2649.50 | 3325.50 | 2672.50 | 490.00 | 2993.50
Maximum delay D (ns): 35.78 | 24.43 | 18.93 | 18.53 | 107.52 (3.36 x 32) | 49.33
Total dynamic power P (W): 7.52 | 6.33 | 7.46 | 6.41 | 0.28 | 6.24
Delay · Power product DP (ns W): 268.98 | 154.64 | 141.14 | 118.76 | 30.62 | 307.58
Area · Power product AP (# W): 23128.20 | 16771.60 | 24793.93 | 17127.79 | 139.54 | 18665.07
Area · Delay product AD (# ns): 1.10E+05 | 6.47E+04 | 6.30E+04 | 4.95E+04 | 5.27E+04 | 1.48E+05
Area · Delay^2 product AD2 (# ns^2): 3.94E+06 | 1.58E+06 | 1.19E+06 | 9.18E+05 | 5.66E+06 | 7.28E+06
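The derived rows of the comparison are simple products of area, delay, and power. The short sketch below recomputes them for the array multiplier column of the two's complement table; the function name `figures_of_merit` is illustrative, and the small differences from the tabulated values presumably come from rounding in the slide.

```python
def figures_of_merit(area, delay, power):
    """Return the derived comparison metrics for one multiplier."""
    return {
        "DP": delay * power,        # delay-power product (ns W)
        "AP": area * power,         # area-power product (# W)
        "AD": area * delay,         # area-delay product (# ns)
        "AD2": area * delay ** 2,   # area-delay-squared product (# ns^2)
    }


# Array multiplier, two's complement table: area 3076.50 CLBs, 35.78 ns, 7.52 W.
array_metrics = figures_of_merit(3076.50, 35.78, 7.52)
```

Squaring the delay in AD2 weights speed more heavily than area, which is why the fast tree-based designs score best on that row.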
Comparison of Multipliers
Table 7. Performance comparison for unsigned multipliers (by Chen Yaoquan, M.Eng. 2005)
Columns, left to right: Array | Modified Booth | Wallace Tree | Modified Booth-Wallace Tree | Twin Pipe Serial-Parallel | Behavioral
Area, total CLBs (#): 3280.50 | 2800.00 | 3321.50 | 2845.50 | 487.00 | 3003.00
Maximum delay D (ns): 37.23 | 25.33 | 18.93 | 18.33 | 107.52 | 44.50
Total dynamic power P (W): 7.57 | 6.66 | 7.32 | 6.66 | 0.29 | 6.26
Delay · Power product DP (ns W): 281.88 | 168.77 | 138.60 | 122.13 | 30.66 | 278.53
Area · Power product AP (# W): 24837.98 | 18656.40 | 24319.36 | 18959.57 | 138.89 | 18795.78
Area · Delay product AD (# ns): 1.22E+05 | 7.09E+04 | 6.29E+04 | 5.22E+04 | 5.24E+04 | 1.34E+05
Area · Delay^2 product AD2 (# ns^2): 4.55E+06 | 1.80E+06 | 1.19E+06 | 9.56E+05 | 5.63E+06 | 5.95E+06
Comparison of Multipliers Change the value of “set_max_delay” in Script file (ns) 0 10 20 30 40 50 60 >60 3013. 0 3110. 0 3193. 5 3019. 5 2999. 5 2978. 5 Power(w ) 6. 649 6. 647 7. 568 9 0 3 8. 187 8 8. 064 5 8. 041 9 8. 015 6 Delay(n s) 31. 98 30. 08 39. 93 49. 88 59. 63 Area(#) 3014. 5 31. 98 30. 93 The relation of Area and Delay for behavioral multiplier -"banana curve" 94
Comparison of Multipliers (by Chen Yaoquan, M.Eng. 2005)
Columns: Array Multiplier, Modified Booth Multiplier, Wallace-Tree Multiplier, Modified Booth-Wallace Tree Multiplier, Twin Pipe Serial-Parallel Multiplier, Behavioral Multiplier
Area: Medium Small Large Smallest Medium
Critical Delay: Medium Fast Very Fastest Very Large
Power Consumption: Large Medium Smallest Medium
Complexity: Simple Complex More Complex Simplest
Implement: Easy Medium Difficult Easy Easiest
Pipelining Simulation
Synthesis for Signed Multipliers [Figures: Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P, Behavioral]
Synthesis for Unsigned Multipliers [Figures: Array, Modified Booth, Wallace Tree, Modified Booth-Wallace Tree, Twin Pipe S/P, Behavioral]
Conclusion
• Modified Booth and Wallace Tree are the best techniques for high-speed multiplication.
• Wallace Tree has the best performance, but it is hard to implement.
• Booth-algorithm-based multipliers have a lower area than the other parallel multipliers.
• For behavioral multipliers, the area increases as the delay decreases.
Comparison
Columns, left to right: Array Multiplier | Modified Booth Multiplier | Wallace Tree Multiplier | Modified Booth & Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier
Area, total CLBs (#): 1165 | 1292 | 1659 | 1239 | 133
Maximum delay (ns): 187.87 | 139.41 | 101.14 | 101.43 | 22.58 (722.56)
Power consumption at highest speed (mW): 16.6506 (at 188 ns) | 23.136 (at 140 ns) | 30.95 (at 101.14 ns) | 30.862 (at 101.43 ns) | 2.089 (at 722.56 ns)
Delay · Power product DP (ns mW): 3128.15 | 3225.39 | 3130.28 | 3130.33 | 1509.42
Area · Power product AP (# mW): 19.397 x 10^3 | 29.891 x 10^3 | 51.346 x 10^3 | 38.238 x 10^3 | 277.837
Area · Delay product AD (# ns): 218.868 x 10^3 | 180.118 x 10^3 | 167.791 x 10^3 | 125.671 x 10^3 | 96.101 x 10^3
Area · Delay^2 product AD2 (# ns^2): 41.119 x 10^6 | 25.110 x 10^6 | 16.970 x 10^6 | 12.747 x 10^6 | 69.438 x 10^6
NOTICE
· The rest of these slides are for extra information only and are not part of the lecture.
Array Addition
Addition of 8 binary numbers using the Wallace tree principle
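The Wallace tree principle can be illustrated with word-level carry-save adders (3:2 compressors): every group of three operands is reduced to a sum word and a carry word without carry propagation, and a single carry-propagate addition is done only when two operands remain. This is a sketch for non-negative integers; the names `carry_save_add` and `wallace_sum` and the sample operands are assumptions.

```python
def carry_save_add(a, b, c):
    """3:2 compression: three words in, a sum word and a carry word out."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def wallace_sum(values):
    """Add non-negative integers by repeated 3:2 compression."""
    rows = list(values)
    while len(rows) > 2:
        grouped = len(rows) - len(rows) % 3
        next_rows = []
        for k in range(0, grouped, 3):
            next_rows.extend(carry_save_add(*rows[k:k + 3]))
        next_rows.extend(rows[grouped:])   # leftover rows pass through unchanged
        rows = next_rows
    return sum(rows)                       # final carry-propagate addition


operands = [13, 7, 255, 1, 0, 100, 42, 9]  # eight sample operands
total = wallace_sum(operands)
```

For eight operands the row count shrinks 8, 6, 4, 3, 2, so four compression levels precede the final adder.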
Baugh-Wooley two's complement multiplier
Cluster Multipliers: divide the multiplier into smaller multipliers
Cluster Multipliers: the circuit used to generate the enable signal; 8-bit cluster low-power multiplier
Cluster Multipliers
• Dividing the multiplication circuit into clusters (blocks) of smaller multipliers
• Applying clock-gating techniques to disable the blocks that are producing a zero result
• Features: low power (claims 13.4% savings)
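A software analogy of the clustering idea, under stated assumptions: an 8-bit multiplication is split into four 4x4 block multiplications, and a block is skipped when one of its operand nibbles is zero, which stands in for the clock-gating enable signal. The block size, the enable rule, and the name `cluster_multiply` are illustrative choices, not details taken from the slides.

```python
def cluster_multiply(x, y, n=8, block=4):
    """Multiply two unsigned n-bit values from block x block sub-multipliers."""
    nibble_mask = (1 << block) - 1
    total = 0
    active_blocks = 0
    for i in range(0, n, block):
        x_part = (x >> i) & nibble_mask
        for j in range(0, n, block):
            y_part = (y >> j) & nibble_mask
            enable = x_part != 0 and y_part != 0   # a zero operand gives a zero result
            if enable:
                total += (x_part * y_part) << (i + j)
                active_blocks += 1
    return total, active_blocks


product, active_blocks = cluster_multiply(0x0F, 0x0F)
```

With small operands such as 0x0F × 0x0F only one of the four blocks is active, which is the situation in which gating the idle blocks saves power.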
Multiplexer-Based Array Multipliers [Figure: terms Zj and xjyj]
Multiplexer-Based Array Multipliers: two types of cells
• Cell 1: produces the terms Z_ij 2^j and includes a full adder of the carry-save adder array
• Cell 2: produces the terms x_j y_j 2^j and includes a full adder of the carry-save adder array
Multiplexer-Based Array Multipliers
• Characteristics
– Faster than Modified Booth
– Unlike Booth, does not require encoding logic
– Requires approximately N^2/2 cells
– Has a zigzag shape, thus not layout-friendly
Multiplexer-Based Array Multipliers
• Improvement
– More rectangular layout
– Saves up to 40 percent area without penalties
– Outperforms the modified Booth multiplier in both speed and power by 13% to 26%
Gray-Encoded Array Multiplier
Dec  Hyb     Dec  Hyb     Dec  Hyb     Dec  Hyb
0    0000    4    0100    -8   1100    -4   1000
1    0001    5    0101    -7   1101    -3   1001
2    0011    6    0111    -6   1111    -2   1011
3    0010    7    0110    -5   1110    -1   1010
• 2's complement hybrid coding
– Having a single bit different for consecutive values
– Reducing the number of transitions, and thus power (for highly correlated streams)
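The hybrid code in the table is consistent with a simple rule: take the 4-bit two's complement word and Gray-code each 2-bit group independently. That rule is inferred from the table rather than stated on the slide, so treat the following Python as a sketch under that assumption; `hybrid_encode` and `SLIDE_TABLE` are illustrative names.

```python
def hybrid_encode(value, bits=4):
    """Gray-code each 2-bit group of the two's complement representation."""
    word = value & ((1 << bits) - 1)
    code = 0
    for shift in range(0, bits, 2):
        group = (word >> shift) & 0b11
        code |= (group ^ (group >> 1)) << shift   # 2-bit binary to Gray
    return code


# Decimal value -> hybrid code, copied from the table above.
SLIDE_TABLE = {
    0: 0b0000, 1: 0b0001, 2: 0b0011, 3: 0b0010,
    4: 0b0100, 5: 0b0101, 6: 0b0111, 7: 0b0110,
    -8: 0b1100, -7: 0b1101, -6: 0b1111, -5: 0b1110,
    -4: 0b1000, -3: 0b1001, -2: 0b1011, -1: 0b1010,
}

matches = all(hybrid_encode(v) == code for v, code in SLIDE_TABLE.items())
```

Under this rule, consecutive values within a group of four (for example 0 to 3) differ in exactly one bit, while a step across a group boundary such as 3 to 4 changes two bits.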
Gray-Encoded Array Multiplier [Figure: an 8-bit wide 2's complement radix-4 array multiplier]
Gray-Encoded Array Multiplier
• Characteristics
– Uses Gray code to reduce the switching activity of the multiplier
– Consumes 45.6% less power than Modified Booth
– Uses 26.4% more area than Modified Booth
Ultra-high Speed Parallel Multiplier
• How is ultra-high speed achieved?
– Based on the Modified Booth Algorithm and a tree structure (column compression)
– Chooses efficient counters (3:2 and 5:3)
– Uses the new compressor (20% faster)
– Uses the First Partial product Addition (FPA) algorithm (reducing the bits of the CLA by 50%)
Ultra-high Speed Parallel Multiplier [Figure: calculation process using parallel counters in the 16 x 16 case]
• Divide into 3 rows or 5 rows only (most efficient).
• Calculate the partial products as soon as possible.
• The final CLA is only 16 bits instead of 32 bits.
• In total, the delay is reduced by about 30%.
ULLRLF Multiplier
• ULLRLF stands for Upper/Lower Left-to-Right Leapfrog.
• It combines the following techniques:
– Signal flow optimization in the [3:2] adder array for partial product reduction,
– Left-to-right leapfrog (LRLF) signal flow,
– Splitting of the reduction array into upper/lower parts.
ULLRLF Multiplier: 1) Signal flow optimization in the [3:2] adder array
– PPij is always connected to pin A; Sin/Cin are connected to B/C, and most Sin signals are connected to C.
– For n = 32, the delay is reduced by 30 percent.
– Power is also saved.
ULLRLF Multiplier: 2) Left-to-Right Leapfrog (LRLF) structure
– The sum signals skip over alternate rows.
– The signal delays are better balanced.
– Low power.
ULLRLF Multiplier: 3) Upper/Lower split structure
– Only n+2 bits.
– The long data path is broken into parallel short paths, which saves power.
– The delay of the partial product reduction is reduced.
ULLRLF Multiplier [Figure: floorplan of ULLRLF (n = 32)]
• ULLRLF multipliers consume less power than optimized tree multipliers for n ≤ 32 while keeping similar delay and area.
• With more regularity and inherently shorter interconnects, the ULLRLF structure presents a competitive alternative to tree structures.
Signed Array Multiplier
Unsigned Array Multiplier
Signed Modified Booth Multiplier
Signed Modified Booth Multiplier
Unsigned Modified Booth Multiplier
Unsigned Modified Booth Multiplier
Wallace Tree multipliers
Wallace Tree multipliers
• Use 3:2 counters and 2:2 counters
• Number of levels = log(32/2) / log(3/2) ≈ 8
• Irregular structure
• Fast
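The number of reduction levels can also be counted directly: at each level every three rows are compressed into two by 3:2 counters, and leftover rows pass through. The sketch below compares that count with the logarithmic expression on the slide; the function names are illustrative, and the counting rule is the usual one for a Wallace tree built from 3:2 counters.

```python
import math


def wallace_levels(rows):
    """Count 3:2 reduction levels needed to bring `rows` operands down to two."""
    levels = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


def log_estimate(rows):
    """The closed-form estimate log(rows / 2) / log(3 / 2)."""
    return math.log(rows / 2) / math.log(3 / 2)


levels_32 = wallace_levels(32)   # 32 partial products (plain 32 x 32 multiplier)
levels_16 = wallace_levels(16)   # 16 partial products (after radix-4 Booth recoding)
```

The logarithm evaluates to roughly 6.8 for 32 rows and 5.1 for 16 rows; because row counts must be whole numbers at every level, direct counting gives 8 and 6 levels, which are the rounded figures quoted on the slides.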
Wallace Tree multipliers: 2-level hierarchical
Modified Booth-Wallace Tree Multipliers
Modified Booth-Wallace Tree Multipliers
• Use 3:2 counters and 2:2 counters
• Number of levels = log(16/2) / log(3/2) ≈ 6
• Irregular structure
• Fast
• Less area
Twin pipe serial-parallel multipliers
Signed twin pipe serial-parallel multipliers: "Sign" control line and the sign-change hardware
Unsigned twin pipe serial-parallel multipliers
• Don't need the "Sign" control line and the sign-change hardware