Systolic Array HW Enrique Montealegre CDA 4150 Computer Architecture Dr. Montagne Fall 2005
Problem
Using the linear array explained in class, create a PowerPoint presentation to show all the steps to execute two matrix-vector products simultaneously, say y = Ax and w = Bz. The matrices must be 3 x 3 and the systolic array must have only 5 processing elements.
a) How many steps are necessary to carry out the simultaneous computations?
b) Assuming that the input vector is called the carrier vector, where in the carrier vector are the elements of x stored?
c) Where in the output vector are the elements of w stored?
The matrices
Matrix Multiplication
To calculate Y we follow this formula:
y1 = a11*x1 + a12*x2 + a13*x3
y2 = a21*x1 + a22*x2 + a23*x3
y3 = a31*x1 + a32*x2 + a33*x3
To calculate W we follow this formula:
w1 = b11*z1 + b12*z2 + b13*z3
w2 = b21*z1 + b22*z2 + b23*z3
w3 = b31*z1 + b32*z2 + b33*z3
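The two formulas above can be checked with a direct (non-systolic) computation. The sketch below is only a reference implementation; the function name `mat_vec` and the numeric values of A, x, B, and z are made up for illustration and do not come from the slides.

```python
def mat_vec(M, v):
    """Return the matrix-vector product M*v using the row formulas above."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Hypothetical example data (not from the slides).
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = [1, 0, 2]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
z = [1, 2, 3]

y = mat_vec(A, x)   # y1, y2, y3
w = mat_vec(B, z)   # w1, w2, w3
print(y, w)
```

Any systolic schedule for the same inputs should produce the same y and w.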
Processing Unit
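The figure for this slide did not survive, so the following is a sketch under an assumption: each processing unit is taken to be the usual multiply-accumulate cell, which adds the product of its matrix coefficient and the passing carrier value to the passing partial sum. This is consistent with the way the partial sums grow in the T slides (for example y1 = a11*x1, then y1 = a11*x1 + a12*x2). The name `pe_step` and the numbers used are hypothetical.

```python
def pe_step(a, x_in, y_in):
    """One step of a single processing unit (assumed multiply-accumulate cell).

    a    : matrix coefficient present at the unit on this step
    x_in : carrier-vector value passing through
    y_in : partial sum passing through
    Returns (x_out, y_out): x is forwarded unchanged, y gains a * x_in.
    """
    return x_in, y_in + a * x_in

# Accumulate one output element over three steps, e.g. y1 = a11*x1 + a12*x2 + a13*x3.
coeffs = [2, 3, 5]    # hypothetical a11, a12, a13
values = [1, 4, 6]    # hypothetical x1, x2, x3
y1 = 0
for a, x in zip(coeffs, values):
    _, y1 = pe_step(a, x, y1)
print(y1)
```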
Systolic array with 5 processing Units
Set Matrix A
[Figure: the elements a11 through a33 of A staged above the array, with x1, x2, x3 in the input (carrier) vector and y1, y2, y3 in the output vector.]

Set Matrix B
[Figure: the elements of B interleaved with those of A above the array. The carrier vector reads z3 x3 z2 x2 z1 x1 and the output vector reads y1 w1 y2 w2 y3 w3.]

T 0
[Figure: array state at T0. No partial sums are listed on this slide.]

T 1
[Figure: array state at T1. No partial sums are listed on this slide.]
T 2
y1 = a11*x1

T 3
y1 = a11*x1 + a12*x2
y2 = a21*x1
w1 = b11*z1

T 4
y1 = a11*x1 + a12*x2 + a13*x3
w1 = b11*z1 + b12*z2
y2 = a21*x1 + a22*x2
w2 = b21*z1
y3 = a31*x1

T 5
y1: in output vector
w1 = b11*z1 + b12*z2 + b13*z3
y2 = a21*x1 + a22*x2 + a23*x3
w2 = b21*z1 + b22*z2
y3 = a31*x1 + a32*x2
w3 = b31*z1

T 6
y1: in output vector
w1: in output vector
y2: complete answer
w2 = b21*z1 + b22*z2 + b23*z3
y3 = a31*x1 + a32*x2 + a33*x3
w3 = b31*z1 + b32*z2

T 7
y1: in output vector
w1: in output vector
y2: in output vector
w2: complete answer
y3: complete answer
w3 = b31*z1 + b32*z2 + b33*z3

T 8
y1: in output vector
w1: in output vector
y2: in output vector
w2: in output vector
y3: complete answer
w3: complete answer

T 9
y1: in output vector
w1: in output vector
y2: in output vector
w2: in output vector
y3: in output vector
w3: complete answer

T 10
y1: in output vector
w1: in output vector
y2: in output vector
w2: in output vector
y3: in output vector
w3: in output vector
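The walk from T0 to T10 can be imitated with a small simulation. This is a sketch, not the presenter's implementation: the array figures were lost, so the schedule is an assumption (carrier values move left to right, partial sums move right to left, one unit per step, and the two problems enter on alternating steps). The names `simulate`, `x_reg`, and `y_reg` and the numeric data are hypothetical. The coefficient lookup `mats[name][i][j]` stands in for the matrix element that would arrive at the unit on that step.

```python
def simulate(A, x, B, z):
    """Simulate two interleaved matrix-vector products on a linear array.

    Returns (output, steps): output lists (problem, row, value) in the order
    the results leave the array; steps counts the loop iterations used.
    """
    n = len(x)
    pes = 2 * n - 1                      # 5 processing units when n = 3
    mats = {"A": A, "B": B}
    vecs = {"A": x, "B": z}
    x_reg = [None] * pes                 # carrier values, moving left to right
    y_reg = [None] * pes                 # partial sums, moving right to left
    output = []
    t = 0
    while len(output) < 2 * n:
        if t < 2 * n:
            # Even steps feed problem A (x, y); odd steps feed problem B (z, w).
            name, idx = ("A", "B")[t % 2], t // 2
            new_x = (name, idx, vecs[name][idx])
            new_y = [name, idx, 0]
        else:
            new_x = new_y = None
        if y_reg[0] is not None:         # a finished sum leaves the array
            output.append(tuple(y_reg[0]))
        x_reg = [new_x] + x_reg[:-1]     # shift carrier values right
        y_reg = y_reg[1:] + [new_y]      # shift partial sums left
        for k in range(pes):
            xt, yt = x_reg[k], y_reg[k]
            if xt is not None and yt is not None:
                name, i, j = yt[0], yt[1], xt[1]
                yt[2] += mats[name][i][j] * xt[2]   # multiply-accumulate
        t += 1
    return output, t

# Hypothetical example data (not from the slides).
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = [1, 0, 2]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
z = [1, 2, 3]
output, steps = simulate(A, x, B, z)
print(steps, output)
```

Under this assumed schedule the results leave the array in the order y1, w1, y2, w2, y3, w3, and the loop runs 11 times for 3 x 3 matrices, counting the step on which w3 leaves the array.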
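Answers
a) How many steps are necessary to carry out the simultaneous computations? It took 11 steps (T0 through T10).
b) Assuming that the input vector is called the carrier vector, where in the carrier vector are the elements of x stored? As laid out on the Set Matrix B slide, the carrier (input) vector reads z3 x3 z2 x2 z1 x1, so x1, x2, x3 occupy every other position, alternating with the elements of z.
c) Where in the output vector are the elements of w stored? The output vector reads y1 w1 y2 w2 y3 w3, so w1, w2, w3 occupy every other position, alternating with the elements of y.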
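Answers b) and c) both rest on interleaving the vectors of the two problems. A minimal sketch of that layout follows; the helper `interleave`, the slot numbering (zero-based, counted in order of entry into or exit from the array), and the assumption that x1 enters before z1 are choices made for this illustration.

```python
def interleave(first, second):
    """Alternate the elements of two equal-length lists: f1, s1, f2, s2, ..."""
    return [item for pair in zip(first, second) for item in pair]

carrier = interleave(["x1", "x2", "x3"], ["z1", "z2", "z3"])
output = interleave(["y1", "y2", "y3"], ["w1", "w2", "w3"])

# Slots holding the elements of x in the carrier vector and of w in the output vector.
x_slots = [k for k, name in enumerate(carrier) if name.startswith("x")]
w_slots = [k for k, name in enumerate(output) if name.startswith("w")]
print(carrier, x_slots)
print(output, w_slots)
```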