Unit 2: DLP in Vector, SIMD and GPU Architectures

  • Vector architecture
  • SIMD instruction set extensions for multimedia
  • Graphics Processing Units
  • Detecting and Enhancing Loop Level Parallelism
  • Case studies
IFETCE/ME/CSE/B. V. R. Raju/Iyear/Isem/CP 7103/MCA/Unit-2/PPt/Ver 1.0

Vector architecture
  • Basic idea:
      – Read sets of data elements into “vector registers”
      – Operate on those registers
      – Disperse the results back into memory
  • Registers are controlled by the compiler
      – Used to hide memory latency
      – Leverage memory bandwidth
  • Each core in a heterogeneous multi-core processing unit can be designed to use a different architecture, such as superscalar, VLIW, vector processing, SIMD, or multithreading
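The load/operate/store pattern can be sketched in plain C. This is only an illustrative model, not real vector hardware: the register length `VLEN`, the `vreg` type, and the helpers `vload`, `vstore`, `vaxpy`, and `daxpy` are all hypothetical names chosen for this sketch. It computes y = a*x + y in strips of at most `VLEN` elements, the way a compiler might strip-mine a long loop onto fixed-length vector registers.

```c
#include <assert.h>
#include <stddef.h>

#define VLEN 8  /* assumed vector register length in elements (hypothetical) */

/* A "vector register" modeled as a small fixed-size array. */
typedef struct { double e[VLEN]; } vreg;

/* Vector load: read n (<= VLEN) elements from memory into a register. */
static vreg vload(const double *mem, size_t n) {
    vreg r = {{0}};
    assert(n <= VLEN);
    for (size_t i = 0; i < n; i++) r.e[i] = mem[i];
    return r;
}

/* Vector store: write the first n elements of a register back to memory. */
static void vstore(double *mem, vreg r, size_t n) {
    assert(n <= VLEN);
    for (size_t i = 0; i < n; i++) mem[i] = r.e[i];
}

/* Vector operation: element-wise a*x + y on the first n elements. */
static vreg vaxpy(double a, vreg x, vreg y, size_t n) {
    vreg r = {{0}};
    assert(n <= VLEN);
    for (size_t i = 0; i < n; i++) r.e[i] = a * x.e[i] + y.e[i];
    return r;
}

/* Strip-mined y = a*x + y: each pass handles one register's worth of data. */
void daxpy(size_t n, double a, const double *x, double *y) {
    for (size_t i = 0; i < n; i += VLEN) {
        size_t len = (n - i < VLEN) ? (n - i) : VLEN; /* last strip may be short */
        vreg vx = vload(x + i, len);                  /* memory -> registers */
        vreg vy = vload(y + i, len);
        vstore(y + i, vaxpy(a, vx, vy, len), len);    /* registers -> memory */
    }
}
```

For example, calling `daxpy(10, 2.0, x, y)` with `x[i] = i` and `y[i] = 1` leaves `y[i] = 2*i + 1`; the ten elements are processed as one full strip of eight and one short strip of two.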

Vector architecture
Intel has long been regarded as the benchmark for computational power, while AMD has had the last word in graphics and gaming. Both companies have recently realized that they cannot stay within the limits of their traditional strengths and serve only a limited group of users. What is needed is a solution that delivers balanced, strong performance for both compute-intensive and graphics-intensive applications and games.

Vector Architectures
  • In the 1970s and 80s, “supercomputer” meant “vector machine”
  • Definition of supercomputer
      – Fastest machine in the world at a given task
      – A device to turn a compute-bound problem into an I/O-bound problem
      – The CDC 6600 (Cray, 1964) is regarded as the first supercomputer
  • Vector supercomputers (epitomized by the Cray-1, 1976)
      – Scalar unit + vector extensions
          • Vector registers, vector instructions
          • Vector loads/stores
          • Highly pipelined functional units

SIMD instruction set extensions for multimedia
  • SIMD architectures can exploit significant data-level parallelism for:
      – Matrix-oriented scientific computing
      – Media-oriented image and sound processors
  • SIMD is more energy efficient than MIMD
      – Only needs to fetch one instruction per data operation
      – Makes SIMD attractive for personal mobile devices
  • SIMD allows the programmer to continue to think sequentially
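Multimedia SIMD extensions get their parallelism by treating one wide register as several narrow lanes. The sketch below imitates that idea in portable C rather than with any real instruction set: a 32-bit word is treated as four independent 8-bit lanes (pixel-sized values, for instance), and a single word-wide add updates all four. The function names `packed_add_u8x4` and `scalar_add_u8x4` are hypothetical, and the scalar version is included only as a reference for comparison.

```c
#include <stdint.h>

/* Add four 8-bit lanes packed in a 32-bit word, each lane wrapping modulo 256.
 * The top bit of every lane is masked off before the add so that no carry can
 * cross into the neighbouring lane; the top bits are then restored with XOR. */
uint32_t packed_add_u8x4(uint32_t a, uint32_t b) {
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu); /* add low 7 bits per lane */
    uint32_t top  = (a ^ b) & 0x80808080u;                 /* top-bit sum, no carry out */
    return low7 ^ top;
}

/* Reference version: the same result computed one lane at a time. */
uint32_t scalar_add_u8x4(uint32_t a, uint32_t b) {
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        uint8_t s = (uint8_t)(x + y);                      /* wraps modulo 256 */
        result |= (uint32_t)s << (8 * lane);
    }
    return result;
}
```

With `a = 0x01FF7F10` and `b = 0x01017F20`, both functions return `0x0200FE30`: the lane holding `0xFF + 0x01` wraps to `0x00` without disturbing its neighbours. The packed version does in one pass what the scalar loop does in four, which is the sense in which a single fetched instruction covers several data operations.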

Graphics Processing Units

  • Processor manufacturers are constantly challenged to build better, faster and more stable processor architectures and designs

Considering this requirement, AMD acquired ATI, which had previously manufactured its graphics processing units. AMD made this move to accelerate and align the development of its “A” series processors and FX technology. The A series processors are highly capable quad-core processing units. AMD coined the term APU (Accelerated Processing Unit) for designs that combine multi-core processing units with graphics processing units.

Detecting and Enhancing Loop Level Parallelism

Case studies
