SUPERSCALAR ARCHITECTURE Ahmed Faraz Fall 2008 ELEC 6200

Definition and Characteristics
• Superscalar processing is the ability to initiate multiple instructions during the same clock cycle.
• A typical superscalar processor fetches and decodes the incoming instruction stream several instructions at a time.
• Superscalar architectures exploit the potential of instruction-level parallelism (ILP).

[Figure: Fetching and dispatching two instructions per cycle]

Uninterrupted stream of instructions
• The outcomes of conditional branch instructions are usually predicted in advance to ensure an uninterrupted stream of instructions.
• Instructions are initiated for execution in parallel based on the availability of operand data rather than their original program sequence. This is referred to as dynamic instruction scheduling.
• Upon completion, instruction results are resequenced into the original program order.
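The scheduling behaviour described on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the four-instruction program, its latencies, the issue width of 2, and the function name `simulate` are all hypothetical and do not come from the slides. It also assumes that only true (read-after-write) dependencies remain, and that any number of instructions may retire per cycle.

```python
def simulate(instrs, width=2):
    """Toy out-of-order issue with in-order retirement.

    instrs: list of (name, dest, srcs, latency) in program order.
    Returns (issue_cycle, retire_cycle), one entry per instruction.
    Assumption: only read-after-write dependencies are modelled.
    """
    n = len(instrs)
    # For each instruction, find the earlier instructions that produce its operands.
    producers, last_writer = [], {}
    for i, (_, dest, srcs, _) in enumerate(instrs):
        producers.append([last_writer[s] for s in srcs if s in last_writer])
        if dest is not None:
            last_writer[dest] = i

    issue = [None] * n    # cycle in which each instruction starts executing
    done = [None] * n     # first cycle in which its result is available
    retire = [None] * n   # cycle in which it leaves the machine
    head, cycle = 0, 0    # head = oldest instruction not yet retired
    while head < n:
        # Retire finished instructions strictly in program order.
        while head < n and done[head] is not None and done[head] <= cycle:
            retire[head] = cycle
            head += 1
        # Issue up to `width` instructions whose operands are ready, oldest first.
        issued = 0
        for i in range(n):
            if issued == width:
                break
            ready = all(done[p] is not None and done[p] <= cycle
                        for p in producers[i])
            if issue[i] is None and ready:
                issue[i] = cycle
                done[i] = cycle + instrs[i][3]
                issued += 1
        cycle += 1
    return issue, retire


# Hypothetical program: a slow load feeding an ADD, followed by independent work.
program = [
    ("LD",  "r1", ("r9",),      3),
    ("ADD", "r2", ("r1",),      1),
    ("SUB", "r3", ("r4", "r5"), 1),
    ("MUL", "r6", ("r3",),      1),
]
issue_cycle, retire_cycle = simulate(program)
print(issue_cycle)   # SUB and MUL issue before ADD, which waits for the load
print(retire_cycle)  # retirement cycles never decrease in program order
```

In this sketch SUB and MUL start executing ahead of the older ADD because their operands are available first, yet no instruction retires before an older one, which is the resequencing step named in the last bullet.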

[Figure: Superscalar execution example, with register renaming for WAR and WAW dependencies]

Register Renaming Example
• A WAR (write-after-read) dependency exists between the LD r7, (r3) and SUB r3, r12, r11 instructions: SUB must not overwrite r3 before LD has read it.
• With register renaming, the first write to r3 maps to hw3, while the second write maps to hw20.
• This converts a four-instruction dependency chain into two two-instruction chains, which can then be executed in parallel if the processor allows out-of-order execution.
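A minimal sketch of the renaming step follows. The slide names only the LD and SUB instructions, so the surrounding ADD and ST are hypothetical fillers chosen to form a four-instruction sequence. The function `rename`, the `p`-numbered physical registers, and the simple counter used to allocate them are likewise assumptions. The sketch therefore does not reproduce the hw3 and hw20 assignments from the slide, and it never recycles physical registers as real hardware would.

```python
from itertools import count


def rename(instrs, num_arch_regs=32):
    """Give every register write a fresh physical register.

    instrs: list of (op, dest, srcs) using architectural names such as "r3";
    dest is None for instructions that write no register (e.g. a store).
    """
    # Assumption: architectural register rN initially lives in physical register pN.
    mapping = {f"r{i}": f"p{i}" for i in range(num_arch_regs)}
    fresh = count(num_arch_regs)  # next unused physical register number
    renamed = []
    for op, dest, srcs in instrs:
        # Sources use the current mapping, which preserves true (RAW) dependencies.
        new_srcs = tuple(mapping[s] for s in srcs)
        new_dest = None
        if dest is not None:
            # A fresh destination removes WAR and WAW conflicts on `dest`.
            new_dest = f"p{next(fresh)}"
            mapping[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


# Hypothetical sequence built around the two instructions named on the slide.
program = [
    ("ADD", "r3", ("r1", "r2")),    # first write to r3 (filler instruction)
    ("LD",  "r7", ("r3",)),         # LD r7, (r3): reads the first r3
    ("SUB", "r3", ("r12", "r11")),  # SUB r3, r12, r11: second write to r3
    ("ST",  None, ("r3", "r15")),   # reads the second r3 (filler instruction)
]
renamed = rename(program)
for op, dest, srcs in renamed:
    print(op, dest, srcs)
```

After renaming, LD reads the physical register written by ADD while SUB writes a different one, so the ADD/LD pair and the SUB/ST pair no longer share a register and can proceed independently.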

[Figure: Hardware organization of a superscalar processor]

CONCLUSION
• Superscalar execution allows higher CPU throughput than would otherwise be possible at the same clock rate.
• Essentially all general-purpose CPUs developed since about 1998 are superscalar.
• The major problem in executing multiple instructions from a sequential (scalar) program is the handling of data dependencies. If data dependencies are not handled effectively, it is difficult to achieve an execution rate of more than one instruction per clock cycle.
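The effect of data dependencies on the achievable execution rate can be estimated with a short calculation. The sketch below is an idealised bound, not a model of any real processor: it assumes unit latency for every instruction, unlimited issue width, and only read-after-write dependencies. The function name `ideal_ipc` and both example programs are hypothetical.

```python
def ideal_ipc(instrs):
    """Upper bound on instructions per cycle set by data dependencies alone.

    instrs: non-empty list of (dest, srcs) in program order.
    Assumptions: unit latency, unlimited issue width, RAW dependencies only.
    """
    depth = []        # length of the longest dependency chain ending at each instruction
    last_writer = {}  # register name -> index of its most recent writer
    for i, (dest, srcs) in enumerate(instrs):
        longest_input = max(
            (depth[last_writer[s]] for s in srcs if s in last_writer),
            default=0,
        )
        depth.append(1 + longest_input)
        if dest is not None:
            last_writer[dest] = i
    # The longest chain is the minimum number of cycles needed.
    return len(instrs) / max(depth)


# One chain of four dependent instructions.
single_chain = [("a", ()), ("b", ("a",)), ("c", ("b",)), ("d", ("c",))]
# Two independent chains of two instructions each.
two_chains = [("a", ()), ("b", ("a",)), ("c", ()), ("d", ("c",))]

print(ideal_ipc(single_chain))
print(ideal_ipc(two_chains))
```

Under these assumptions a fully dependent chain cannot exceed one instruction per cycle however wide the machine is, while splitting the same four instructions into two independent chains doubles the bound.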
