Computer Architecture CS 3330 Fundamental Concepts and ISA

Computer Architecture CS 3330
Fundamental Concepts and ISA
Samira Khan, University of Virginia
Jan 21, 2020
The content and concept of this course are adapted from CMU ECE 447.

AGENDA
• Review from last lecture
• Fundamental concepts
  – Computing models
• Instruction Set Architecture (ISA)

LOGISTICS
• Course logistics
• Assignments:
  – HW 0 due today
  – HW 1 will be out tonight, due on Jan 30, 2020
  – CS 3330 TAs will discuss the questions tomorrow during the lab session
• The homework consists of some problems and a review of a paper

WHY STUDY COMPUTER ARCHITECTURE?
• Enable better systems: make computers faster, cheaper, smaller, more reliable, …
  – By exploiting advances and changes in underlying technology/circuits
• Enable new applications
  – Life-like 3D visualization 20 years ago?
  – Virtual reality?
  – Personalized genomics? Personalized medicine?
• Enable better solutions to problems
  – Software innovation is built into trends and changes in computer architecture
    • > 50% performance improvement per year has enabled this innovation
• Understand why computers work the way they do

COMPUTER ARCHITECTURE TODAY
• Computing landscape is very different from 10-20 years ago
• Both UP (software and humanity trends) and DOWN (technologies and their issues), FORWARD and BACKWARD, and the resulting requirements and constraints
• Examples: hybrid main memory, persistent memory/storage, Microsoft Catapult (FPGA), heterogeneous processors, general-purpose GPUs
• Every component and its interfaces, as well as entire system designs, are being re-examined

THE STRUCTURE OF SCIENTIFIC REVOLUTIONS
(Figure: history of science as numbered stages 0 to 4, labeled pre-paradigm, normal science, anomaly, crisis and emergence of scientific theory, and scientific revolution)

THE VON NEUMANN MODEL/ARCHITECTURE
• Also called stored program computer (instructions in memory). Two key properties:
• Stored program
  – Instructions stored in a linear memory array
  – Memory is unified between instructions and data
    • The interpretation of a stored value depends on the control signals
    • When is a value interpreted as an instruction?
• Sequential instruction processing
  – One instruction processed (fetched, executed, and completed) at a time
  – Program counter (instruction pointer) identifies the current instruction
  – Program counter is advanced sequentially except for control transfer instructions
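The two properties above can be sketched as a tiny interpreter. This is only an illustration under stated assumptions: the toy encoding (`OP_LOAD`, `OP_ADD`, `OP_STORE`, `OP_JNZ`, `OP_HALT`), the single accumulator, and the function name `run` are hypothetical and do not come from the course material or any real ISA. Instructions and data live in one array, and a word is treated as an instruction only because the program counter points at it.

```c
#include <stdint.h>

/* Hypothetical toy encoding (not a real ISA):
   high byte = opcode, low byte = memory address operand. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3, OP_JNZ = 4 };

#define MEM_WORDS 256

/* Executes the program stored in the unified memory `mem`, starting at
   address 0. Returns the accumulator when OP_HALT (or an unknown opcode)
   is fetched. */
int run(uint16_t mem[MEM_WORDS]) {
    uint16_t pc = 0;  /* program counter: identifies the current instruction */
    int acc = 0;      /* single accumulator register */

    for (;;) {
        /* Fetch: this word counts as an instruction only because pc points at it. */
        uint16_t instr = mem[pc];
        uint8_t op = (uint8_t)(instr >> 8);
        uint8_t addr = (uint8_t)(instr & 0xFF);

        /* Default: the program counter advances sequentially. */
        pc = (uint16_t)((pc + 1) % MEM_WORDS);

        switch (op) {
        case OP_LOAD:  acc = mem[addr];            break; /* same memory holds data */
        case OP_ADD:   acc += mem[addr];           break;
        case OP_STORE: mem[addr] = (uint16_t)acc;  break;
        case OP_JNZ:   if (acc != 0) pc = addr;    break; /* control transfer */
        default:       return acc;                        /* OP_HALT or unknown */
        }
    }
}
```

A program is loaded by writing encoded words such as `(OP_LOAD << 8) | 10` into low addresses and data words into other addresses of the same array; nothing but the program counter distinguishes the two.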

THE VON NEUMANN MODEL/ARCHITECTURE
• Recommended reading
  – Burks, Goldstine, von Neumann, “Preliminary discussion of the logical design of an electronic computing instrument,” 1946.
• Stored program
• Sequential instruction processing

THE DATA FLOW MODEL (OF A COMPUTER)
• Von Neumann model: An instruction is fetched and executed in control flow order
  – As specified by the instruction pointer
  – Sequential unless explicit control flow instruction
• Dataflow model: An instruction is fetched and executed in data flow order
  – i.e., when its operands are ready
  – i.e., there is no instruction pointer
  – Instruction ordering specified by data flow dependence
    • Each instruction specifies “who” should receive the result
    • An instruction can “fire” whenever all operands are received
  – Potentially many instructions can execute at the same time
    • Inherently more parallel

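VON NEUMANN VS DATAFLOW
• Consider a Von Neumann program
  – What is the significance of the program order?
  – What is the significance of the storage locations?

    v <= a + b;
    w <= b * 2;
    x <= v - w;
    y <= v + w;
    z <= x * y;

  (Figure: the same computation shown as sequential code and as a dataflow graph. Inputs a and b feed the + and *2 nodes, their results feed the - and + nodes, and those results feed the * node that produces z.)
• Which model is more natural to you as a programmer?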

MORE ON DATA FLOW
• In a data flow machine, a program consists of data flow nodes
  – A data flow node fires (is fetched and executed) when all its inputs are ready
    • i.e., when all inputs have tokens
• Data flow node and its ISA representation
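The firing rule can be sketched in software for the five-operation example from the Von Neumann vs. dataflow slide (v, w, x, y, z). This is a minimal simulation under assumptions, not a model of any real dataflow machine: the `node` and `dest` structures, the two-input restriction, and the function name `dataflow_z` are all hypothetical. There is no program counter. The loop deliberately scans the nodes in reverse order to show that the result depends only on token availability.

```c
#include <stdbool.h>

enum op { ADD, SUB, MUL };

/* "Who" should receive a result: a consumer node and one of its input ports. */
struct dest { int node; int port; };

struct node {
    enum op op;
    int in[2];           /* operand values (tokens) */
    bool has[2];         /* is a token present on each input? */
    bool fired;
    struct dest out[2];  /* up to two consumers in this sketch */
    int n_out;
};

/* Deliver a token to one input port of a consumer node. */
static void send(struct node *g, struct dest d, int value) {
    g[d.node].in[d.port] = value;
    g[d.node].has[d.port] = true;
}

/* Evaluates v = a + b; w = b * 2; x = v - w; y = v + w; z = x * y
   by firing any node whose two inputs both hold tokens. */
int dataflow_z(int a, int b) {
    enum { V, W, X, Y, Z, N };
    struct node g[N] = {
        [V] = { .op = ADD, .out = { {X, 0}, {Y, 0} }, .n_out = 2 },
        [W] = { .op = MUL, .out = { {X, 1}, {Y, 1} }, .n_out = 2 },
        [X] = { .op = SUB, .out = { {Z, 0} }, .n_out = 1 },
        [Y] = { .op = ADD, .out = { {Z, 1} }, .n_out = 1 },
        [Z] = { .op = MUL, .n_out = 0 },
    };

    /* Initial tokens: the program inputs and the constant 2. */
    send(g, (struct dest){V, 0}, a);
    send(g, (struct dest){V, 1}, b);
    send(g, (struct dest){W, 0}, b);
    send(g, (struct dest){W, 1}, 2);

    int result = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = N - 1; i >= 0; i--) {   /* scan order is irrelevant */
            struct node *n = &g[i];
            if (n->fired || !n->has[0] || !n->has[1])
                continue;                     /* not ready: missing a token */
            int r = n->op == ADD ? n->in[0] + n->in[1]
                  : n->op == SUB ? n->in[0] - n->in[1]
                  :                n->in[0] * n->in[1];
            n->fired = true;
            progress = true;
            for (int k = 0; k < n->n_out; k++)
                send(g, n->out[k], r);        /* forward the result token */
            if (i == Z)
                result = r;
        }
    }
    return result;
}
```

In each pass, the v and w nodes (and later x and y) become ready together. That is the parallelism a dataflow machine could exploit and a sequential program order hides.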

DATA FLOW NODES

An Example

What does this model perform?
(Figure: a dataflow graph built up step by step from the nodes val = a ^ b; val != 0; val &= val - 1; dist = 0; dist++)

Hamming Distance

    int hamming_distance(unsigned a, unsigned b) {
        int dist = 0;
        unsigned val = a ^ b;
        // Count the number of bits set
        while (val != 0) {
            // A bit is set, so increment the count and clear the bit
            dist++;
            val &= val - 1;
        }
        // Return the number of differing bits
        return dist;
    }

Hamming Distance
• Number of positions at which the corresponding symbols are different.
• The Hamming distance between:
  – "karolin" and "kathrin" is 3
  – 1011101 and 1001001 is 2
  – 2173896 and 2233796 is 3
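The symbol-wise definition can be checked directly with a short sketch over C strings, which complements the bitwise version used for machine words. The function name `hamming_distance_str` and the choice to return -1 for strings of unequal length are assumptions made for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the number of positions at which s and t hold different symbols.
   Returns -1 if the lengths differ (an assumed convention for this sketch;
   the distance is defined for equal-length sequences). */
int hamming_distance_str(const char *s, const char *t) {
    size_t n = strlen(s);
    if (strlen(t) != n)
        return -1;

    int dist = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] != t[i])
            dist++;  /* symbols differ at position i */
    }
    return dist;
}
```

Calling it on the three pairs listed above reproduces the stated distances of 3, 2, and 3.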

RICHARD HAMMING
• Best known for the Hamming code
• Won the Turing Award in 1968
• Was part of the Manhattan Project
• Worked at Bell Labs for 30 years
• “You and Your Research” is mainly his advice to other researchers
• He gave the talk many times during his lifetime
• http://www.cs.virginia.edu/~robins/YouAndYourResearch.html

DATA FLOW CHARACTERISTICS
• Data-driven execution of instruction-level graphical code
  – Nodes are operators
  – Arcs are data (I/O)
  – As opposed to control-driven execution
• Only real dependencies constrain processing
• No sequential I-stream
  – No program counter
• Operations execute asynchronously
• Execution triggered by the presence of data

DATA FLOW ADVANTAGES/DISADVANTAGES
• Advantages
  – Very good at exploiting irregular parallelism
  – Only real dependencies constrain processing
• Disadvantages
  – Debugging difficult (no precise state)
    • Interrupt/exception handling is difficult (what is precise state semantics?)
  – Too much parallelism? (Parallelism control needed)
  – High bookkeeping overhead (tag matching, data storage)
  – Instruction cycle is inefficient (delay between dependent instructions), memory locality is not exploited

DATA FLOW SUMMARY
• Availability of data determines order of execution
• A data flow node fires when its sources are ready
• Programs represented as data flow graphs (of nodes)
• Data Flow at the ISA level has not been (as) successful
• Data Flow implementations under the hood (while preserving sequential ISA semantics) have been successful
  – Out of order execution
  – Hwu and Patt, “HPSm, a high performance restricted data flow architecture having minimal functionality,” ISCA 1986.

Monsoon Dataflow Processor 1990

Let’s Get Back to the Von Neumann Model
• But, if you want to learn more about dataflow…
  – Dennis and Misunas, “A preliminary architecture for a basic data-flow processor,” ISCA 1974.
  – Gurd et al., “The Manchester prototype dataflow computer,” CACM 1985.
• We will have a later lecture on it!
• If you are really impatient:
  – http://www.youtube.com/watch?v=D2uue7izU2c
  – http://www.ece.cmu.edu/~ece740/f13/lib/exe/fetch.php?media=onur-740-fall13-module5.2.1-dataflow-part1.ppt

The Von-Neumann Model
• All major instruction set architectures today use this model
  – x86, ARM, MIPS, SPARC, Alpha, POWER
• Underneath (at the microarchitecture level), the execution model of almost all implementations (or, microarchitectures) is very different
  – Pipelined instruction execution: Intel 80486 uarch
  – Multiple instructions at a time: Intel Pentium uarch
  – Out-of-order execution: Intel Pentium Pro uarch
  – Separate instruction and data caches
• But, what happens underneath that is not consistent with the von Neumann model is not exposed to software
  – Difference between ISA and microarchitecture

LEVELS OF TRANSFORMATION
• ISA
  – Agreed upon interface between software and hardware
    • SW/compiler assumes, HW promises
  – What the software writer needs to know to write system/user programs
• Microarchitecture
  – Specific implementation of an ISA
  – Not visible to the software
• Microprocessor
  – ISA, uarch, circuits
  – “Architecture” = ISA + microarchitecture
• Levels, top to bottom: Problem, Algorithm, Program/Language, ISA, Microarchitecture, Logic, Circuits

ISA VS. MICROARCHITECTURE
• What is part of ISA vs. uarch?
  – Gas pedal: interface for “acceleration”
  – Internals of the engine: implements “acceleration”
  – Add instruction vs. adder implementation
• Implementation (uarch) can vary as long as it satisfies the specification (ISA)
  – Bit serial, ripple carry, carry lookahead adders
  – x86 ISA has many implementations: 286, 386, 486, Pentium Pro, …
• Uarch usually changes faster than ISA
  – Few ISAs (x86, SPARC, MIPS, Alpha) but many uarchs
  – Why?
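The "add instruction vs. adder implementation" distinction can be illustrated with a software analogy. In this sketch, `add_spec` plays the role of the ISA-level specification (32-bit wrapping addition) and `add_ripple_carry` is one possible implementation, built from one-bit full adders. Both function names are hypothetical, and real adders are hardware circuits rather than loops. What matters is that a caller cannot tell the two apart by their results.

```c
#include <stdint.h>

/* Specification: 32-bit addition, wrapping modulo 2^32. */
uint32_t add_spec(uint32_t a, uint32_t b) {
    return a + b;
}

/* One implementation of that specification: a ripple-carry adder.
   Each iteration models a full adder for bit i, and the carry "ripples"
   from the least significant bit to the most significant bit. */
uint32_t add_ripple_carry(uint32_t a, uint32_t b) {
    uint32_t sum = 0;
    uint32_t carry = 0;
    for (int i = 0; i < 32; i++) {
        uint32_t x = (a >> i) & 1u;
        uint32_t y = (b >> i) & 1u;
        uint32_t s = x ^ y ^ carry;            /* sum bit of the full adder */
        carry = (x & y) | (carry & (x ^ y));   /* carry out of the full adder */
        sum |= s << i;
    }
    return sum;  /* the final carry out is dropped, matching wraparound */
}
```

A carry-lookahead or bit-serial version would differ in speed and cost but would have to return the same values, which is why the implementation can change without software noticing.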

ISA
• Instructions
  – Opcodes, Addressing Modes, Data Types
  – Instruction Types and Formats
  – Registers, Condition Codes
• Memory
  – Address space, Addressability, Alignment
  – Virtual memory management
• Call, Interrupt/Exception Handling
• Access Control, Priority/Privilege
• I/O
• Task Management
• Power and Thermal Management
• Multi-threading support, Multiprocessor support

Microarchitecture
• Implementation of the ISA under specific design constraints and goals
• Anything done in hardware without exposure to software
  – Pipelining
  – In-order versus out-of-order instruction execution
  – Memory access scheduling policy
  – Speculative execution
  – Superscalar processing (multiple instruction issue?)
  – Clock gating
  – Caching? Levels, size, associativity, replacement policy
  – Prefetching?
  – Voltage/frequency scaling?
  – Error correction?

• We did not cover these slides. They are for your benefit.

DESIGN POINT
• A set of design considerations and their importance
  – leads to tradeoffs in both ISA and uarch
• Considerations
  – Cost
  – Performance
  – Maximum power consumption
  – Energy consumption (battery life)
  – Availability
  – Reliability and Correctness (or is it?)
  – Time to Market
  (Figure: the transformation levels Problem, Algorithm, Program/Language, ISA, Microarchitecture, Logic, Circuits)
• Design point determined by the “Problem” space (application space)

TRADEOFFS: SOUL OF COMPUTER ARCHITECTURE
• ISA-level tradeoffs
• Uarch-level tradeoffs
• System and Task-level tradeoffs
  – How to divide the labor between hardware and software

MANY DIFFERENT ISAS OVER DECADES
• x86
• PDP-x: Programmed Data Processor (PDP-11)
• VAX
• IBM 360
• CDC 6600
• SIMD ISAs: CRAY-1, Connection Machine
• VLIW ISAs: Multiflow, Cydrome, IA-64 (EPIC)
• PowerPC, POWER
• RISC ISAs: Alpha, MIPS, SPARC, ARM
• What are the fundamental differences?
  – E.g., how instructions are specified and what they do
  – E.g., how complex are the instructions

INSTRUCTION
• Basic element of the HW/SW interface
• Consists of
  – opcode: what the instruction does
  – operands: who it is to do it to
  – Example from the Alpha ISA (figure not reproduced)

MIPS
• R-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
• I-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
• J-type: opcode (6-bit) | immediate (26-bit)
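The three formats above can be pulled apart with shifts and masks. The sketch below is an illustration, not a full decoder: the struct `mips_fields` and the function `mips_decode` are hypothetical names. It extracts every field unconditionally, and which fields are meaningful depends on the format selected by the opcode.

```c
#include <stdint.h>

/* All fields of a 32-bit MIPS instruction word. Only some are meaningful
   for a given format: rd/shamt/funct for R-type, imm16 for I-type,
   target for J-type. */
struct mips_fields {
    uint32_t opcode;  /* bits 31..26, all formats */
    uint32_t rs;      /* bits 25..21, R-type and I-type */
    uint32_t rt;      /* bits 20..16, R-type and I-type */
    uint32_t rd;      /* bits 15..11, R-type */
    uint32_t shamt;   /* bits 10..6,  R-type */
    uint32_t funct;   /* bits 5..0,   R-type */
    uint32_t imm16;   /* bits 15..0,  I-type */
    uint32_t target;  /* bits 25..0,  J-type */
};

struct mips_fields mips_decode(uint32_t word) {
    struct mips_fields f;
    f.opcode = (word >> 26) & 0x3Fu;
    f.rs     = (word >> 21) & 0x1Fu;
    f.rt     = (word >> 16) & 0x1Fu;
    f.rd     = (word >> 11) & 0x1Fu;
    f.shamt  = (word >> 6)  & 0x1Fu;
    f.funct  = word & 0x3Fu;
    f.imm16  = word & 0xFFFFu;
    f.target = word & 0x3FFFFFFu;
    return f;
}
```

For example, decoding the word 0x02324020 (the R-type encoding of `add $t0, $s1, $s2`) yields opcode 0, rs 17, rt 18, rd 8, shamt 0, and funct 0x20.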

ARM

WHAT ARE THE ELEMENTS OF AN ISA?
• Instruction sequencing model
  – Control flow vs. data flow
  – Tradeoffs?
• Instruction processing style
  – Specifies the number of “operands” an instruction “operates” on and how it does so
  – 0, 1, 2, 3 address machines
    • 0-address: stack machine (op, push A, pop A)
    • 1-address: accumulator machine (op ACC, ld A, st A)
    • 2-address: 2-operand machine (op S, D; one is both source and dest)
    • 3-address: 3-operand machine (op S1, S2, D; source and dest separate)
  – Tradeoffs?
    • Larger operate instructions vs. more executed operations
    • Code size vs. execution time vs. on-chip memory space
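To make the 0-address style concrete, here is a minimal sketch of a stack machine that supports only push, pop, and add. The instruction encoding (`struct sinstr`), the opcode names, the fixed stack depth, and the function `stack_run` are all hypothetical choices for this illustration. Computing A = B + C takes four instructions here (push B, push C, add, pop A), whereas a 3-address machine could express it as a single `add B, C, A`. That is the "larger instructions vs. more executed operations" tradeoff.

```c
enum sop { S_PUSH, S_POP, S_ADD, S_END };

/* One stack-machine instruction. `addr` is used only by S_PUSH and S_POP;
   S_ADD names no operands at all (the "0-address" property). */
struct sinstr {
    enum sop op;
    int addr;
};

#define STACK_DEPTH 16

/* Runs `prog` until S_END against data memory `mem`.
   Returns 0 on success, -1 on stack overflow/underflow or a bad opcode. */
int stack_run(const struct sinstr *prog, int *mem) {
    int stack[STACK_DEPTH];
    int sp = 0;  /* number of values currently on the stack */

    for (int pc = 0; prog[pc].op != S_END; pc++) {
        switch (prog[pc].op) {
        case S_PUSH:
            if (sp == STACK_DEPTH) return -1;
            stack[sp++] = mem[prog[pc].addr];
            break;
        case S_POP:
            if (sp < 1) return -1;
            mem[prog[pc].addr] = stack[--sp];
            break;
        case S_ADD:
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] += stack[sp];  /* operands are implicit: top two entries */
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

With A, B, and C at hypothetical addresses 0, 1, and 2, the program `{S_PUSH,1}, {S_PUSH,2}, {S_ADD,0}, {S_POP,0}, {S_END,0}` leaves B + C in `mem[0]`.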
