COMPUTER ORGANIZATION AND DESIGN, 5th Edition: The Hardware/Software Interface
- Slides: 40
COMPUTER ORGANIZATION AND DESIGN, 5th Edition
The Hardware/Software Interface
Chapter 1
Computer Abstractions and Technology

§1.1 Introduction
The Computer Revolution
- Progress in computer technology
  - Underpinned by Moore's Law
- Makes novel applications feasible
  - Computers in automobiles
  - Cell phones
  - Human genome project
  - World Wide Web
  - Search engines
- Computers are pervasive
Chapter 1 — Computer Abstractions and Technology — 2

Classes of Computers
- Personal computers
  - General purpose, variety of software
  - Subject to cost/performance tradeoff
- Server computers
  - Network based
  - High capacity, performance, reliability
  - Range from small servers to building sized

Classes of Computers
- Supercomputers
  - High-end scientific and engineering calculations
  - Highest capability but represent a small fraction of the overall computer market
- Embedded computers
  - Hidden as components of systems
  - Stringent power/performance/cost constraints

The PostPC Era

The PostPC Era
- Personal Mobile Device (PMD)
  - Battery operated
  - Connects to the Internet
  - Hundreds of dollars
  - Smart phones, tablets, electronic glasses
- Cloud computing
  - Warehouse Scale Computers (WSC)
  - Software as a Service (SaaS)
  - Portion of software run on a PMD and a portion run in the Cloud
  - Amazon and Google

What You Will Learn
- How programs are translated into the machine language
  - And how the hardware executes them
- The hardware/software interface
- What determines program performance
  - And how it can be improved
- How hardware designers improve performance
- What is parallel processing

Understanding Performance
- Algorithm
  - Determines number of operations executed
- Programming language, compiler, architecture
  - Determine number of machine instructions executed per operation
- Processor and memory system
  - Determine how fast instructions are executed
- I/O system (including OS)
  - Determines how fast I/O operations are executed

§1.2 Eight Great Ideas in Computer Architecture
Eight Great Ideas
- Design for Moore's Law
- Use abstraction to simplify design
- Make the common case fast
- Performance via parallelism
- Performance via pipelining
- Performance via prediction
- Hierarchy of memories
- Dependability via redundancy

§1.3 Below Your Program
- Application software
  - Written in high-level language (HLL)
- System software
  - Compiler: translates HLL code to machine code
  - Operating System: service code
    - Handling input/output
    - Managing memory and storage
    - Scheduling tasks & sharing resources
- Hardware
  - Processor, memory, I/O controllers

Levels of Program Code
- High-level language
  - Level of abstraction closer to problem domain
  - Provides for productivity and portability
- Assembly language
  - Textual representation of instructions
- Hardware representation
  - Binary digits (bits)
  - Encoded instructions and data

§1.4 Under the Covers
Components of a Computer (The BIG Picture)
- Same components for all kinds of computer
  - Desktop, server, embedded
- Input/output includes
  - User-interface devices
    - Display, keyboard, mouse
  - Storage devices
    - Hard disk, CD/DVD, flash
  - Network adapters
    - For communicating with other computers

Touchscreen
- PostPC device
- Supersedes keyboard and mouse
- Resistive and capacitive types
  - Most tablets, smart phones use capacitive
  - Capacitive allows multiple touches simultaneously

Inside the Processor (CPU)
- Datapath: performs operations on data
- Control: sequences datapath, memory, ...
- Cache memory
  - Small fast SRAM memory for immediate access to data

Abstractions (The BIG Picture)
- Abstraction helps us deal with complexity
  - Hide lower-level detail
- Instruction set architecture (ISA)
  - The hardware/software interface
- Application binary interface
  - The ISA plus system software interface
- Implementation
  - The details underlying an interface

A Safe Place for Data
- Volatile main memory
  - Loses instructions and data when power off
- Non-volatile secondary memory
  - Magnetic disk
  - Flash memory
  - Optical disk (CDROM, DVD)

Networks
- Communication, resource sharing, nonlocal access
- Local area network (LAN): Ethernet
- Wide area network (WAN): the Internet
- Wireless network: WiFi, Bluetooth

§1.5 Technologies for Building Processors and Memory
Technology Trends
- Electronics technology continues to evolve
  - Increased capacity and performance
  - Reduced cost

[Figure: DRAM capacity]

Year  Technology                  Relative performance/cost
1951  Vacuum tube                 1
1965  Transistor                  35
1975  Integrated circuit (IC)     900
1995  Very large scale IC (VLSI)  2,400,000
2013  Ultra large scale IC        250,000,000

§1.6 Performance
Defining Performance
- Which airplane has the best performance?

Response Time and Throughput
- Response time
  - How long it takes to do a task
- Throughput
  - Total work done per unit time
    - e.g., tasks/transactions/… per hour
- How are response time and throughput affected by
  - Replacing the processor with a faster version?
  - Adding more processors?
- We'll focus on response time for now…

Relative Performance
- Define Performance = 1/Execution Time
- "X is n times faster than Y" means
  Performance_X / Performance_Y = Execution Time_Y / Execution Time_X = n
- Example: time taken to run a program
  - 10 s on A, 15 s on B
  - Execution Time_B / Execution Time_A = 15 s / 10 s = 1.5
  - So A is 1.5 times faster than B
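
The relative-performance definition on this slide can be checked with a few lines of Python. This is only a sketch; the function name `relative_performance` is an illustrative choice, not something from the slides.

```python
def relative_performance(time_x_s: float, time_y_s: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is defined as 1 / execution time, so
    Performance_X / Performance_Y = time_y / time_x.
    """
    return time_y_s / time_x_s


# Slide example: the program takes 10 s on A and 15 s on B.
n = relative_performance(time_x_s=10.0, time_y_s=15.0)
print(f"A is {n:.1f} times faster than B")
```

Note that the slower machine's time goes in the numerator, because performance is the reciprocal of execution time.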

Measuring Execution Time
- Elapsed time
  - Total response time, including all aspects
    - Processing, I/O, OS overhead, idle time
  - Determines system performance
- CPU time
  - Time spent processing a given job
    - Discounts I/O time, other jobs' shares
  - Comprises user CPU time and system CPU time
  - Different programs are affected differently by CPU and system performance

CPU Clocking
- Operation of digital hardware governed by a constant-rate clock

[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]

- Clock period: duration of a clock cycle
  - e.g., 250 ps = 0.25 ns = 250×10^-12 s
- Clock frequency (rate): cycles per second
  - e.g., 4.0 GHz = 4000 MHz = 4.0×10^9 Hz
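
Clock period and clock rate are reciprocals, which is why the two examples on this slide describe the same clock. A small sketch of the conversion follows; the helper names are illustrative assumptions.

```python
import math


def clock_rate_hz(period_s: float) -> float:
    """Clock rate in Hz for a given clock period in seconds."""
    return 1.0 / period_s


def clock_period_s(rate_hz: float) -> float:
    """Clock period in seconds for a given clock rate in Hz."""
    return 1.0 / rate_hz


# 250 ps period <-> 4.0 GHz rate (the slide's two examples).
rate = clock_rate_hz(250e-12)
period = clock_period_s(4.0e9)
print(f"{rate / 1e9:.1f} GHz, {period * 1e12:.0f} ps")
print(math.isclose(rate, 4.0e9), math.isclose(period, 250e-12))
```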

CPU Time
- Performance improved by
  - Reducing number of clock cycles
  - Increasing clock rate
  - Hardware designer must often trade off clock rate against cycle count
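
The slide's equation did not survive extraction. The standard CPU time relation that these bullets rely on can be written as follows (the notation is an assumption, chosen to match the terms used on the neighboring slides):

```latex
\begin{align*}
\text{CPU Time} &= \text{CPU Clock Cycles} \times \text{Clock Cycle Time} \\
                &= \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}
\end{align*}
```

Both levers in the bullet list appear directly: fewer clock cycles shrink the numerator, and a higher clock rate grows the denominator.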

CPU Time Example
- Computer A: 2 GHz clock, 10 s CPU time
- Designing Computer B
  - Aim for 6 s CPU time
  - Can do faster clock, but causes 1.2 × clock cycles
- How fast must Computer B clock be?
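
The worked solution is missing from the extracted text. One way to arrive at it, sketched here under the assumption that CPU time = clock cycles / clock rate, is to recover A's cycle count, scale it by 1.2, and divide by the target time. The function name is hypothetical.

```python
import math


def required_clock_rate_hz(rate_a_hz: float, time_a_s: float,
                           time_b_s: float, cycle_factor: float) -> float:
    """Clock rate B needs to finish in time_b_s.

    cycles_A = time_A * rate_A
    cycles_B = cycle_factor * cycles_A
    rate_B   = cycles_B / time_B
    """
    cycles_a = time_a_s * rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / time_b_s


# Slide values: A runs at 2 GHz for 10 s; B targets 6 s with 1.2x the cycles.
rate_b = required_clock_rate_hz(2e9, 10.0, 6.0, 1.2)
print(f"Computer B needs about {rate_b / 1e9:.1f} GHz")
```

Computer A executes 20×10^9 cycles, so B needs 24×10^9 cycles in 6 s, which works out to a 4 GHz clock.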

Instruction Count and CPI
- Instruction Count for a program
  - Determined by program, ISA and compiler
- Average cycles per instruction
  - Determined by CPU hardware
  - If different instructions have different CPI
    - Average CPI affected by instruction mix
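
The equations for this slide were lost in extraction. The standard relations linking instruction count (IC) and cycles per instruction (CPI) to CPU time are reproduced below as a reference sketch:

```latex
\begin{align*}
\text{Clock Cycles} &= \text{IC} \times \text{CPI} \\
\text{CPU Time} &= \text{IC} \times \text{CPI} \times \text{Clock Cycle Time} \\
                &= \frac{\text{IC} \times \text{CPI}}{\text{Clock Rate}}
\end{align*}
```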

CPI Example
- Computer A: Cycle Time = 250 ps, CPI = 2.0
- Computer B: Cycle Time = 500 ps, CPI = 1.2
- Same ISA
- Which is faster, and by how much?
  - CPU Time_A = IC × 2.0 × 250 ps = IC × 500 ps (A is faster)
  - CPU Time_B = IC × 1.2 × 500 ps = IC × 600 ps
  - CPU Time_B / CPU Time_A = 600 ps / 500 ps = 1.2 (A is faster by this much)
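
Because both computers implement the same ISA, the instruction count cancels and only CPI × cycle time matters. The comparison can be sketched as below; the function name is an illustrative assumption.

```python
import math


def time_per_instruction_ps(cpi: float, cycle_time_ps: float) -> float:
    """Average time per instruction in picoseconds: CPI x cycle time."""
    return cpi * cycle_time_ps


# Slide values for computers A and B (same ISA, so same instruction count).
t_a = time_per_instruction_ps(cpi=2.0, cycle_time_ps=250.0)
t_b = time_per_instruction_ps(cpi=1.2, cycle_time_ps=500.0)

faster = "A" if t_a < t_b else "B"
ratio = max(t_a, t_b) / min(t_a, t_b)
print(f"{faster} is faster by a factor of about {ratio:.1f}")
```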

CPI in More Detail
- If different instruction classes take different numbers of cycles
  - Clock cycles are summed per class
- Weighted average CPI
  - Each class's CPI is weighted by its relative frequency
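
The two equations on this slide did not survive extraction. A reconstruction in standard notation, with $i$ ranging over the $n$ instruction classes, is:

```latex
\begin{align*}
\text{Clock Cycles} &= \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{IC}_i \right) \\
\text{CPI} &= \frac{\text{Clock Cycles}}{\text{IC}}
            = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{IC}_i}{\text{IC}} \right)
\end{align*}
```

The fraction $\text{IC}_i / \text{IC}$ is the relative frequency of class $i$ named on the slide.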

CPI Example
- Alternative compiled code sequences using instructions in classes A, B, C

Class             A  B  C
CPI for class     1  2  3
IC in sequence 1  2  1  2
IC in sequence 2  4  1  1

- Sequence 1: IC = 5
  - Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  - Avg. CPI = 10/5 = 2.0
- Sequence 2: IC = 6
  - Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  - Avg. CPI = 9/6 = 1.5
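
The arithmetic for the two code sequences can be reproduced with a short script. This is a sketch; the dictionary layout and function names are illustrative choices, while the CPI and instruction counts come from the slide.

```python
# CPI for each instruction class, from the slide's table.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(ic_by_class: dict) -> int:
    """Total clock cycles: sum of CPI_i x IC_i over all classes."""
    return sum(CPI_BY_CLASS[cls] * count for cls, count in ic_by_class.items())


def average_cpi(ic_by_class: dict) -> float:
    """Weighted average CPI: total clock cycles / total instruction count."""
    return clock_cycles(ic_by_class) / sum(ic_by_class.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(name, "cycles =", clock_cycles(seq), "avg CPI =", average_cpi(seq))
```

Sequence 2 executes more instructions yet needs fewer cycles, which is why instruction count alone is a poor predictor of execution time.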

Performance Summary (The BIG Picture)
- Performance depends on
  - Algorithm: affects IC, possibly CPI
  - Programming language: affects IC, CPI
  - Compiler: affects IC, CPI
  - Instruction set architecture: affects IC, CPI, Tc
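
The summary equation for this slide was lost in extraction. The standard form, which ties together the three factors IC, CPI and $T_c$ (clock cycle time) named in the bullets, is:

```latex
\[
\text{CPU Time}
  = \frac{\text{Instructions}}{\text{Program}}
  \times \frac{\text{Clock cycles}}{\text{Instruction}}
  \times \frac{\text{Seconds}}{\text{Clock cycle}}
  = \text{IC} \times \text{CPI} \times T_c
\]
```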

§1.7 The Power Wall
Power Trends

§1.8 The Sea Change: The Switch to Multiprocessors
Uniprocessor Performance
- Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
- Multicore microprocessors
  - More than one processor per chip
- Requires explicitly parallel programming
  - Compare with instruction level parallelism
    - Hardware executes multiple instructions at once
    - Hidden from the programmer
  - Hard to do
    - Programming for performance
    - Load balancing
    - Optimizing communication and synchronization

SPEC CPU Benchmark
- Programs used to measure performance
  - Supposedly typical of actual workload
- Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, …
- SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to reference machine
  - Summarize as geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)
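
The normalize-then-summarize step can be sketched as follows. The timings are made-up placeholder values, not SPEC results, and the function names are illustrative; the sketch assumes each ratio is the reference machine's time divided by the measured time.

```python
import math


def spec_ratio(reference_time_s: float, measured_time_s: float) -> float:
    """Performance ratio relative to the reference machine (higher is better)."""
    return reference_time_s / measured_time_s


def geometric_mean(ratios: list) -> float:
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# HYPOTHETICAL (reference, measured) times in seconds for two benchmarks.
hypothetical_runs = [(1000.0, 500.0), (2400.0, 300.0)]

ratios = [spec_ratio(ref, measured) for ref, measured in hypothetical_runs]
print("ratios:", ratios)
print("geometric mean: about", round(geometric_mean(ratios), 3))
```

A geometric mean is used rather than an arithmetic mean so that the relative ranking of two machines does not depend on which machine serves as the reference.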

CINT2006 for Intel Core i7 920

SPECpower_ssj2008 for Xeon X5650

§1.10 Fallacies and Pitfalls
Pitfall: Amdahl's Law
- Improving an aspect of a computer and expecting a proportional improvement in overall performance
  - T_improved = T_affected / improvement factor + T_unaffected
- Example: multiply accounts for 80 s/100 s
  - How much improvement in multiply performance to get 5× overall?
  - A 5× speedup needs T_improved = 20 s, so 20 = 80/n + 20, which no finite n satisfies
  - Can't be done!
- Corollary: make the common case fast
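
The multiply example can be explored numerically. The sketch below assumes the usual statement of Amdahl's Law, T_improved = T_affected / factor + T_unaffected; the function names are illustrative.

```python
from typing import Optional


def improved_time(t_affected: float, t_unaffected: float, factor: float) -> float:
    """Execution time after speeding up the affected part by `factor`."""
    return t_affected / factor + t_unaffected


def required_factor(t_affected: float, t_unaffected: float,
                    target_speedup: float) -> Optional[float]:
    """Improvement factor needed for an overall speedup, or None if unreachable."""
    target_time = (t_affected + t_unaffected) / target_speedup
    remaining = target_time - t_unaffected  # time budget left for the affected part
    if remaining <= 0:
        return None  # the unaffected part alone already uses the whole budget
    return t_affected / remaining


# Slide example: multiply takes 80 s of a 100 s run.
print(required_factor(80.0, 20.0, 5.0))   # 5x overall is unreachable
print(required_factor(80.0, 20.0, 2.5))   # 2.5x overall needs a finite factor
print(improved_time(80.0, 20.0, 4.0))
```

The 20 s that multiply does not touch caps the overall speedup below 5×, no matter how fast multiply becomes.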

Fallacy: Low Power at Idle
- Look back at i7 power benchmark
  - At 100% load: 258 W
  - At 50% load: 170 W (66%)
  - At 10% load: 121 W (47%)
- Google data center
  - Mostly operates at 10% – 50% load
  - At 100% load less than 1% of the time
- Consider designing processors to make power proportional to load
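
The percentages on this slide are each measurement divided by the power at full load. The sketch below recomputes them and compares against a hypothetical, perfectly power-proportional machine (that ideal is an assumption added for illustration, not a measurement).

```python
def percent_of_peak(power_w: float, peak_power_w: float) -> int:
    """Power draw as a rounded percentage of the draw at 100% load."""
    return round(100 * power_w / peak_power_w)


# Measured power in watts at each load level (percent), from the slide.
measured_w = {100: 258, 50: 170, 10: 121}
peak_w = measured_w[100]

for load, watts in measured_w.items():
    ideal_w = peak_w * load / 100  # ideal: power scales linearly with load
    print(f"{load:3d}% load: {watts} W = {percent_of_peak(watts, peak_w)}% of peak "
          f"(proportional ideal: {ideal_w:.1f} W)")
```

At 10% load the machine still draws nearly half of its peak power, which is the gap the slide's last bullet asks designers to close.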

Pitfall: MIPS as a Performance Metric
- MIPS: Millions of Instructions Per Second
  - Doesn't account for
    - Differences in ISAs between computers
    - Differences in complexity between instructions
- CPI varies between programs on a given CPU
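
The defining equation for MIPS was lost in extraction. In standard notation, and substituting the CPU time relation (execution time = IC × CPI / clock rate), it reads:

```latex
\begin{align*}
\text{MIPS} &= \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} \\
            &= \frac{\text{Instruction count}}
                    {\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^{6}}
             = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}}
\end{align*}
```

Since CPI appears in the denominator and varies from program to program, a single MIPS rating cannot characterize a CPU.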

§1.9 Concluding Remarks
- Cost/performance is improving
  - Due to underlying technology development
- Hierarchical layers of abstraction
  - In both hardware and software
- Instruction set architecture
  - The hardware/software interface
- Execution time: the best performance measure
- Power is a limiting factor
  - Use parallelism to improve performance