CMPUT 429/CMPE 382 Winter 2001 Topic 2: Technology Trend and Cost/Performance (Adapted from David A. Patterson’s CS 252 lecture slides at Berkeley) 1/17/01 CS 252/Patterson Lec 1.1
Technology Trends: Microprocessor Capacity
[Figure: transistors per chip vs. year, following Moore’s Law, with the “Graduation Window” marked]
• Alpha 21264: 15 million
• Pentium Pro: 5.5 million
• PowerPC 620: 6.9 million
• Alpha 21164: 9.3 million
• Sparc Ultra: 5.2 million
CMOS improvements:
• Die size: 2X every 3 yrs
• Line width: halve / 7 yrs
Memory Capacity (Single Chip DRAM)
year   size (Mb)   cycle time
1980   0.0625      250 ns
1983   0.25        220 ns
1986   1           190 ns
1989   4           165 ns
1992   16          145 ns
1996   64          120 ns
Technology Trends (Summary)
• Logic: 2x in 3 years
• DRAM: capacity 4x in 3 years; speed (latency) 2x in 10 years
• Disk: capacity 4x in 3 years; speed (latency) 2x in 10 years
Processor Performance Trends
[Figure: relative performance (log scale, 0.1 to 1000) vs. year (1965 to 2000) for supercomputers, mainframes, minicomputers, and microprocessors]
Processor Performance (1.35X before, 1.55X now)
[Figure: processor performance vs. year, annotated with a growth rate of 1.54X/yr]
Performance Trends (Summary)
• Workstation performance (measured in SPECmarks) improves roughly 50% per year (2X every 18 months)
• Improvement in cost performance estimated at 70% per year
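The growth rates above compound, so it can help to convert between an annual rate and a doubling time. The following is a minimal sketch; the function names are illustrative and not from the slides.

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a given annual growth rate (0.5 means 50%/yr)."""
    return math.log(2) / math.log(1 + annual_growth)

def growth_factor(annual_growth, years):
    """Total improvement factor after compounding for `years` years."""
    return (1 + annual_growth) ** years

if __name__ == "__main__":
    print(f"50%/yr doubles every {12 * doubling_time_years(0.5):.1f} months")
    print(f"50%/yr over 6 years: {growth_factor(0.5, 6):.1f}x")
    print(f"70%/yr doubles every {12 * doubling_time_years(0.7):.1f} months")
```

Under this arithmetic, 50% per year corresponds to doubling roughly every 20.5 months, so "2X every 18 months" is best read as a rounded figure; an exact 18-month doubling would be closer to 59% per year.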
Computer Architecture Topics
• Input/Output and Storage: Disks, WORM, Tape; RAID
• DRAM: Emerging Technologies, Interleaving, Bus protocols
• Memory Hierarchy: L2 Cache, L1 Cache; Coherence, Bandwidth, Latency
• VLSI
• Instruction Set Architecture: Addressing, Protection, Exception Handling
• Pipelining and Instruction Level Parallelism: Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, DSP
Computer Architecture Topics
[Figure: processor (P) and memory (M) nodes joined by switches (S) and an interconnection network]
• Multiprocessors: Shared Memory, Message Passing, Data Parallelism
• Networks and Interconnections: Network Interfaces; Topologies, Routing, Bandwidth, Latency, Reliability
• Processor-Memory-Switch
Course Focus
Computer Architecture:
• Instruction Set Design
• Organization
• Hardware
Surrounding areas: Technology, Parallelism, Applications, Operating Systems, Programming Languages, History, Interface Design (ISA), Measurement & Evaluation
Measurement Tools
• Benchmarks, Traces, Mixes
• Hardware: Cost, delay, area, power estimation
• Simulation (many levels)
  – ISA, RT, Gate, Circuit
• Queueing Theory
• Rules of Thumb
• Fundamental “Laws”/Principles
Which is faster?
Plane               DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747          6.5 hours     610 mph    470          286,700
BAC/Sud Concorde    3 hours       1350 mph   132          178,200
• Time to run the task (ExTime)
  – Execution time, response time, latency
• Tasks per day, hour, week, sec, ns … (Performance)
  – Throughput, bandwidth
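The two planes win on different metrics, which a few lines of arithmetic make concrete. This is a small sketch using the figures from the table; the dictionary layout and function names are illustrative choices, not part of the slides.

```python
# Figures taken from the table above.
planes = {
    "Boeing 747": {"hours": 6.5, "mph": 610, "passengers": 470},
    "Concorde": {"hours": 3.0, "mph": 1350, "passengers": 132},
}

def throughput_pmph(plane):
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return plane["passengers"] * plane["mph"]

def times_faster(time_slow, time_fast):
    """'n times faster' in the response-time sense: ratio of execution times."""
    return time_slow / time_fast

if __name__ == "__main__":
    b747, concorde = planes["Boeing 747"], planes["Concorde"]
    latency_ratio = times_faster(b747["hours"], concorde["hours"])
    throughput_ratio = throughput_pmph(b747) / throughput_pmph(concorde)
    print("Concorde latency advantage:", round(latency_ratio, 2))
    print("747 throughput advantage:", round(throughput_ratio, 2))
```

The Concorde is a little over twice as fast in response time, while the 747 delivers about 1.6 times the throughput, so "faster" depends on which metric matters for the task.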
Definitions
• Performance is in units of things per sec
  – bigger is better
• If we are primarily concerned with response time
  – \( \text{performance}(X) = \dfrac{1}{\text{execution\_time}(X)} \)
• "X is n times faster than Y" means
  \[ n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{Execution\_time}(Y)}{\text{Execution\_time}(X)} \]
Cycles Per Instruction
IC = Instruction Count
CPI = Clock cycles Per Instruction
Cycles Per Instruction
We may separate the contribution of each type of instruction to the execution time by defining:
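The defining equation did not survive on this slide. A standard formulation, offered here as an assumption about what the slide showed, weights each instruction class by how often it occurs. The symbols \( \mathrm{IC}_i \), \( \mathrm{CPI}_i \), and \( F_i \) are our notation for the count, cycle cost, and frequency of instruction class \( i \).

```latex
\text{CPU clock cycles} = \sum_{i=1}^{n} \mathrm{CPI}_i \times \mathrm{IC}_i
\qquad
\mathrm{CPI} = \frac{\text{CPU clock cycles}}{\mathrm{IC}}
             = \sum_{i=1}^{n} \mathrm{CPI}_i \times F_i ,
\quad \text{where } F_i = \frac{\mathrm{IC}_i}{\mathrm{IC}}
```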
Example: Calculating CPI
Base Machine (Reg / Reg)
Op       Freq   Cycles   CPI(i)   (% Time)
ALU      50%    1        0.5      (33%)
Load     20%    2        0.4      (27%)
Store    10%    2        0.2      (13%)
Branch   20%    2        0.4      (27%)
Total                    1.5
Freq is the typical mix of instruction types in a program.
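The table's arithmetic can be checked directly. A minimal sketch follows, with the instruction mix copied from the table; the function names are illustrative.

```python
# Instruction mix from the table above: op -> (frequency, cycles per instruction)
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

def overall_cpi(mix):
    """Weighted sum of per-class cycle counts: sum of freq_i * cycles_i."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_shares(mix):
    """Fraction of execution time spent in each instruction class."""
    total = overall_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}

if __name__ == "__main__":
    print(f"CPI = {overall_cpi(mix):.2f}")
    for op, share in time_shares(mix).items():
        print(f"{op:6s} {share:.0%}")
```

Note that ALU operations are half of all instructions but only about a third of the execution time, because every other class costs two cycles.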
Aspects of CPU Performance (CPU Law)
\[ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]
               Inst Count   CPI   Clock Rate
Program        X
Compiler       X            (X)
Inst. Set      X            X
Organization                X     X
Technology                        X
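The CPU law is a three-factor product, so a change that helps one factor can hurt another. The sketch below uses entirely hypothetical numbers (instruction counts, CPI values, and a 500 MHz clock are assumptions chosen for illustration) to show how a trade-off might be evaluated.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * seconds_per_cycle

if __name__ == "__main__":
    # Hypothetical workload: 1 billion instructions, CPI 1.5, 500 MHz clock.
    base = cpu_time_seconds(1e9, 1.5, 500e6)
    # Hypothetical compiler change: 10% fewer instructions, CPI rises to 1.6.
    tuned = cpu_time_seconds(0.9e9, 1.6, 500e6)
    print(f"base:  {base:.2f} s")
    print(f"tuned: {tuned:.2f} s")
```

In this made-up case the lower instruction count outweighs the higher CPI, but only the product tells you that; neither factor alone does.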
Amdahl's Law
Speedup due to enhancement E:
\[ \text{Speedup}(E) = \frac{\text{ExTime without } E}{\text{ExTime with } E} \]
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
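Amdahl’s Law
\[ \text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right] \]
\[ \text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} \]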
Amdahl’s Law
• Example: Floating point instructions improved to run 2X; but only 10% of actual instructions are FP
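The result of this example is not shown on the slide. A minimal sketch of the calculation follows, assuming the 10% figure can be read as 10% of execution time (Amdahl's Law is stated in terms of time, not instruction count); the function name is illustrative.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction F of the task is sped up by a factor S."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

if __name__ == "__main__":
    # The slide's example: FP runs 2X faster, FP is 10% of the work.
    print(f"overall speedup: {amdahl_speedup(0.10, 2.0):.3f}")
    # Upper bound as the FP speedup grows without limit: 1 / (1 - F).
    print(f"limit: {1.0 / (1.0 - 0.10):.3f}")
```

Under that assumption the new execution time is 0.95 of the old one, an overall speedup of about 1.053, and even an infinitely fast FP unit could not do better than about 1.11.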
Metrics of Performance
Levels of the system, top to bottom: Application; Programming Language; Compiler; ISA; Datapath, Control; Function Units; Transistors, Wires, Pins
Metrics, from application level down to hardware level:
• Answers per month
• Operations per second
• (millions) of Instructions per second: MIPS
• (millions) of (FP) operations per second: MFLOP/s
• Megabytes per second
• Cycles per second (clock rate)
SPEC: System Performance Evaluation Cooperative
• First Round 1989
  – 10 programs yielding a single number (“SPECmarks”)
• Second Round 1992
  – SPECint92 (6 integer programs) and SPECfp92 (14 floating point programs)
    » Compiler Flags unlimited. March 93 of DEC 4000 Model 610:
      spice: unix.c:/def=(sysv,has_bcopy,”bcopy(a,b,c)=memcpy(b,a,c)”
      wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200
      nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas
• Third Round 1995
  – new set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point)
  – “benchmarks useful for 3 years”
  – Single flag setting for all programs: SPECint_base95, SPECfp_base95
How to Summarize Performance
• Arithmetic mean (weighted arithmetic mean) tracks execution time: \( \frac{1}{n}\sum_i T_i \) or \( \sum_i W_i T_i \)
• Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time: \( n / \sum_i (1/R_i) \) or \( 1 / \sum_i (W_i/R_i) \), with the weights \( W_i \) summing to 1
• Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10)
• But do not take the arithmetic mean of normalized execution times; use the geometric mean: \( \left(\prod_i T_i\right)^{1/n} \), where the \( T_i \) are the normalized times
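The warning about normalized times is easiest to see with numbers. The sketch below uses made-up execution times for two hypothetical machines, A and B; the data and function names are assumptions for illustration only.

```python
import math

def arithmetic_mean(times):
    """Plain average; appropriate for execution times."""
    return sum(times) / len(times)

def harmonic_mean(rates):
    """n divided by the sum of reciprocals; appropriate for rates."""
    return len(rates) / sum(1.0 / r for r in rates)

def geometric_mean(values):
    """n-th root of the product, computed via logs to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

if __name__ == "__main__":
    # Hypothetical execution times (seconds) of two programs on machines A and B.
    times_a = [1.0, 1000.0]
    times_b = [10.0, 100.0]

    # Normalize each machine's times to the other machine.
    b_rel_a = [b / a for a, b in zip(times_a, times_b)]  # [10.0, 0.1]
    a_rel_b = [a / b for a, b in zip(times_a, times_b)]  # [0.1, 10.0]

    # Arithmetic mean of normalized times: whichever machine is NOT the
    # reference looks about 5x slower, so the verdict flips with the reference.
    print(arithmetic_mean(b_rel_a), arithmetic_mean(a_rel_b))
    # Geometric mean gives the same verdict from either reference.
    print(geometric_mean(b_rel_a), geometric_mean(a_rel_b))
```

The geometric mean is consistent across reference machines, but it no longer tracks total execution time, which is why the slide reserves the arithmetic mean for raw times.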
Performance Evaluation
• “For better or worse, benchmarks shape a field”
• Good products are created when we have:
  – Good benchmarks
  – Good ways to summarize performance
• Because sales depend in part on performance relative to the competition, companies invest in improving the product as reported by the performance summary
• If the benchmarks or summary are inadequate, companies must choose between improving the product for real programs and improving it to get more sales; sales almost always wins!
• Execution time is the measure of computer performance!
Instruction Set Architecture (ISA)
[Figure: the instruction set as the interface between software (above) and hardware (below)]
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
[Figure: one interface with several uses above it and implementations imp 1, imp 2, imp 3 succeeding one another over time]
Summary, #1
• Designing to Last through Trends
  – Logic: 2x in 3 years
  – DRAM: capacity 4x in 3 years; speed 2x in 10 years
  – Disk: capacity 4x in 3 years; speed 2x in 10 years
• 6 yrs to graduate => 16X CPU speed, DRAM/Disk size
• Time to run the task
  – Execution time, response time, latency
• Tasks per day, hour, week, sec, ns, …
  – Throughput, bandwidth
• “X is n times faster than Y” means
  \[ \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = n \]
Summary, #2
• Amdahl’s Law:
  \[ \text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} \]
• CPI Law:
  \[ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]
• Execution time is the REAL measure of computer performance!
• Good products created when have:
  – Good benchmarks, good ways to summarize performance
• Die Cost goes roughly with \( (\text{die area})^4 \)
• Can PC industry support engineering/research investment?