01 Finally, Computer Architecture!

program operating system Computer Architecture! organization digital logic

? Computer Architecture!

Computer Architecture! problem → algorithm → program → runtime system (VM, OS, MM) → ISA (architecture) → microarchitecture → logic → circuits → electrons

Computer Architecture! challenging runtime system (VM, OS, MM) ISA (architecture) microarchitecture logic

Computer Architecture! challenging for me as well…runtime system (VM, OS, MM) ISA (architecture) microarchitecture logic

for me?

Instructor • Kai Bu 卜凯 • Assistant Professor, College of CS, ZJU • Ph.D. from Hong Kong Poly. U, 2013 • Research interests: networking, security (e.g., software-defined networking, RFID) • research interns wanted • http://list.zju.edu.cn/kaibu

How I Prepared (and am still preparing) read textbooks

How I Prepared (and am still preparing) watch video lectures

How I Prepared (and am still preparing) practice English

and even read this book

What’s to deliver?

How does a multi-core system work?

Know not only how but also why

Understand the principles

Explore the tradeoffs of different designs and ideas

Thought-provoking!

Textbook • Computer Architecture: A Quantitative Approach, 5th edition • John L. Hennessy, David A. Patterson

Why This Book? • Quantitative approach: performance driven • Know not only how but also why • As in this book: Operating Systems: Three Easy Pieces

Course Website http://list.zju.edu.cn/kaibu/comparch2017/

Syllabus • Reference syllabus by Prof. Jiang http://list.zju.edu.cn/kaibu/comparch2015/Syllabus_2013spring.pdf • Reference schedule http://list.zju.edu.cn/kaibu/comparch2016fall/schedule.html

Teaching Components • Lectures • Labs • Research

Left off in Organization: single-cycle instruction execution

Now in Architecture: pipelining • Divide instruction execution into stages

Pipelining: start executing one instruction before completing the previous one
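To put a number on why overlapping instructions helps, here is a minimal sketch. It assumes an idealized pipeline in which every stage takes exactly one clock cycle and no hazards occur; the function names are illustrative and do not come from the slides.

```python
def cycles_unpipelined(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions: int, num_stages: int) -> int:
    """The first instruction fills the pipeline; each later one finishes one cycle after it."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


def speedup(num_instructions: int, num_stages: int) -> float:
    """Ideal speedup; it approaches num_stages as the instruction count grows."""
    return cycles_unpipelined(num_instructions, num_stages) / cycles_pipelined(
        num_instructions, num_stages
    )
```

Under these assumptions, 5 instructions on a 5-stage pipeline take 9 cycles instead of 25. Real pipelines fall short of the ideal because of hazards and stalls.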

MIPS Instruction • at most 5 clock cycles per instruction • IF ID EX MEM WB

MIPS Instruction • IF: IR ← Mem[PC]; NPC ← PC + 4;

MIPS Instruction • ID: A ← Regs[rs]; B ← Regs[rt]; Imm ← sign-extended immediate field of IR (lower 16 bits)

MIPS Instruction • EX: memory reference: ALUOutput ← A + Imm; register-register ALU: ALUOutput ← A func B; register-immediate ALU: ALUOutput ← A op Imm; branch: ALUOutput ← NPC + (Imm<<2); Cond ← (A == 0);

MIPS Instruction • MEM: load: LMD ← Mem[ALUOutput]; store: Mem[ALUOutput] ← B; branch: if (Cond) PC ← ALUOutput;

MIPS Instruction • WB: Regs[rd] ← ALUOutput; Regs[rt] ← LMD;
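The register transfers listed for the five steps can be traced in software. The sketch below is a toy model, not the lab's Verilog design: the dictionary-based instruction format, the `execute` function, and the four supported operations (`lw`, `sw`, `add`, `beqz`) are assumptions chosen only for illustration.

```python
def execute(instr, regs, mem, pc):
    """Run one already-decoded instruction through IF, ID, EX, MEM, WB; return the next PC."""
    op = instr["op"]

    # IF: IR <- Mem[PC] is modeled by receiving `instr`; NPC <- PC + 4
    npc = pc + 4

    # ID: A <- Regs[rs]; B <- Regs[rt]; Imm <- sign-extended immediate
    a = regs[instr.get("rs", 0)]
    b = regs[instr.get("rt", 0)]
    imm = instr.get("imm", 0)

    # EX: one ALU result per instruction class
    cond = False
    if op in ("lw", "sw"):      # ALUOutput <- A + Imm
        alu_output = a + imm
    elif op == "add":           # ALUOutput <- A func B
        alu_output = a + b
    elif op == "beqz":          # ALUOutput <- NPC + (Imm << 2); Cond <- (A == 0)
        alu_output = npc + (imm << 2)
        cond = (a == 0)
    else:
        raise ValueError(f"unsupported op: {op}")

    # MEM: LMD <- Mem[ALUOutput]; Mem[ALUOutput] <- B; if (Cond) PC <- ALUOutput
    lmd = None
    if op == "lw":
        lmd = mem[alu_output]
    elif op == "sw":
        mem[alu_output] = b
    next_pc = alu_output if cond else npc

    # WB: Regs[rd] <- ALUOutput; Regs[rt] <- LMD
    if op == "add":
        regs[instr["rd"]] = alu_output
    elif op == "lw":
        regs[instr["rt"]] = lmd
    return next_pc
```

A possible use: with `regs = [0] * 32`, `regs[2] = 100`, and `mem = {108: 42}`, calling `execute({"op": "lw", "rs": 2, "rt": 1, "imm": 8}, regs, mem, 0)` loads `mem[108]` into `regs[1]` and returns the next PC.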

Structural Hazard • Example: 1 mem port • mem conflict: data access (MEM of Load) vs instr fetch (IF of Instr i+3)

Data Hazard • DADD R1, R2, R3 • DSUB R4, R1, R5 • AND R6, R1, R7 • OR R8, R1, R9 (no hazard; 1st half cycle: write, 2nd half cycle: read) • XOR R10, R1, R11
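The dependences on R1 in this sequence can be checked mechanically. The sketch below assumes a 5-stage pipeline without forwarding in which the register file is written in the first half of a cycle and read in the second half, so only the two instructions immediately after a producer can read a stale value. The `Instr` type and `raw_hazards` function are hypothetical names for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: tuple


def raw_hazards(program):
    """Return (producer_index, consumer_index, register) for reads that arrive too early."""
    hazards = []
    for i, producer in enumerate(program):
        # Distance 1 or 2: the consumer's ID stage comes before the producer's WB stage.
        # Distance 3: write and read share a cycle, which the split-cycle register file allows.
        for j in range(i + 1, min(i + 3, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break  # a later write to the same register hides the older value
    return hazards


program = [
    Instr("DADD", "R1", ("R2", "R3")),
    Instr("DSUB", "R4", ("R1", "R5")),
    Instr("AND", "R6", ("R1", "R7")),
    Instr("OR", "R8", ("R1", "R9")),
    Instr("XOR", "R10", ("R1", "R11")),
]
print(raw_hazards(program))  # [(0, 1, 'R1'), (0, 2, 'R1')]
```

Under these assumptions only DSUB and AND need a stall or forwarding; OR and XOR read R1 safely.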

Memory Hierarchy

Cache Performance • Memory stall cycles: the number of cycles during which the processor is stalled waiting for a mem access • Miss rate: number of misses over number of accesses • Miss penalty: the cost per miss (number of extra …)
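These three quantities combine in a simple way: stall cycles are the number of misses times the miss penalty. The following sketch shows that arithmetic; the function and parameter names are assumptions made for illustration, and the inputs in the usage note are made-up numbers, not measurements.

```python
def miss_rate(num_misses: int, num_accesses: int) -> float:
    """Fraction of memory accesses that miss in the cache."""
    return num_misses / num_accesses


def memory_stall_cycles(num_accesses: int, rate: float, miss_penalty: float) -> float:
    """Stall cycles = number of misses x miss penalty = accesses x miss rate x miss penalty."""
    return num_accesses * rate * miss_penalty
```

For instance, 1,000 accesses with 40 misses give a miss rate of 0.04; with a hypothetical 100-cycle miss penalty, the processor stalls for about 4,000 cycles.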

Block Placement

Multilevel Cache • Two-level cache: add another level of cache between the original cache and memory • L1: small enough to match the clock cycle time of the fast processor • L2: large enough to capture many accesses that would go to main memory, lessening miss penalty
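How much an L2 cache lessens the miss penalty can be estimated with the average memory access time (AMAT) formula. This is a sketch under the usual textbook model: an L1 miss pays the L2 hit time, plus the main-memory penalty when L2 also misses. All parameter names and the sample numbers in the usage note are assumptions.

```python
def amat_one_level(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """AMAT = hit time + miss rate x miss penalty."""
    return hit_time + miss_rate * miss_penalty


def amat_two_level(
    l1_hit_time: float,
    l1_miss_rate: float,
    l2_hit_time: float,
    l2_local_miss_rate: float,
    memory_penalty: float,
) -> float:
    """The L1 miss penalty becomes the AMAT of the L2 cache."""
    l1_miss_penalty = l2_hit_time + l2_local_miss_rate * memory_penalty
    return amat_one_level(l1_hit_time, l1_miss_rate, l1_miss_penalty)
```

With hypothetical values (1-cycle L1 hit, 4% L1 miss rate, 10-cycle L2 hit, 50% L2 local miss rate, 200-cycle memory penalty), the two-level AMAT is 5.4 cycles, compared with 9 cycles when every L1 miss goes straight to memory.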

Virtual Memory Program uses • discontiguous memory locations • secondary/non-memory storage

Virtual Memory Program thinks it has • contiguous memory locations • a larger physical memory

Virtual Memory • Paged virtual memory: page = fixed-size block • Segmented virtual memory: segment = variable-size block

Virtual Memory • Paged virtual memory: page address = page # + offset • Segmented virtual memory: segment address = seg # + offset
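For paged virtual memory, splitting an address into page number and offset is plain bit manipulation. The sketch below assumes a power-of-two page size, with 4 KiB as an illustrative default; the slides do not state a page size.

```python
def split_virtual_address(vaddr: int, page_size: int = 4096):
    """Return (page_number, offset) for a paged virtual address."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a power of two")
    offset_bits = page_size.bit_length() - 1   # 12 bits for 4 KiB pages
    page_number = vaddr >> offset_bits         # upper bits select the page
    offset = vaddr & (page_size - 1)           # lower bits locate the byte within the page
    return page_number, offset
```

With the assumed 4 KiB pages, address `0x12345` splits into page number `0x12` and offset `0x345`. A segmented address cannot be split this way because segments vary in size, so the segment number and offset are carried as two separate values.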

Address Translation • Example: Opteron data TLB • Steps 1&2: send the virtual address to all tags • Step 2: check the type of mem access against protection info in TLB
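The tag match and protection check can be mimicked with a toy TLB. This is only a sketch: a Python dictionary stands in for the parallel comparison against all tags, and the entry layout, the exception names, and the 4 KiB page size are assumptions rather than details of the Opteron.

```python
class TLBMiss(Exception):
    """No TLB entry matches the virtual page number."""


class ProtectionFault(Exception):
    """The access type is not permitted by the matching entry."""


def translate(tlb, vaddr, access, page_size=4096):
    """Translate a virtual address using tlb: {vpn: (physical_frame, allowed_accesses)}."""
    offset_bits = page_size.bit_length() - 1
    vpn = vaddr >> offset_bits
    offset = vaddr & (page_size - 1)

    entry = tlb.get(vpn)                 # compare the virtual page number against the tags
    if entry is None:
        raise TLBMiss(hex(vaddr))

    frame, allowed = entry
    if access not in allowed:            # check access type against protection info
        raise ProtectionFault(f"{access!r} not allowed at {hex(vaddr)}")

    return (frame << offset_bits) | offset   # physical frame combined with page offset
```

For example, with `tlb = {0x12: (0x7, {"r"})}`, a read of `0x12345` translates to `0x7345`, while a write to the same address raises `ProtectionFault`.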

Virtual Memory + Caches

Disk http://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/images/Chapter10/10_01_DiskMechanism.jpg

Disk Arrays • Disk arrays with redundant disks to tolerate faults • If a single disk fails, the lost information is reconstructed from redundant information • Striping: simply spreading data over multiple disks • RAID: redundant array of inexpensive/independent disks
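One common form of redundant information is XOR parity, as used by several RAID levels. The slide does not say which scheme it has in mind, so treat the following as an assumed illustration: data blocks are striped across disks, one extra block holds their XOR, and any single lost block can be rebuilt from the rest.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity block."""
    return xor_blocks([*surviving_blocks, parity])


# Three data disks plus one parity disk (block contents are made up).
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# Suppose the second disk fails: its block is recovered from the others.
recovered = reconstruct([data[0], data[2]], parity)
assert recovered == data[1]
```

The reconstruction works because XOR-ing a value twice cancels it, which leaves only the missing block.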

RAID

centralized shared-memory • eight or fewer cores

centralized shared-memory • Share a single centralized memory that all processors have equal access to

centralized shared-memory • All processors have uniform latency from memory • Uniform memory access (UMA) multiprocessors

distributed shared memory • more processors • physically distributed memory

distributed shared memory • Distributing mem among the nodes increases bandwidth & reduces local-mem latency

distributed shared memory • NUMA: nonuniform memory access; access time depends on data word loc in mem

distributed shared memory • Disadvantages: more complex inter-processor communication; more complex software to handle distributed mem

Cache Coherence Problem • A memory system is coherent if any read of a data item returns the most recently written value of that data item • Two critical aspects: coherence defines what values can be returned by a read; consistency determines when a written value will be returned by a read
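A toy model makes the problem concrete. In the sketch below, each core has a private cache over a shared memory. The `Core` class, the write-through policy, and the invalidate-on-write fix are all assumptions introduced for illustration; the slide itself only defines coherence and does not name a protocol.

```python
class Core:
    """A core with a private cache in front of a shared memory (a plain dict)."""

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def read(self, addr):
        if addr not in self.cache:          # miss: fetch the value from shared memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def write(self, addr, value, peers=()):
        self.cache[addr] = value
        self.memory[addr] = value           # write-through to shared memory
        for peer in peers:                  # optional write-invalidate step
            peer.cache.pop(addr, None)


memory = {"X": 0}
a, b = Core(memory), Core(memory)

b.read("X")                       # B caches X = 0
a.write("X", 1)                   # A updates X without telling B
stale = b.read("X")               # B still sees 0: the memory system is not coherent

a.write("X", 2, peers=[b])        # invalidating B's copy forces a fresh fetch
fresh = b.read("X")               # B now sees 2
```

The first read by B returns a value that is no longer the most recently written one, which is exactly what the definition of coherence rules out.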

Teaching Components • Lectures • Labs • Research

Labs • 6 lab sessions • Pipeline implementation • Cache implementation

Labs • Lab 1 warmup: Spartan 3E and ISE environment; update Verilog code of multi-cycle CPU to 3E board; add one new branch instruction; reference code: Spartan 3E Display: http://list.zju.edu.cn/kaibu/comparch/lab1-Spartan3E-Display.rar Spartan Simulation: http://list.zju.edu.cn/kaibu/comparch/spartansimulation.txt

Labs • Lab 2: implement 5-stage pipelined CPU with 15 MIPS instructions; • Lab 3: implement stall technique against pipelining hazards; • Lab 4: implement forwarding paths toward faster CPU; • Lab 5: implement a pipelined CPU with 31 MIPS instructions; use predict-not-taken policy to solve control hazard; • Lab 6

Labs • Call for volunteer lab assistants: help tutor & check the demo during lab sessions; get bonus credit

Teaching Components • Lectures • Labs • Research

Why do you care?

Waive lab demos & reports

More than that?

Learn to learn things differently

Know not only how but also why

Read this book and you’ll see: Operating Systems: Three Easy Pieces http://pages.cs.wisc.edu/~remzi/OSTEP/

Research Warm-up • Requirements: 1. Find a research topic you are interested in, e.g., computer architecture, computer network, network security; 2. Read the latest papers from recent top conferences: http://www.ccf.org.cn/sites/ccf/paiming.jsp 3. Write a report/paper and prepare a presentation.

Research Warm-up • Notes: The paper should cover 1. What is the research problem? 2. Why is it important? 3. What are the solutions? 4. Any limitations? 5. What would you do? 6. What do the experiments show? More on http://list.zju.edu.cn/kaibu/comparch2016/research.html

Call for Research Interns • and SRTP/FYP advisees • Networking, Security: RFID, SDN, etc.; more on http://list.zju.edu.cn/kaibu/publication.html • And, of course, whatever else you are interested in

Grade?

Grading (hw+quiz by research warmup) • 4% Class participation & performance • 16% Homework • 8% Pop quiz • 32% Lab assignments • 40% Final exam (closed-book + memo) • Bonus 5%: Research Warm-up, active class participation

How will I teach?

What Students Expect from Teachers • Fun • Humor • Expertise • Easy exam • High grades • …

I wish I knew someone like this, too…

Teaching Plan • Keep it simple • Focus on the core concepts • Try to help you understand more easily

#What’s More to Share • helpful/inspiring resources • #The 3 Secrets of Highly Successful Graduates by Reid Hoffman

How will you contribute?

Thanks In Advance • Study group • Lab assistants • Research interns • … • AT LEAST: submit assignments & lab reports; show up to final exam

QQ Group: 533944879

Who’s Who

Ready?

#The 3 Secrets of Highly Successful Graduates