Memory
Hakim Weatherspoon
CS 3410, Spring 2012
Computer Science, Cornell University
See: P&H Appendix C.8, C.9

Big Picture: Building a Processor
[Figure: datapath of a single-cycle processor. Labeled blocks: PC, instruction memory (inst), register file, ALU, data memory (addr, din, dout), control, immediate extend (imm), compare (=?, cmp), and the +4 / offset / target logic that selects the new PC.]
A single-cycle processor

Goals for today
Review
• Finite State Machines
Memory
• Register Files
• Tri-state devices
• SRAM (Static RAM, random access memory)
• DRAM (Dynamic RAM)

Which statement(s) is/are true?
(A) In a Moore Machine output depends on both current state and input
(B) In a Mealy Machine output depends on current state and input
(C) In a Mealy Machine output depends on next state and input
(D) All the above are true
(E) None are true

Mealy Machine
General Case: Mealy Machine
[Figure: registers hold the current state; combinational logic takes the current state and the input and produces the output and the next state.]
Outputs and next state depend on both current state and input

Moore Machine
Special Case: Moore Machine
[Figure: registers hold the current state; one block of combinational logic computes the output from the current state alone, and another computes the next state from the current state and the input.]
Outputs depend only on current state
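The difference between the two machine types can be sketched in a few lines of Python. This is only an illustration: the class names, the `step` method, and the "two consecutive 1s" detector are hypothetical choices made here, not something taken from the slides.

```python
class MealyMachine:
    """Output is a function of the current state AND the current input."""

    def __init__(self, start, next_state, output):
        self.state = start
        self.next_state = next_state  # f(state, inp) -> state
        self.output = output          # g(state, inp) -> out

    def step(self, inp):
        out = self.output(self.state, inp)
        self.state = self.next_state(self.state, inp)
        return out


class MooreMachine:
    """Output is a function of the current state only."""

    def __init__(self, start, next_state, output):
        self.state = start
        self.next_state = next_state  # f(state, inp) -> state
        self.output = output          # g(state) -> out

    def step(self, inp):
        out = self.output(self.state)  # the input is not consulted here
        self.state = self.next_state(self.state, inp)
        return out


# Hypothetical example: flag two consecutive 1s on the input.
# Mealy version: the state is simply the previous input bit.
mealy = MealyMachine(
    start=0,
    next_state=lambda s, x: x,
    output=lambda s, x: int(s == 1 and x == 1),
)

# Moore version: the state counts consecutive 1s, saturating at 2.
moore = MooreMachine(
    start=0,
    next_state=lambda s, x: min(s + 1, 2) if x else 0,
    output=lambda s: int(s == 2),
)

inputs = [1, 1, 0, 1, 1, 1]
mealy_out = [mealy.step(x) for x in inputs]
moore_out = [moore.step(x) for x in inputs]
```

In this sketch the Moore output shows the same pattern as the Mealy output one clock cycle later, because the Moore machine can only report what is already stored in its state.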

Example: Digital Door Lock
Inputs:
• keycodes from keypad
• clock
Outputs:
• “unlock” signal
• display how many keys pressed so far

Door Lock: Inputs
Assumptions:
• signals are synchronized to clock
• Password is B-A-B

K  A  B  Meaning
0  0  0  Ø (no key)
1  1  0  ‘A’ pressed
1  0  1  ‘B’ pressed

Door Lock: Outputs
Assumptions:
• High pulse on U unlocks door
[Figure: the 4-bit value D3 D2 D1 D0 feeds an LED decoder (4 bits in, 8 lines out) that drives the display; U is the unlock signal.]

Door Lock: Simplified State Diagram
[Figure: state diagram with states Idle (”0”), G1 (”1”), G2 (”2”), G3 (”3”, U), B1 (”1”), and B2 (”2”). Each state loops to itself on Ø; the remaining transitions are labeled “A”, “B”, “else”, and “any”.]

Door Lock: Simplified State Diagram
Output table:

Cur. State  Output
Idle        “0”
G1          “1”
G2          “2”
G3          “3”, U
B1          “1”
B2          “2”

Door Lock: Simplified State Diagram
Next-state table:

Cur. State  Input  Next State
Idle        Ø      Idle
Idle        “B”    G1
Idle        “A”    B1
G1          Ø      G1
G1          “A”    G2
G1          “B”    B2
G2          Ø      G2
G2          “B”    G3
G2          “A”    Idle
G3          any    Idle
B1          Ø      B1
B1          K      B2
B2          Ø      B2
B2          K      Idle

(K means any key pressed.)
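The door lock's tables can be turned directly into a small table-driven simulation. The sketch below is a minimal illustration in Python, not the course's implementation. The names `NO_KEY`, `OUTPUT`, `ON_KEY`, `next_state`, and `run` are hypothetical, and it assumes that G2 holds its state when no key is pressed, which matches the Ø self-loops in the state diagram.

```python
NO_KEY = ""  # stands in for Ø (no key pressed during this clock cycle)

# Moore-style output table: state -> (digit shown on display, unlock signal U).
OUTPUT = {
    "Idle": (0, False),
    "G1": (1, False),
    "G2": (2, False),
    "G3": (3, True),
    "B1": (1, False),
    "B2": (2, False),
}

# Key-press transitions that depend on which key was pressed.
ON_KEY = {
    ("Idle", "B"): "G1", ("Idle", "A"): "B1",
    ("G1", "A"): "G2", ("G1", "B"): "B2",
    ("G2", "B"): "G3", ("G2", "A"): "Idle",
}


def next_state(state, key):
    """Return the next state for one clock cycle."""
    if state == "G3":
        return "Idle"   # G3 leaves on any input, including no key
    if key == NO_KEY:
        return state    # every other state waits while no key is pressed
    if state == "B1":
        return "B2"     # after a wrong key, any key just advances the count
    if state == "B2":
        return "Idle"
    return ON_KEY[(state, key)]


def run(keys, state="Idle"):
    """Feed one input per clock cycle; return (state, digit, unlock) per cycle."""
    trace = []
    for key in keys:
        state = next_state(state, key)
        trace.append((state, *OUTPUT[state]))
    return trace
```

For example, `run(["B", "A", "B"])` ends in G3 with the unlock signal high, while `run(["A", "B", "B"])` walks through B1 and B2 and returns to Idle without ever unlocking. The "bad" states show the same digits as the "good" ones, so the display does not reveal which key was wrong.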

State Table Encoding

State encoding and outputs:

Cur. State  S2 S1 S0  D3 D2 D1 D0  U  Output
Idle        0  0  0   0  0  0  0   0  “0”
G1          0  0  1   0  0  0  1   0  “1”
G2          0  1  0   0  0  1  0   0  “2”
G3          0  1  1   0  0  1  1   1  “3”, U
B1          1  0  0   0  0  0  1   0  “1”
B2          1  0  1   0  0  1  0   0  “2”

Input encoding:

K  A  B  Meaning
0  0  0  Ø (no key)
1  1  0  ‘A’ pressed
1  0  1  ‘B’ pressed

Encoded next-state table:

Cur. State  S2 S1 S0  Input  K  A  B  Next State  S’2 S’1 S’0
Idle        0  0  0   Ø      0  0  0  Idle        0   0   0
Idle        0  0  0   “B”    1  0  1  G1          0   0   1
Idle        0  0  0   “A”    1  1  0  B1          1   0   0
G1          0  0  1   Ø      0  0  0  G1          0   0   1
G1          0  0  1   “A”    1  1  0  G2          0   1   0
G1          0  0  1   “B”    1  0  1  B2          1   0   1
G2          0  1  0   Ø      0  0  0  G2          0   1   0
G2          0  1  0   “B”    1  0  1  G3          0   1   1
G2          0  1  0   “A”    1  1  0  Idle        0   0   0
G3          0  1  1   any    x  x  x  Idle        0   0   0
B1          1  0  0   Ø      0  0  0  B1          1   0   0
B1          1  0  0   K      1  x  x  B2          1   0   1
B2          1  0  1   Ø      0  0  0  B2          1   0   1
B2          1  0  1   K      1  x  x  Idle        0   0   0

Door Lock: Implementation
[Figure: a 3-bit register clocked by clk holds the current state S2-0. Combinational logic computes the next state S’2-0 from S2-0 and the inputs K, A, B, and computes the outputs D3-0 (to the 4-bit LED decoder) and U.]
Strategy:
(1) Draw a state diagram (e.g. Moore Machine)
(2) Write output and next-state tables
(3) Encode states, inputs, and outputs as bits
(4) Determine logic equations for next state and outputs
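Step (4) can be checked mechanically. The sketch below proposes one possible set of output equations and compares them against the encoded output table for every state. The equations are derived here for illustration and are not taken from the slides; they treat the unused encodings 110 and 111 as don't-cares. The names `STATE_BITS`, `OUTPUTS`, and `output_logic` are hypothetical.

```python
# State encoding (S2, S1, S0) from the state table.
STATE_BITS = {
    "Idle": (0, 0, 0),
    "G1": (0, 0, 1),
    "G2": (0, 1, 0),
    "G3": (0, 1, 1),
    "B1": (1, 0, 0),
    "B2": (1, 0, 1),
}

# Required outputs (D3, D2, D1, D0, U) from the output table.
OUTPUTS = {
    "Idle": (0, 0, 0, 0, 0),
    "G1": (0, 0, 0, 1, 0),
    "G2": (0, 0, 1, 0, 0),
    "G3": (0, 0, 1, 1, 1),
    "B1": (0, 0, 0, 1, 0),
    "B2": (0, 0, 1, 0, 0),
}


def output_logic(s2, s1, s0):
    """Candidate output equations (an assumption, derived for this sketch)."""
    d3 = 0                 # the count never exceeds 3, so the top bits stay 0
    d2 = 0
    d1 = s1 | (s2 & s0)    # high for G2, G3, and B2
    d0 = s2 ^ s0           # high for G1, G3, and B1
    u = s1 & s0            # high only for G3 among the six used encodings
    return (d3, d2, d1, d0, u)


# Exhaustive check over the six states that are actually used.
mismatches = [
    name for name, bits in STATE_BITS.items()
    if output_logic(*bits) != OUTPUTS[name]
]
```

An empty `mismatches` list means the candidate equations agree with the table. The same approach works for the next-state bits, with the inputs K, A, B added as arguments.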

Administrivia
Make sure partner in same Lab Section this week
Lab 2 is out
Due in one week, next Monday, start early
Work alone
Save your work!
• Save often. Verify file is non-zero. Periodically save to Dropbox, email.
• Beware of Mac OS X 10.5 (Leopard) and 10.6 (Snow Leopard)
Use your resources
• Lab Section, Piazza.com, Office Hours, Homework Help Session,
• Class notes, book, Sections, CSUGLab
No Homework this week

Administrivia
Check online syllabus/schedule
• http://www.cs.cornell.edu/Courses/CS3410/2012sp/schedule.html
Slides and Reading for lectures
Office Hours
Homework and Programming Assignments
Prelims (in evenings):
• Tuesday, February 28th
• Thursday, March 29th
• Thursday, April 26th
Schedule is subject to change

Collaboration, Late, Re-grading Policies
“Black Board” Collaboration Policy
• Can discuss approach together on a “black board”
• Leave and write up solution independently
• Do not copy solutions
Late Policy
• Each person has a total of four “slip days”
• Max of two slip days for any individual assignment
• Slip days deducted first for any late assignment, cannot selectively apply slip days
• For projects, slip days are deducted from all partners
• 20% deducted per day late after slip days are exhausted
Regrade policy
• Submit written request to lead TA, and lead TA will pick a different grader
• Submit another written request, lead TA will regrade directly
• Submit yet another written request for professor to regrade.

Register File
• N read/write registers
• Indexed by register number
Implementation:
• D flip flops to store bits
• Decoder for each write port
• Mux for each read port
[Figure: dual-read-port, single-write-port 32 x 32 register file. Inputs: DW (32 bits), W (1 bit), RW, RA, RB (5 bits each). Outputs: QA, QB (32 bits each).]

Register File
• N read/write registers
• Indexed by register number
Implementation:
• D flip flops to store bits
• Decoder for each write port
• Mux for each read port
What happens if the same register is read and written during the same clock cycle?
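A behavioral model makes the read/write question concrete. The sketch below is a simplified Python model of a dual-read-port, single-write-port register file; the class and method names are hypothetical. It assumes the write takes effect at the clock edge that ends the cycle, as with edge-triggered D flip flops, so a read in the same cycle sees the old value. Real designs may choose a different behavior.

```python
class RegisterFile:
    """Toy model: reads are combinational, writes happen at the clock edge."""

    def __init__(self, num_regs=32, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * num_regs

    def read(self, ra, rb):
        # One mux per read port: select a register by its index.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, w, rw, dw):
        # The decoder picks register rw; it is updated only if write-enable W is set.
        if w:
            self.regs[rw] = dw & self.mask

    def cycle(self, ra, rb, w, rw, dw):
        """One full clock cycle: sample the read ports, then apply the write."""
        qa, qb = self.read(ra, rb)
        self.clock_edge(w, rw, dw)
        return qa, qb


rf = RegisterFile()
rf.cycle(ra=0, rb=0, w=1, rw=5, dw=42)              # write 42 into register 5
old_values = rf.cycle(ra=5, rb=5, w=1, rw=5, dw=7)  # read and write register 5
new_values = rf.read(5, 5)
```

Under these assumptions `old_values` holds the value stored before the edge and `new_values` holds the value written during that cycle.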

Tradeoffs
Register File tradeoffs
+ Very fast (a few gate delays for both read and write)
+ Adding extra ports is straightforward
– Doesn’t scale

Building Large Memories
Need a shared bus (or shared bit line)
• Many FFs/outputs/etc. connected to single wire
• Only one output drives the bus at a time

Tri-State Devices
Tri-State Buffers

E  D  Q
0  0  z
0  1  z
1  0  0
1  1  1

[Figure: tri-state buffer symbol with inputs E and D and output Q, and its transistor-level circuit between Vdd and Gnd.]

Shared Bus
[Figure: data inputs D0, D1, D2, D3, ..., D1023, each passing through a tri-state buffer enabled by S0, S1, S2, S3, ..., S1023, all driving one shared line.]
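The tri-state truth table and the one-driver-at-a-time rule can be sketched in Python as follows. The function names and the use of the string "z" for the high-impedance state are choices made for this illustration. Raising an error on contention is only a modeling convenience; in hardware, two enabled drivers with different values would fight over the wire.

```python
Z = "z"  # high impedance: the buffer is disconnected from the wire


def tri_state(e, d):
    """Tri-state buffer: pass D through when enabled, otherwise float."""
    return d if e else Z


def shared_line(data, selects):
    """Value on a shared line driven by many tri-state buffers."""
    driven = [
        tri_state(s, d) for d, s in zip(data, selects)
        if tri_state(s, d) != Z
    ]
    if len(driven) > 1:
        raise ValueError("bus contention: more than one output enabled")
    return driven[0] if driven else Z


# 1024 sources on one line, as in the shared bus figure; only source 3 is selected.
data = [i % 2 for i in range(1024)]
selects = [1 if i == 3 else 0 for i in range(1024)]
line_value = shared_line(data, selects)
```

With no select asserted the line floats at "z", which is why a bus normally needs exactly one enabled driver whenever its value is read.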

SRAM
Static RAM (SRAM)
• Essentially just SR latches + tri-state buffers
[Figure: 4 x 2 SRAM.]

SRAM Chip
[Figure: SRAM chip organization. Address bits A21-10 feed the row decoder; address bits A9-0 feed the column selector, sense amp, and I/O circuits, which connect to the shared data bus. Control inputs: CS and R/W.]
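The address split in the chip figure can be sketched as simple bit manipulation. The sketch assumes a 22-bit address in which A21-10 (12 bits) selects the row and A9-0 (10 bits) selects the column, as the figure labels suggest; the names `ROW_BITS`, `COL_BITS`, and `split_address` are hypothetical.

```python
ROW_BITS = 12  # A21-10 go to the row decoder
COL_BITS = 10  # A9-0 go to the column selector


def split_address(addr):
    """Split a 22-bit address into (row, column) indices."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address does not fit in 22 bits")
    row = addr >> COL_BITS               # upper 12 bits
    col = addr & ((1 << COL_BITS) - 1)   # lower 10 bits
    return row, col
```

For instance, `split_address((5 << 10) | 3)` gives row 5 and column 3. Splitting the address this way keeps the cell array roughly square, so neither decoder has to drive 2^22 separate lines.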

Typical SRAM Cell
[Figure: SRAM cell connected to the word line and to a pair of complementary bit lines, B and B̄.]
Each cell stores one bit, and requires 4 to 8 transistors (6 is typical)
Read:
• pre-charge B and B̄ to Vdd/2
• pull word line high
• cell pulls B or B̄ low, sense amp detects voltage difference
Write:
• pull word line high
• drive B and B̄ to flip cell

SRAM Modules and Arrays
[Figure: 1M x 4 SRAM modules arranged as banks (Bank 2, Bank 3, and Bank 4 are labeled). The banks share R/W and the address lines A21-0, each bank has its own CS, and the data bits are ordered msb to lsb.]

SRAM Summary
SRAM
• A few transistors (~6) per cell
• Used for working memory (caches)
• But for even higher density…

Dynamic RAM: DRAM
Dynamic-RAM (DRAM)
• Data values require constant refresh
[Figure: DRAM cell. A transistor controlled by the word line connects the bit line to a capacitor whose other terminal is tied to Gnd.]

DRAM vs. SRAM
Single transistor vs. many gates
• Denser, cheaper ($30/1 GB vs. $30/2 MB)
• But more complicated, and has analog sensing
Also needs refresh
• Read and write back…
• …every few milliseconds
• Organized in 2D grid, so can do rows at a time
• Chip can do refresh internally
Hence… slower and energy inefficient
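Why refresh is needed can be illustrated with a toy model of one DRAM row. Everything numeric here is made up for illustration: the leak factor, the sensing threshold, and the time step do not come from the slides or from any real device. The class name `DramRow` and its methods are hypothetical.

```python
class DramRow:
    """Toy DRAM row: each cell is a capacitor whose charge leaks over time."""

    LEAK = 0.8        # hypothetical fraction of charge kept per time step
    THRESHOLD = 0.5   # hypothetical sensing threshold

    def __init__(self, bits):
        self.charge = [1.0 if b else 0.0 for b in bits]

    def tick(self):
        # Charge leaks away a little on every time step.
        self.charge = [c * self.LEAK for c in self.charge]

    def read(self):
        # Analog sensing: compare each cell's charge to the threshold,
        # then write the sensed value back at full strength.
        bits = [1 if c > self.THRESHOLD else 0 for c in self.charge]
        self.charge = [1.0 if b else 0.0 for b in bits]
        return bits

    refresh = read  # a refresh is a read followed by the write-back


data = [1, 0, 1, 1]

# Without refresh, the stored 1s decay below the threshold and are lost.
neglected = DramRow(data)
for _ in range(4):
    neglected.tick()
lost = neglected.read()

# Refreshing every 2 steps restores the charge before it decays too far.
maintained = DramRow(data)
for step in range(10):
    maintained.tick()
    if step % 2 == 1:
        maintained.refresh()
kept = maintained.read()
```

In this sketch the neglected row reads back all zeros, while the refreshed row still holds its data. Since a whole row is sensed and written back at once, refreshing row by row is cheaper than refreshing each cell individually.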

Memory
Register File tradeoffs
+ Very fast (a few gate delays for both read and write)
+ Adding extra ports is straightforward
– Expensive, doesn’t scale
– Volatile
Volatile Memory alternatives: SRAM, DRAM, …
– Slower
+ Cheaper, and scales well
– Volatile
Non-Volatile Memory (NV-RAM): Flash, EEPROM, …
+ Scales well
– Limited lifetime; degrades after 100,000 to 1 M writes

Summary
We now have enough building blocks to build machines that can perform non-trivial computational tasks
Register File: Tens of words of working memory
SRAM: Millions of words of working memory
DRAM: Billions of words of working memory
NVRAM: long term storage (USB fob, solid state disks, BIOS, …)
Next time we will build a simple processor!