6.175 Constructive Computer Architecture Tutorial 5: Epochs
- Slides: 20
6.175: Constructive Computer Architecture
Tutorial 5: Epochs, Debugging, and Caches
Quan Nguyen (Troubled by the two biggest problems in computer science… and Comic Sans)
October 28, 2016 http://csg.csail.mit.edu/6.175 T05-1
Agenda
• Epochs: a review
• Debugging your processor ft. Piazza
• Caches: a primer
Review: 1-bit Distributed Epochs
[Figure: PC → Fetch → f2d → Decode → d2e → Execute, with epoch registers feEp = 0, fdEp = 0, dEp = 0, deEp = 1, eEp = 0 and two redirect paths back to the PC, one labeled "Delay: 0 cycle" and one labeled "Delay: 100 cycles"; the in-flight message is "Inst 1 redirect, ieEp = 0".]
• Decode redirects Inst 1 (ieEp = idEp = 0)
• Execute redirects Inst 1
• Correct-path Inst 2 (ieEp = 1, idEp = 0) issues
• Execute redirects Inst 2
• Inst 1 redirect arrives at Fetch (ieEp == feEp)
  - change PC to a wrong value
Review: Unbounded Global Epochs
[Figure: PC → Fetch → f2d → Decode → d2e → Execute, with global eEpoch and dEpoch registers; Decode sends a redirect PC, and Execute sends a redirect PC on a misprediction.]
• Both Decode and Execute can redirect the PC
  - Execute redirect should never be overruled
• Global epoch for each redirecting stage
  - eEpoch: incremented when redirect from Execute takes effect
  - dEpoch: incremented when redirect from Decode takes effect
  - Initially set all epochs to 0
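The epoch rules above can be sketched as a small behavioral model in Python. This is only an illustration under stated assumptions, not the course's BSV code: the class and method names are hypothetical, and the validity checks (Decode compares both epochs, Execute compares only eEpoch) are one common way to make sure an Execute redirect is never overruled.

```python
class GlobalEpochs:
    """Toy model of unbounded global epochs (all names are illustrative)."""

    def __init__(self):
        self.e_epoch = 0  # incremented when a redirect from Execute takes effect
        self.d_epoch = 0  # incremented when a redirect from Decode takes effect
        self.pc = 0

    def fetch_tag(self):
        # Fetch stamps every instruction with the epochs current at fetch time.
        return (self.e_epoch, self.d_epoch)

    def decode_is_valid(self, tag):
        # Assumption: Decode trusts an instruction only if both epochs match.
        ie, idec = tag
        return ie == self.e_epoch and idec == self.d_epoch

    def execute_is_valid(self, tag):
        # Assumption: Execute only cares about its own epoch.
        ie, _ = tag
        return ie == self.e_epoch

    def decode_redirect(self, tag, target):
        # A stale instruction must not redirect; Execute redirects win.
        if not self.decode_is_valid(tag):
            return False
        self.d_epoch += 1
        self.pc = target
        return True

    def execute_redirect(self, tag, target):
        if not self.execute_is_valid(tag):
            return False
        self.e_epoch += 1
        self.pc = target
        return True


# Replay of the stale-redirect scenario from the 1-bit epoch slide.
epochs = GlobalEpochs()
inst1 = epochs.fetch_tag()               # tagged (0, 0)
epochs.execute_redirect(inst1, 0x100)    # eEpoch becomes 1
inst2 = epochs.fetch_tag()               # tagged (1, 0)
epochs.execute_redirect(inst2, 0x200)    # eEpoch becomes 2
late = epochs.decode_redirect(inst1, 0x300)  # delayed Inst 1 redirect is dropped
```

With a 1-bit epoch, the second Execute redirect would wrap the epoch back to 0 and the late Inst 1 redirect would look valid; with unbounded counters the tag (0, 0) no longer matches eEpoch = 2.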
Review: Branch History Table (BHT)
[Figure: k bits of the Fetch PC (above the two low-order 0 bits) form the BHT index into a 2^k-entry BHT with 2 bits/entry, which outputs Taken/¬Taken?; the instruction from Fetch supplies the opcode (Branch?) and the offset, which is added to the PC to form the Target PC.]
• At the Decode stage, if the instruction is a branch, then the BHT is consulted using the pc; if the BHT shows a different prediction than the incoming ppc, Fetch is redirected
• 4K-entry BHT, 2 bits/entry, ~80-90% correct direction predictions
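A minimal Python sketch of a BHT and the Decode-stage check follows. The slide only says "2 bits/entry"; treating each entry as a 2-bit saturating counter, initializing it to weakly not-taken, and the names `BHT` and `decode_check` are assumptions made for illustration.

```python
class BHT:
    """2^k-entry branch history table with 2-bit saturating counters."""

    def __init__(self, k):
        self.k = k
        # Assumption: every counter starts at 1 (weakly not-taken).
        self.counters = [1] * (1 << k)

    def index(self, pc):
        # Skip the two low-order zero bits of the PC, then keep k bits.
        return (pc >> 2) & ((1 << self.k) - 1)

    def predict(self, pc):
        # Counter values 2 and 3 mean "taken"; 0 and 1 mean "not taken".
        return self.counters[self.index(pc)] >= 2

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def decode_check(bht, pc, ppc, branch_target):
    """Return the PC Decode should redirect Fetch to, or None if ppc agrees."""
    predicted = branch_target if bht.predict(pc) else pc + 4
    return predicted if predicted != ppc else None
```

For example, a fresh table predicts not-taken, so `decode_check(bht, 0x208, 0x20c, 0x300)` returns `None`; after the branch at 0x208 has been updated as taken once, the same call returns the target.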
Review: Two-Level Branch Predictor
• Pentium Pro uses the result from the last two branches to select one of the four sets of BHT bits (~95% correct)
[Figure: k bits of the Fetch PC (above the low-order 00) index four 2^k-entry BHTs with 2-bit entries; a 2-bit global branch history shift register, which shifts in the Taken/¬Taken result of each branch, selects among them to produce Taken/¬Taken?]
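The selection scheme can be modeled in a few lines of Python. As before this is a hedged sketch: the saturating-counter update, the initial counter value, and the names `TwoLevelPredictor` and `count_mispredictions` are assumptions, not details taken from the Pentium Pro.

```python
class TwoLevelPredictor:
    """Four 2^k-entry BHTs selected by a 2-bit global history register."""

    def __init__(self, k):
        self.k = k
        self.history = 0  # outcomes of the last two branches, newest in bit 0
        # Assumption: 2-bit saturating counters starting at weakly not-taken.
        self.tables = [[1] * (1 << k) for _ in range(4)]

    def _index(self, pc):
        return (pc >> 2) & ((1 << self.k) - 1)

    def predict(self, pc):
        return self.tables[self.history][self._index(pc)] >= 2

    def update(self, pc, taken):
        table = self.tables[self.history]
        i = self._index(pc)
        table[i] = min(3, table[i] + 1) if taken else max(0, table[i] - 1)
        # Shift the new outcome into the 2-bit global history.
        self.history = ((self.history << 1) | int(taken)) & 0b11


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare, and train on each outcome; return the miss count."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


predictor = TwoLevelPredictor(4)
count_mispredictions(predictor, 0x40, [True, False] * 10)         # warm-up
steady = count_mispredictions(predictor, 0x40, [True, False] * 5)
```

On a strictly alternating branch the history register separates the "taken next" and "not taken next" cases into different tables, so once warmed up the model stops mispredicting that pattern, which a single 2-bit counter cannot do.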
Review: Tournament Predictor
“The Alpha 21264 Microprocessor Architecture”
• 10-bit PC: index 1024 x 10-bit local history table
• 10-bit local history
  - Index 1024 x 3-bit BHT: prediction 1
• 12-bit global history
  - Index 4096 x 2-bit BHT: prediction 2
  - Index 4096 x 2-bit BHT: select between predictions 1, 2
Debugging Your Processor
Unsupported Instruction
• Processor initialized? (csrf.started)
• What could be redirecting your PC?
  - Faulty branch address calculation?
  - BTB? (Lab 6 hint!)
  - Bad instruction? (unlikely for this course)
Processor Hangs
• Rules conflict?
  - Check schedule (option “-show-schedule”)
  - Use $display() statements to diagnose
• Did you size pipeline FIFOs correctly?
• Are FIFOs being drained?
Incorrect Behavior
• Which test fails?
• Where is it in the dump?
• Do you have a log?
  - If your rules don’t fire, temporarily make simpler rules
Demo: our code
src/TwoStage.bsv

    rule doFetch (csrf.started);
        // fetch
        Data inst = iMem.req(pcReg[0]);
        // Addr predPc = btb.predPc(pcReg[0]);
        Addr predPc = pcReg[0]; // always predict PC to be next PC
        ...
    endrule

    [qmn@vlsifarm] $ ./run_asm.sh twostage
    ...
    -- assembly test: lw --
    ERROR: Executing unsupported instruction at pc: 00001000. Exiting
    ^C
Demo: the log
scemi/sim/logs/lw.log

    ...
    Cycle 5 --------------------------
    Fetch: PC = 00000208, inst = 0000a183, expanded = lw r3 = [r1 0x0]
    Execute finds misprediction: PC = 00000208
    Fetch: Mispredict, redirected by Execute
    Cycle 6 --------------------------
    Fetch: PC = 00001000, inst = 00ff, expanded = unsupport 0x00ff
    Execute: Kill instruction
Demo: the dump
programs/assembly/build/assembly/dump/lw.dump

    Disassembly of section .text:
    00000200 <_start>:
     200: 000010b7    lui x1, 0x1
     204: 00008093    mv  x1, x1
     208: 0000a183    lw  x3, 0(x1)   # 1000
    ...
    Disassembly of section .data:
    00001000 <begin_signature>:
     1000: 00ff    0xff
     1002: 00ff    0xff
Demo: back to the code
src/TwoStage.bsv

    rule doExecute (csrf.started);
        if (eInst.mispredict) begin
            $display("Execute finds misprediction: PC = %x", f2e.pc);
            exeRedirect[0] <= Valid(ExeRedirect{ pc: f2e.pc, nextPc: eInst.addr });
        end
    endrule

• What sets eInst.addr?
• What happens to exeRedirect[0]?
Caches
Multistage Pipeline
[Figure: PC with fEpoch and nap, Inst Memory, Decode, d2e, Execute, e2c, Data Memory, Register File, scoreboard, eEpoch, and a redirect path back to the PC.]
• The use of magic memories (combinational reads) makes these designs unrealistic
Magic Memory Model
[Figure: MAGIC RAM block with inputs Clock, WriteEnable, Address, WriteData and output ReadData.]
• Reads and writes are always completed in one cycle
  - a Read can be done any time (i.e. combinational)
  - If enabled, a Write is performed at the rising clock edge (the write address and data must be stable at the clock edge)
• In a real DRAM the data will be available several cycles after the address is supplied
Memory Hierarchy
[Figure: CPU (RegFile) ↔ Small, Fast Memory (SRAM), which holds frequently used data ↔ Big, Slow Memory (DRAM).]
• size: RegFile << SRAM << DRAM
• latency: RegFile << SRAM << DRAM
• bandwidth: on-chip >> off-chip
• why?
• On a data access:
  - hit (data ∈ fast memory): low latency access
  - miss (data ∉ fast memory): long latency access (DRAM)
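The hit/miss distinction can be made concrete with a toy Python model of a small fast memory in front of a large slow one. Everything specific here is an assumption chosen for illustration: the direct-mapped organization, one word per line, the four-line size, and the 1-cycle and 100-cycle latencies. The class name `TinyCache` is hypothetical.

```python
class TinyCache:
    """Direct-mapped, one-word-per-line cache in front of a slow memory."""

    def __init__(self, backing, num_lines=4, hit_latency=1, miss_latency=100):
        self.backing = backing            # dict of address -> data, the "DRAM"
        self.num_lines = num_lines
        self.lines = [None] * num_lines   # each entry is (addr, data) or None
        self.hit_latency = hit_latency    # assumed cycles for a hit
        self.miss_latency = miss_latency  # assumed cycles for a miss

    def read(self, addr):
        """Return (data, latency in cycles) for a word-aligned read."""
        idx = (addr >> 2) % self.num_lines
        line = self.lines[idx]
        if line is not None and line[0] == addr:
            # Hit: the data is in the fast memory.
            return line[1], self.hit_latency
        # Miss: fetch from the slow memory and keep a copy for next time.
        data = self.backing[addr]
        self.lines[idx] = (addr, data)
        return data, self.miss_latency


dram = {0x1000: 0xFF, 0x1010: 0xAB}
cache = TinyCache(dram)
first = cache.read(0x1000)   # miss: long latency
second = cache.read(0x1000)  # hit: low latency
```

In this model 0x1000 and 0x1010 map to the same line, so reading one evicts the other and the next read of the evicted address misses again.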
Two biggest problems in CS
• Cache invalidation
  - How to inform caches of stale data
• Naming things
• Off-by-one errors