10/13: Lecture Topics
• Data Hazards
• Control Hazards

Grading Disputes
• Bring grading disputes to me up to one week after an assignment is handed back
• We try to be very fair when grading
• Please don’t beg for one point here or one point there
  – each hw counts 5% of your grade
  – hw’s are out of ~70 points
  – each hw point is only 0.07% of your final grade (or 0.0028 grade points)
  – I will add 0.028 to everyone’s final grade if you don’t dispute 1 or 2 point grading issues
• Exams are worth more and I will be more tolerant of begging

Pipelined Xput and Latency
    cycle    1    2    3    4    5    6    7    8    9
    inst 1   IF   ID   EX   MEM  WB
    inst 2        IF   ID   EX   MEM  WB
    inst 3             IF   ID   EX   MEM  WB
    inst 4                  IF   ID   EX   MEM  WB
    inst 5                       IF   ID   EX   MEM  WB
• What’s the throughput of this implementation?
• What’s the latency of this implementation?
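The two questions above can be worked through with a small Python sketch. It assumes an ideal five-stage pipeline in which every stage takes one cycle and nothing stalls; the function names are illustrative and do not come from the lecture.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed to finish all instructions in an ideal pipeline (no stalls)."""
    if num_instructions == 0:
        return 0
    # The first instruction needs num_stages cycles; each later instruction
    # finishes one cycle after the one before it.
    return num_stages + (num_instructions - 1)


def latency_cycles(num_stages=len(STAGES)):
    """Cycles a single instruction spends in the pipeline."""
    return num_stages


def throughput(num_instructions, num_stages=len(STAGES)):
    """Average instructions completed per cycle."""
    if num_instructions == 0:
        return 0.0
    return num_instructions / total_cycles(num_instructions, num_stages)


print(total_cycles(5))             # 9 cycles, as in the diagram
print(latency_cycles())            # 5 cycles per instruction
print(round(throughput(5), 3))     # 0.556 instructions per cycle
```

Under these assumptions the throughput approaches one instruction per cycle as the instruction count grows, while the latency of each instruction stays at five cycles.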

Data Hazards
• What happens in the following code?
    add $s0, $s1, $s2     IF   ID   EX   MEM  WB
    add $s4, $s3, $s0          IF   ID   EX   MEM  WB
  – $s0 is written in the first add’s WB stage
  – $s0 is read in the second add’s ID stage
• This is called a data dependency
• When it causes a pipeline stall it is called a data hazard
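As a rough illustration of what a data dependency is, the Python sketch below scans a short program for read-after-write pairs. The `Instr` tuple and the `raw_dependencies` helper are hypothetical names chosen for this example, not anything defined in the lecture.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str               # register written by the instruction
    srcs: Tuple[str, ...]   # registers read by the instruction


def raw_dependencies(program):
    """Return (producer_index, consumer_index, register) for each read-after-write pair."""
    deps = []
    last_writer = {}  # register -> index of the most recent instruction that writes it
    for i, ins in enumerate(program):
        for reg in ins.srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[ins.dest] = i
    return deps


program = [
    Instr("add", "$s0", ("$s1", "$s2")),
    Instr("add", "$s4", ("$s3", "$s0")),
]
print(raw_dependencies(program))  # [(0, 1, '$s0')]
```

Whether such a dependency becomes a hazard depends on how close together the two instructions are in the pipeline, which this sketch does not model.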

Solution: Forwarding
• The value of $s0 is known after cycle 3 (after the first instruction’s EX stage)
• The value of $s0 isn’t needed until cycle 4 (before the second instruction’s EX stage)
• If we forward the result there isn’t a stall
    add $s0, $s1, $s2     IF   ID   EX   MEM  WB
    add $s4, $s3, $s0          IF   ID   EX   MEM  WB
  – the result is forwarded from the end of the first add’s EX stage to the start of the second add’s EX stage

Another data hazard
• What if the first instruction is lw?
    lw  $s0, 0($s2)       IF   ID   EX   MEM  WB
    add $s4, $s3, $s0          IF   ID   EX   MEM  WB
• $s0 isn’t known until after the MEM stage
• We can’t forward back into the past
• Either stall or reorder instructions
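The cycle counting behind the forwarding and lw cases can be sketched in Python. The model assumes the five-stage pipeline from these slides with full forwarding into the EX stage, and it numbers cycles so that the producing instruction is fetched in cycle 1; `stalls_with_forwarding` is an illustrative name.

```python
# Cycle after which the producer's result exists, with the producer fetched in cycle 1:
# an ALU result is ready after EX (cycle 3), a loaded value after MEM (cycle 4).
READY_AFTER_CYCLE = {"alu": 3, "load": 4}


def stalls_with_forwarding(producer_kind, distance):
    """Stall cycles for a consumer `distance` instructions after the producer.

    distance == 1 means the consumer immediately follows the producer.
    """
    if distance < 1:
        raise ValueError("the consumer must come after the producer")
    ready = READY_AFTER_CYCLE[producer_kind]
    consumer_ex = 3 + distance  # cycle of the consumer's EX stage if nothing stalls
    return max(0, ready + 1 - consumer_ex)


print(stalls_with_forwarding("alu", 1))   # 0: add followed by a dependent add
print(stalls_with_forwarding("load", 1))  # 1: lw followed by a dependent add
print(stalls_with_forwarding("load", 2))  # 0: one unrelated instruction in between
```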

Solutions to the lw hazard
• We can stall for one cycle, but we hate to stall
    lw  $s0, 0($s2)       IF   ID   EX   MEM  WB
    add $s4, $s3, $s0          IF   ID   stall EX  MEM  WB
• Try to execute an unrelated instruction between the two instructions
    lw  $s0, 0($s2)       IF   ID   EX   MEM  WB
    sub $t4, $t2, $t3          IF   ID   EX   MEM  WB
    add $s4, $s3, $s0               IF   ID   EX   MEM  WB
  – the sub is moved up from after the add
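A minimal Python sketch of the reordering idea follows. It assumes the program is a single straight-line block with no labels, tracks register dependencies only, and never moves memory instructions or branches; `Instr`, `conflicts`, and `fill_load_use_gap` are hypothetical names for this example rather than a real compiler pass.

```python
from typing import NamedTuple, Optional, Tuple

MEMORY_OPS = {"lw", "sw"}
BRANCH_OPS = {"beq", "bne", "j"}


class Instr(NamedTuple):
    op: str
    dest: Optional[str]     # register written, or None (for example sw or a branch)
    srcs: Tuple[str, ...]   # registers read


def conflicts(earlier, later):
    """True if `later` cannot be moved above `earlier` (RAW, WAR, or WAW on a register)."""
    return (
        (earlier.dest is not None and earlier.dest in later.srcs)      # RAW
        or (later.dest is not None and later.dest in earlier.srcs)     # WAR
        or (later.dest is not None and later.dest == earlier.dest)     # WAW
    )


def fill_load_use_gap(program):
    """Move one independent instruction between each lw and an immediately following use."""
    prog = list(program)
    i = 0
    while i + 1 < len(prog):
        load, user = prog[i], prog[i + 1]
        if load.op == "lw" and load.dest in user.srcs and user.op not in BRANCH_OPS:
            for k in range(i + 2, len(prog)):
                cand = prog[k]
                if cand.op in BRANCH_OPS:
                    break  # never move anything across a branch
                if cand.op in MEMORY_OPS or load.dest in cand.srcs:
                    continue  # keep memory order; a reader of the load would stall too
                if any(conflicts(prog[h], cand) for h in range(i + 1, k)):
                    continue  # moving it up would break a register dependency
                prog.insert(i + 1, prog.pop(k))
                break
        i += 1
    return prog


program = [
    Instr("lw", "$s0", ("$s2",)),
    Instr("add", "$s4", ("$s3", "$s0")),
    Instr("sub", "$t4", ("$t2", "$t3")),
]
print([ins.op for ins in fill_load_use_gap(program)])  # ['lw', 'sub', 'add']
```

If no safe candidate exists, the program is returned unchanged and the one-cycle stall remains.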

Reordering Instructions
• Reordering instructions is a common technique for avoiding pipeline stalls
• Sometimes the compiler does the reordering statically
• Almost all modern processors do this reordering dynamically
  – they can see several instructions and they execute any one that has no dependency
  – this is known as out-of-order execution and is very complicated to implement

Structural Hazards
• Instructions in different stages want to use the same resource
  – Suppose a lw instruction is in stage four (memory access)
  – Meanwhile, an add instruction is in stage one (instruction fetch)
  – Both of these actions require access to memory; they could collide
• Add more hardware to eliminate the problem
• Or stall (cheaper & easier), not usually done

Control Hazards
• Branch instructions cause control hazards (aka branch hazards) because we don’t know which instruction to execute next
          bne $s0, $s1, next      IF   ID   EX   MEM  WB
          add $s4, $s3, $s0            IF   ID   EX   MEM  WB
          ...
    next: sub $s4, $s3, $s0
  – at the second instruction’s IF: do we fetch add or sub?
  – we don’t know until the bne is further along in the pipeline

Solution: Stall
• We can stall to see which instruction to execute next
    bne $s0, $s1, next      IF ID EX MEM WB
    sub $s4, $s3, $s0       stall, then IF ID EX MEM WB
• But we hate to stall

Solution: Move Branch to ID
• Move the branch hardware to ID stage
  – Hardware to compare two registers is simpler than hardware to add them (i.e. EX stage hardware)
    bne $s0, $s1, next      IF   ID   EX   MEM  WB
    sub $s4, $s3, $s0            stall IF   ID   EX   MEM  WB
• We still have to stall for one cycle
• But we can’t move the branch up any more

Branch Delay Slot
• A branch now causes a stall of one cycle
• Try to execute an instruction instead of stalling
• The compiler must find an instruction to fill the branch delay slot
  – 50% of the delay-slot instructions are useful
  – 50% are nops (no-ops), which don’t do anything
• Might have been a good idea originally but not any more

Branch Delay Slot Example
• “addi $t0, 1” will always execute
          move $t0, $zero
          bne  $s0, $zero, Done
          addi $t0, 1
          addi $t0, 3
    Done: move $t1, $t0
• Branch not taken executes:
          move $t0, $zero
          bne  $s0, $zero, Done
          addi $t0, 1
          addi $t0, 3
          move $t1, $t0
• Branch taken executes:
          move $t0, $zero
          bne  $s0, $zero, Done
          addi $t0, 1
          move $t1, $t0
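The delay-slot behaviour in this example can be checked with a tiny Python interpreter. The five-instruction program is a reconstruction of the slide (in particular, the opcode of the instruction that adds 3 is assumed to be addi), the two-operand addi follows the slide's shorthand, and `run_with_delay_slot` is an illustrative name.

```python
def run_with_delay_slot(s0):
    """Run the slide's example; the instruction right after bne always executes."""
    regs = {"$zero": 0, "$s0": s0, "$t0": 0, "$t1": 0}
    program = [
        ("move", "$t0", "$zero"),
        ("bne", "$s0", "$zero", "Done"),
        ("addi", "$t0", 1),        # branch delay slot
        ("addi", "$t0", 3),        # assumed opcode, see the note above
        ("move", "$t1", "$t0"),    # Done:
    ]
    labels = {"Done": 4}
    trace = []             # indices of the instructions that executed
    pending_target = None  # set by a taken branch, used after the delay slot
    pc = 0
    while pc < len(program):
        ins = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction is the delay slot of a taken branch.
            next_pc = pending_target
            pending_target = None
        op = ins[0]
        if op == "move":
            regs[ins[1]] = regs[ins[2]]
        elif op == "addi":
            regs[ins[1]] += ins[2]
        elif op == "bne" and regs[ins[1]] != regs[ins[2]]:
            pending_target = labels[ins[3]]
        pc = next_pc
    return regs["$t1"], trace


print(run_with_delay_slot(0))  # (4, [0, 1, 2, 3, 4])  branch not taken
print(run_with_delay_slot(1))  # (1, [0, 1, 2, 4])     branch taken
```

In both runs instruction 2, the delay-slot addi, appears in the trace, which is the point the slide makes.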

Solution: Speculate
• Execute the following instructions assuming the branch is taken (or not taken)
• If we guessed right, then let the instructions proceed
• If we guessed wrong, then squash the partially completed instructions
  – This is called flushing the pipeline
  – These instructions were wasted, but we would have stalled otherwise
• Never let a speculating instruction write to memory or a register until we’re sure it should execute
• This is known as speculative execution

Speculate Never Taken
• Assume the branch isn’t taken and fetch the next instruction
          bne  $s0, $zero, Done
          addi $t0, 1
          addi $t0, 3
    Done: move $t1, $t0
• Branch not taken
    bne     IF   ID   EX   MEM  WB
    addi         IF   ID   EX   MEM  WB
    addi              IF   ID   EX   MEM  WB
• Branch taken
    bne     IF   ID   EX   MEM  WB
    addi         IF   SQUASH
    move              IF   ID   EX   MEM  WB
• Predicting taken is actually better, but still not good enough

Static Branch Prediction
• Most backward branches are taken (80%)
  – they are part of loops
• Forward branches are taken about half the time
  – if statements
• A common static branch prediction scheme is to predict
  – backward branches are taken
  – forward branches are not taken
• Some architectures allow the compiler to specify in the branch instruction to predict taken or not taken
• This does okay (70-80%), but still not good enough
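The backward-taken, forward-not-taken scheme is small enough to sketch in Python. The addresses and outcome history below are made up for illustration, and `predict_static` and `accuracy` are hypothetical helper names.

```python
def predict_static(branch_pc, target_pc):
    """Predict taken for backward branches and not taken for forward branches."""
    return target_pc <= branch_pc


def accuracy(history):
    """Fraction of correct predictions over (branch_pc, target_pc, taken) records."""
    if not history:
        return 0.0
    correct = sum(predict_static(pc, target) == taken for pc, target, taken in history)
    return correct / len(history)


# A loop branch at 0x40 jumping back to 0x20: taken 9 times, then falls through.
loop_history = [(0x40, 0x20, True)] * 9 + [(0x40, 0x20, False)]
print(predict_static(0x40, 0x20))  # True: backward branch, predict taken
print(predict_static(0x30, 0x50))  # False: forward branch, predict not taken
print(accuracy(loop_history))      # 0.9
```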

Dynamic Branch Prediction
• In most programs you execute the same instructions over and over
• You encounter the same branch instructions over and over
• The same branch instruction is usually
  – taken if it was taken last time
  – not taken if it was not taken last time
• If we keep a history of each branch instruction, then we can predict much better

Dynamic Branch Prediction
• A table is kept on the CPU that records the recent history of each branch
• There is not room to store each instruction
  – the last few bits of the instruction’s address index this table
  – some instructions collide, as in a hash table
  – usually store 2 bits per entry
• Dynamic branch prediction is 92-98% accurate

    Instruction    Taken last time?    Predict
    ...            ...                 ...
    0x10001234     no                  not taken
    0x102F0268     yes                 taken
    0x13D0122C     no                  not taken
    ...            ...                 ...
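A small Python sketch of such a table follows. The slide only says that entries usually hold 2 bits; the saturating-counter interpretation used here is one common scheme and is an assumption, as are the table size, the initial counter value, and the class name `TwoBitPredictor`.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters indexed by low address bits."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
        self.counters = [1] * (1 << index_bits)

    def _index(self, pc):
        # Drop the two byte-offset bits of a word-aligned address, keep the low bits.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def count_correct(predictor, pc, outcomes):
    """Predict, then train, on each outcome; return the number of correct predictions."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct


# A loop branch taken 9 times and then not taken, run three times over.
outcomes = ([True] * 9 + [False]) * 3
print(count_correct(TwoBitPredictor(), 0x10001234, outcomes), "of", len(outcomes))  # 26 of 30
```

Two branches whose addresses share the same low bits, such as 0x10001234 and 0x10001274 with a 16-entry table, land on the same counter, which is the collision the slide compares to a hash table.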

Importance of Branch Prediction
• Branches occur every five instructions
• Today’s processors execute up to 4 instructions per cycle
• A branch occurs every 2 cycles
• Pipelines are longer than MIPS (8, 9, 11, 13 cycles)
  – branch mispredict penalty is 3-5 cycles instead of 1 cycle
• Must predict accurately or you execute < 0.5 instructions per cycle instead of 4 instructions
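The effect of mispredictions on throughput can be estimated with a back-of-the-envelope model in Python. It assumes the only lost cycles are misprediction penalties and that the processor otherwise sustains its full issue width, so it is a rough sketch that is not meant to reproduce the slide's exact figure; `effective_ipc` and its parameters are illustrative names.

```python
def effective_ipc(width, branch_interval, mispredict_rate, penalty_cycles):
    """Estimated instructions per cycle when mispredicted branches cost extra cycles.

    width:            peak instructions per cycle
    branch_interval:  one branch every this many instructions
    mispredict_rate:  fraction of branches predicted wrongly (0.0 to 1.0)
    penalty_cycles:   cycles lost per mispredicted branch
    """
    base_cpi = 1.0 / width
    penalty_cpi = (mispredict_rate / branch_interval) * penalty_cycles
    return 1.0 / (base_cpi + penalty_cpi)


# Slide figures: 4-wide issue, a branch every five instructions, penalty of up to 5 cycles.
for rate in (0.0, 0.05, 0.25, 1.0):
    print(rate, round(effective_ipc(4, 5, rate, 5), 2))
```

Even in this optimistic model, throughput falls off quickly as the misprediction rate rises, which is why high prediction accuracy matters on wide, deep pipelines.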

Exceptions and Interrupts
• So far, we’ve assumed that the assembled code can always be executed
• Lots of ways for unexpected things to happen:
  – Undefined instruction
  – Arithmetic overflow
  – System call
  – I/O device request

Exceptions
• An exception is an internal event
  – The unexpected condition was caused by something the program did
  – Undefined instructions and arithmetic overflows are examples
  – If you ran the program again, the exception would (probably) happen again at the same point in the program’s execution

Interrupts
• An interrupt is an external event
  – The unexpected condition was not caused by the program
  – An I/O device request is an example
  – If you ran the program again, the interrupt would probably not happen at the same point

What should happen?
• These events result in an unnatural change in the flow of control
• Normally, the next instruction executed is ____
• When one of these events takes place, something else happens
  – The system must respond to the event
  – The response depends on the type of event

Exception Handling
• Loosely, the following steps are taken:
  1. Save the address of the offending instruction in a register
  2. Make the reason for the exception known
     - Set the value of the status register, or
     - Use vectored interrupts to do step 3
  3. Transfer control to the operating system
  4. Operating system decides what to do:
     - May report the error to the user
     - May terminate the program
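The four steps above can be sketched as a toy Python machine. The attribute names `epc` and `cause` are borrowed from the MIPS EPC and Cause registers as an assumption about which registers the slide means; the `Machine` class, the base address, and the tuple-based program format are invented for this illustration.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


class Machine:
    def __init__(self):
        self.epc = None    # address of the offending instruction
        self.cause = None  # reason for the exception
        self.log = []

    def os_handler(self):
        # Step 4: the operating system decides what to do; here it reports and terminates.
        self.log.append(f"{self.cause} at {self.epc:#x}; terminating program")
        return "terminated"

    def raise_exception(self, pc, cause):
        self.epc = pc             # step 1: save the offending instruction's address
        self.cause = cause        # step 2: make the reason known
        return self.os_handler()  # step 3: transfer control to the operating system

    def run(self, program, base=0x400000):
        """Run a list of (op, a, b) tuples; return (results, final_status)."""
        results = []
        for i, (op, a, b) in enumerate(program):
            pc = base + 4 * i
            if op != "add":
                return results, self.raise_exception(pc, "undefined instruction")
            total = a + b
            if not INT_MIN <= total <= INT_MAX:
                return results, self.raise_exception(pc, "arithmetic overflow")
            results.append(total)
        return results, "finished"


m = Machine()
print(m.run([("add", 1, 2), ("add", INT_MAX, 1), ("add", 5, 5)]))  # ([3], 'terminated')
print(m.log[0])  # arithmetic overflow at 0x400004; terminating program
```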

Exception/Pipelining Interface
• Suppose an add instruction overflows, causing an exception
• Instructions after the add are already in the pipeline
  – The partially computed instructions must be flushed
• Exception must be caught before register contents have changed
  – Pipeline designers must be wary of exception handling