Delayed Load
Datorteknik, Delayed Load, slide 1
All problems solved? No. What will happen if the instruction after a load uses the loaded register?
    lw  $6, 0($1)
    add $4, $6, $1
    add $7, $6, $2
Critical path from “DM” to “EX”?
    0x30  lw  $6, 0($1)
    0x34  add $4, $6, $1
    0x38  add $7, $6, $2
[Figure: pipeline diagram of the three instructions passing through IM, Reg, DM, Reg]
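The dependency in the sequence above can be found mechanically. The following is a minimal sketch, not part of the original slides: the `Instr` tuple and the `load_use_hazards` helper are hypothetical names, and the check only looks at the instruction directly after each `lw`.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                  # mnemonic, e.g. "lw" or "add"
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def load_use_hazards(program):
    """Return the indices of instructions that read a register
    written by the lw immediately before them."""
    hazards = []
    for i in range(1, len(program)):
        prev, cur = program[i - 1], program[i]
        if prev.op == "lw" and prev.dest in cur.srcs:
            hazards.append(i)
    return hazards


# The sequence from the slide: lw $6, 0($1); add $4, $6, $1; add $7, $6, $2
program = [
    Instr("lw", "$6", ("$1",)),
    Instr("add", "$4", ("$6", "$1")),
    Instr("add", "$7", ("$6", "$2")),
]

print(load_use_hazards(program))  # [1]
```

Only the first `add` is flagged: it needs `$6` in EX during the same cycle in which the `lw` is still reading data memory.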
The Model We Use
[Figure: datapath with zero extend, branch logic, ALU (inputs A and B), sign/zero extend, and adders]
Critical path ALU?
[Figure: datapath diagram]
Critical path DATA MEMORY?
[Figure: datapath diagram]
Critical path ALU + DATA MEMORY?
[Figure: datapath diagram]
Fix or Not?
Forwarding the loaded value from DM straight into the ALU would put the two units in sequence. The critical path would be 2T (ALU + DM), so the clock speed would be only half.
WE CHOOSE NOT TO FIX
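The arithmetic behind this choice can be sketched as follows. This is an illustration only: it assumes every unit (IM, Reg, ALU, DM) has the same delay T, and `clock_period` is a hypothetical helper name.

```python
def clock_period(paths):
    """The longest register-to-register path sets the minimum clock period."""
    return max(sum(units) for units in paths.values())


T = 1.0  # assumed delay of each unit, in arbitrary time units

# Not fixed: each stage contains one unit.
not_fixed = {"IM": [T], "Reg": [T], "ALU": [T], "DM": [T]}

# Fixed: the loaded value passes through DM and then the ALU in one cycle.
fixed = {"IM": [T], "Reg": [T], "DM then ALU": [T, T]}

print(clock_period(not_fixed))                         # 1.0
print(clock_period(fixed))                             # 2.0
print(clock_period(not_fixed) / clock_period(fixed))   # 0.5
```

Under that assumption the fix doubles the clock period for every instruction, whereas leaving it out costs at most one slot per load.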
Delayed Load
One “delayed load” slot:
    lw  $6, 0($1)
    other useful operation, or nop
    add $4, $6, $1
    add $7, $6, $4
Still better than NO forwarding:
    lw  $6, 0($1)
    other useful operation, or nop
    add $4, $6, $1
    other useful operation, or nop
    add $7, $6, $4
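The two listings above can be reproduced by a small slot-filling pass. This is a sketch under stated assumptions, not the course's tool: `fill_slots` and the tuple layout are hypothetical, every slot is filled with a `nop` rather than a useful instruction, and "no forwarding" is modelled the way the slide shows it, with one slot between any producer and the instruction that directly consumes its result.

```python
NOP = ("nop", None, ())


def fill_slots(program, forwarding=True):
    """Insert a nop wherever an instruction reads the result of the
    instruction directly before it and that result is not yet available.

    With forwarding, only a load needs a slot after it.
    Without forwarding, every producer needs one.
    """
    out = []
    for instr in program:
        if out:
            _prev_op, prev_dest, _ = out[-1]
            _, _, srcs = instr
            depends = prev_dest is not None and prev_dest in srcs
            if depends and (_prev_op == "lw" or not forwarding):
                out.append(NOP)
        out.append(instr)
    return out


# (op, destination register, source registers), as on the slide
program = [
    ("lw", "$6", ("$1",)),
    ("add", "$4", ("$6", "$1")),
    ("add", "$7", ("$6", "$4")),
]

print([i[0] for i in fill_slots(program, forwarding=True)])
# ['lw', 'nop', 'add', 'add']
print([i[0] for i in fill_slots(program, forwarding=False)])
# ['lw', 'nop', 'add', 'nop', 'add']
```

A real scheduler would first try to move an independent instruction into the slot and fall back to `nop` only when none is available.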
Pipeline Efficiency
The critical path is cut to 1/4. Can we do the same with only three stages?
4 Stage Pipe: IM, Reg, DM, Reg
3 Stage Pipe: IM, Reg, DM
4 Stage Pipe
[Figure: datapath diagram]
3 Stage Pipe
[Figure: datapath diagram]
What about the instruction set?
    lw $t2, 4($t4)?  NO, the ALU is not in the path, so the offset cannot be added
    lw $t2, ($t4)?   OK, no need for the ALU
Avoid Delayed Load?
Yes: by moving DM to EX, we can forward the result.
Different Pipe Length/Depth
Is it possible to implement both versions in one structure (MIPS pipe)?
NO! There might be collisions: an instruction in EX and an instruction in DM could both access memory at the same time.
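A collision of this kind can be shown with a short cycle count. This is a sketch only: the stage numbering (IM = 0, Reg = 1, EX = 2, DM = 3), the one-instruction-per-cycle issue rate, and the names `memory_access_cycle` and `collisions` are assumptions made for the illustration.

```python
# Assumed stage index at which each load variant touches data memory.
ACCESS_STAGE = {"ex": 2, "dm": 3}


def memory_access_cycle(issue_cycle, mode):
    """Cycle in which an instruction issued at issue_cycle uses the memory."""
    return issue_cycle + ACCESS_STAGE[mode]


def collisions(modes):
    """modes[i] is "ex", "dm", or None (no memory access) for the
    instruction issued in cycle i. Returns (first, second, cycle) triples
    for pairs that need the memory in the same cycle."""
    seen = {}
    clashes = []
    for issue, mode in enumerate(modes):
        if mode is None:
            continue
        cycle = memory_access_cycle(issue, mode)
        if cycle in seen:
            clashes.append((seen[cycle], issue, cycle))
        else:
            seen[cycle] = issue
    return clashes


# A load that accesses memory in DM, directly followed by one that
# accesses it in EX: both need the memory in cycle 3.
print(collisions(["dm", "ex"]))  # [(0, 1, 3)]
print(collisions(["dm", "dm"]))  # []
print(collisions(["ex", "dm"]))  # []
```

With a single memory port, the first case would force a stall or a second port, which is the reason the slide gives for not mixing the two pipe depths.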
Pipeline Efficiency
Did we change the critical path? NO! ALU and DM are not in sequence.