Simulation of a Perceptron Based Branch Prediction
Course - EEL 5764
Team - Ashok and Siddhartha
Perceptron Based Branch Prediction

Agenda
- Why BP
- Why Perceptron
- Algorithm
- Methodology
- Test Configuration
- Benchmarks
- Results (Partial)
- References
- Improvements

Why BP
- Increase in mis-prediction latency due to:
  - Deeper pipelines
  - Multiple issue with speculation
  - Increased IPC

Why Perceptron
- Most BP ideas reduce destructive aliasing
  - (e.g. Agree, YAGS, bi-mode, etc.)
- Can afford more history length
  - Unlike the exponential increase in two-level predictors
- Easy to implement in hardware
- Easy to understand

Perceptron (cont.)
- The perceptron assigns a weight to each element of the branch history
- Prediction is based on the dot product of the weights and the branch history
  - Plus a bias weight (general tendency)
- Branch history can be global or local
  - Or something more complex

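The dot-product rule above can be sketched in a few lines of C. This is only an illustration, not the team's bpred.c code: the function name `perceptron_output`, the layout with the bias in `weights[0]`, and the encoding of history bits as +1 (taken) and -1 (not taken) are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not part of SimpleScalar.
 * weights[0] is the bias weight; weights[1..h] pair with history[0..h-1].
 * Each history element is assumed to be +1 (taken) or -1 (not taken). */
int perceptron_output(const int *weights, const int *history, size_t h)
{
    int y = weights[0];  /* bias: the branch's general tendency */
    size_t i;

    for (i = 0; i < h; i++)
        y += weights[i + 1] * history[i];  /* dot product with the history */

    return y;
}
```

With weights {1, 2, -3, 4} and history {+1, -1, +1}, the output is 1 + 2 + 3 + 4 = 10, so the branch would be predicted taken.
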
Algorithm – Look Up
- Hash the branch address to produce a table index.
- Fetch the perceptron from the table.
- Compute the output 'y'
  - using the 'correct' global history.
- Update the speculative global history with the current prediction.
  - (y > 0 is Taken, else Not Taken)

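A minimal sketch of the look-up steps, under stated assumptions: the `perceptron_bp` struct, `bp_index`, and `bp_lookup` are hypothetical names rather than SimpleScalar's bpred structures, the hash (shift and modulo) is an arbitrary choice, and a single history register stands in for both the 'correct' and the speculative history. The table size of 64 and history length of 10 are taken from the results slide.

```c
#include <stdint.h>

#define NUM_PERCEPTRONS 64  /* table size quoted in the results */
#define HIST_LEN        10  /* global history length quoted in the results */

/* Hypothetical predictor state; not SimpleScalar's actual bpred structures. */
typedef struct {
    int8_t   weights[NUM_PERCEPTRONS][HIST_LEN + 1]; /* [0] is the bias weight */
    uint32_t ghr;  /* global history, bit 0 = most recent branch, 1 = taken */
} perceptron_bp;

/* Assumed hash: drop the low address bits, then wrap to the table size. */
static unsigned bp_index(uint32_t branch_addr)
{
    return (unsigned)((branch_addr >> 2) % NUM_PERCEPTRONS);
}

/* Returns 1 for taken, 0 for not taken; stores the raw output in *y_out. */
int bp_lookup(perceptron_bp *bp, uint32_t branch_addr, int *y_out)
{
    const int8_t *w = bp->weights[bp_index(branch_addr)];
    int y = w[0];
    int taken;
    int i;

    for (i = 0; i < HIST_LEN; i++) {
        /* A taken bit counts as +1, a not-taken bit as -1. */
        if ((bp->ghr >> i) & 1u)
            y += w[i + 1];
        else
            y -= w[i + 1];
    }

    taken = (y > 0);
    /* Speculatively shift the prediction into the history register. */
    bp->ghr = ((bp->ghr << 1) | (uint32_t)taken) & ((1u << HIST_LEN) - 1u);
    *y_out = y;
    return taken;
}
```

The caller keeps `y` so that the update step can later compare |y| against the training threshold.
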
Algorithm – Update
- Update the global history with the actual branch outcome.
- If the prediction was correct and |y| > Threshold, training is not needed.
  - Threshold = floor(1.93h + 14), where h is the history length
- Else, increment or decrement each weight according to whether its history bit agrees with the actual outcome.

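The training rule can be sketched as follows. The names `bp_threshold`, `saturate`, and `bp_train` are hypothetical, history bits are assumed to be encoded as +1 / -1, and the clamping of weights to a 6-bit signed range is an assumption based on the 6-bit weights mentioned in the results; the slides do not say how overflow is handled.

```c
#include <stdlib.h>  /* abs */

#define W_MAX  31    /* assumed 6-bit signed weight range: [-32, 31] */
#define W_MIN (-32)

/* floor(1.93h + 14), computed in integer arithmetic to avoid rounding error. */
int bp_threshold(int h)
{
    return (193 * h + 1400) / 100;
}

/* Assumed behaviour: weights saturate instead of wrapping around. */
static int saturate(int w)
{
    if (w > W_MAX) return W_MAX;
    if (w < W_MIN) return W_MIN;
    return w;
}

/* weights[0] is the bias; history[i] is +1 (taken) or -1 (not taken).
 * y is the output computed at look-up; taken is the resolved outcome. */
void bp_train(int *weights, const int *history, int h, int y, int taken)
{
    int t = taken ? 1 : -1;
    int predicted = (y > 0) ? 1 : -1;
    int i;

    /* Correct and confident: no training needed. */
    if (predicted == t && abs(y) > bp_threshold(h))
        return;

    weights[0] = saturate(weights[0] + t);
    for (i = 0; i < h; i++)
        /* +1 when the history bit agrees with the outcome, -1 otherwise. */
        weights[i + 1] = saturate(weights[i + 1] + t * history[i]);
}
```

For h = 10 the threshold evaluates to 33, so a correct prediction with |y| of 34 or more leaves the weights untouched.
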
Methodology
- SimpleScalar Version 3.0
- ANSI C programming language
- bpred.c/.h and sim-outorder.c
- Understand data flow
- Design and implement the perceptron look-up and update functionality
- Thorough testing and analysis

Test Configuration

Parameter                                          Value
Instruction Queue Fetch Size                       8
Mis-prediction Latency                             3
Instruction Decode Width                           4
Instruction Issue Width                            4
Out-of-Order Execution, Speculation (wrong path)   yes
Integer ALUs                                       4
Integer Mult                                       1
Floating Point ALUs                                4
L1 Data Cache Hit Latency                          1 cycle
[unlabeled]                                        10 cycles
L1 Data Cache Config                               dl1:128:32:4:l (l = LRU)
L2 Data Cache Config                               dl1:1024:64:8:l (l = LRU)
Memory Access Latency                              100 (first chunk), 2 (second chunk)
Memory Access Bus Width                            8 bytes

Cache config format: <name>:<nsets>:<bsize>:<assoc>:<repl>

Benchmarks
- VPR
- MCF
- Parser
- gcc
- More…

Results (Partial)

Results (cont…)
- Gshare memory requirement
  - GHR length – 17
  - Memory req – 262144 bits
- Only 64 perceptrons and a 10-bit GHR (6-bit weights)
  - Memory req – 3840 bits
- The perceptron predictor outperforms Gshare even in this minimal configuration.
- Prediction should improve with an increase in both GHR length and number of perceptrons.

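The two storage figures can be reproduced with a short calculation. The sketch below assumes that Gshare uses one 2-bit counter per table entry (which matches 2^17 × 2 = 262144) and that the 3840-bit figure counts the 10 history weights per perceptron without the bias weight (64 × 10 × 6 = 3840). Both are inferences from the numbers, and the function names are hypothetical.

```c
/* Gshare: 2^ghr_len entries, assumed to hold one 2-bit counter each. */
unsigned long gshare_bits(unsigned ghr_len)
{
    return (1UL << ghr_len) * 2UL;
}

/* Perceptron table: n perceptrons, h history weights of weight_bits each.
 * count_bias selects whether the extra bias weight is included. */
unsigned long perceptron_bits(unsigned n, unsigned h,
                              unsigned weight_bits, int count_bias)
{
    unsigned weights_per_entry = h + (count_bias ? 1u : 0u);
    return (unsigned long)n * weights_per_entry * weight_bits;
}
```

Here `gshare_bits(17)` gives 262144 and `perceptron_bits(64, 10, 6, 0)` gives 3840; counting the bias weight as well raises the perceptron figure to 4224 bits, still roughly 62 times smaller than the Gshare table.
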
Improvements?
- Combined perceptron predictor
  - What is the optimal combined predictor that gives good prediction results and low delay?
- Can we disregard some weights?
  - Use a selection mask.
- Other neural methods
  - To learn linearly inseparable branches

Reference
- Daniel A. Jiménez and Calvin Lin, "Neural Methods for Dynamic Branch Prediction"

Questions

Thank you