6.375 Tutorial 4: RISC-V and Final Projects
- Slides: 31
6.375 Tutorial 4: RISC-V and Final Projects
Ming Liu
March 4, 2016
http://csg.csail.mit.edu/6.375
Overview
- Branch Target Buffers
- RISC-V Infrastructure
- Final Project
Two-stage Pipeline with BTB
[Figure: two-stage pipeline. Stage 1: Fetch (PC, BTB, Inst Memory). Stage 2: Decode-RegisterFetch-Execute-Memory-WriteBack (Decode, Register File, Execute, Data Memory). The stages are connected by f2d. On a misprediction, Execute sends the correct pc back and kills the fetched instruction; it updates the BTB with {PC, correct PC}. The BTB predicts the next PC.]
BTB: Branch Target Buffer
- At fetch: use the BTB to predict the next PC
- At execute: update the BTB with the correct next PC
  - Only if the instruction is a branch (iType == J, Jr, Br)
Next Address Predictor: Branch Target Buffer (BTB)
[Figure: 2^k-entry direct-mapped BTB next to iMem. k bits of the pc select an entry holding pc_i, target_i and a valid bit; the stored pc_i is compared against the pc to detect a match.]
- Even small BTBs are effective
- The BTB remembers recent targets for a set of control instructions
  - Fetch: looks for the pc and the associated target in the BTB; if the pc is not found, then ppc is pc+4
  - Execute: checks the prediction; if it is wrong, kills the instruction and updates the BTB (only for branches and jumps)
Next Addr Predictor
    interface AddrPred;
        method Addr nap(Addr pc);
        method Action update(Redirect rd);
    endinterface
Two implementations:
- a) Simple PC+4 predictor
- b) Predictor using BTB
Simple PC+4 predictor
    module mkPcPlus4(AddrPred);
        // Always predict the fall-through address.
        method Addr nap(Addr pc);
            return pc + 4;
        endmethod
        // No state to train, so update does nothing.
        method Action update(Redirect rd);
        endmethod
    endmodule
BTB predictor
    module mkBtb(AddrPred);
        // Predicted next PCs, tags and valid bits, one per BTB entry.
        RegFile#(BtbIndex, Addr) ppcArr <- mkRegFileFull;
        RegFile#(BtbIndex, BtbTag) entryPcArr <- mkRegFileFull;
        Vector#(BtbEntries, Reg#(Bool)) validArr <- replicateM(mkReg(False));

        // Index: low bits of the word address. Tag: high bits of the pc.
        function BtbIndex getIndex(Addr pc) = truncate(pc >> 2);
        function BtbTag getTag(Addr pc) = truncateLSB(pc);

        method Addr nap(Addr pc);
            BtbIndex index = getIndex(pc);
            BtbTag tag = getTag(pc);
            if (validArr[index] && tag == entryPcArr.sub(index))
                return ppcArr.sub(index);
            else
                return (pc + 4);
        endmethod

        method Action update(Redirect redirect); ...
    endmodule
BTB predictor update method
The redirect input contains a pc, the correct next pc, and whether the branch was taken (to avoid making entries for not-taken branches).
    method Action update(Redirect redirect);
        // index and tag are computed up front so both branches can use them.
        let index = getIndex(redirect.pc);
        let tag = getTag(redirect.pc);
        if (redirect.taken) begin
            // Taken: record the target for this pc.
            validArr[index] <= True;
            entryPcArr.upd(index, tag);
            ppcArr.upd(index, redirect.nextPc);
        end
        else if (tag == entryPcArr.sub(index))
            // Not taken, but an entry for this pc exists: invalidate it.
            validArr[index] <= False;
    endmethod
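The lookup and update rules above can be tried out in software. The following is a minimal C++ sketch of the same direct-mapped scheme, not the course's BSV code: the 16-entry size (`kIndexBits`), the 32-bit address type, and the class layout are assumptions chosen for illustration. Only the method names `nap` and `update` and the `Redirect` fields come from the slides.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical size: 16 entries, so 4 index bits.
constexpr unsigned kIndexBits = 4;
constexpr unsigned kEntries = 1u << kIndexBits;

// Mirrors the fields the slide names for the redirect input.
struct Redirect {
    uint32_t pc;
    uint32_t nextPc;
    bool taken;
};

class Btb {
public:
    // Predict the next PC: the stored target on a hit, pc + 4 otherwise.
    uint32_t nap(uint32_t pc) const {
        const Entry& e = entries_[index(pc)];
        return (e.valid && e.tag == tag(pc)) ? e.target : pc + 4;
    }

    // Train the predictor with the outcome seen at execute.
    void update(const Redirect& r) {
        Entry& e = entries_[index(r.pc)];
        if (r.taken) {
            e = Entry{true, tag(r.pc), r.nextPc};  // record the taken target
        } else if (e.tag == tag(r.pc)) {
            e.valid = false;                       // drop a stale entry
        }
    }

private:
    struct Entry {
        bool valid;
        uint32_t tag;
        uint32_t target;
    };

    // Index: low bits of the word address (pc >> 2).
    static uint32_t index(uint32_t pc) { return (pc >> 2) & (kEntries - 1); }
    // Tag: the remaining high bits of the pc.
    static uint32_t tag(uint32_t pc) { return pc >> (2 + kIndexBits); }

    std::array<Entry, kEntries> entries_{};  // all entries start invalid
};
```

Two addresses that share an index but differ in their tag (for example 0x200 and 0x240 with this sizing) compete for the same entry, which is the usual cost of a direct-mapped table.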
Multiple Predictors: BTB + Branch Direction Predictors
[Figure: pipeline PC, Decode, Reg Read, Execute, Write Back. The Next Addr Pred forms a tight loop around the PC; the Br Dir Pred sits at Decode. Later stages send "correct mispred" redirects back. Mispredicted instructions must be filtered.]
- PC: need next PC immediately
- Decode: instruction type, PC-relative targets available
- Reg Read: simple conditions, register targets available
- Execute: complex conditions available
RISC-V Processor SCE-MI Infrastructure
RISC-V Interface
[Figure: mkProc (BSV) contains CSR, PC, Core, iMem and dMem. The Host (Testbench) connects to it through cpuToHost, hostToCpu, iMemInit and dMemInit.]
RISC-V Interface
    interface Proc;
        method Action hostToCpu(Addr startpc);
        method ActionValue#(CpuToHostData) cpuToHost;
        interface MemInit iMemInit;
        interface MemInit dMemInit;
    endinterface

    typedef struct {
        CpuToHostType c2hType;
        Bit#(16) data;
    } CpuToHostData deriving(Bits, Eq);

    typedef enum {
        ExitCode, PrintChar, PrintIntLow, PrintIntHigh
    } CpuToHostType deriving(Bits, Eq);
RISC-V Interface: cpuToHost
Write the mtohost CSR: csrw mtohost, rs1
- rs1[15:0]: data
  - A 32-bit integer needs two writes
- rs1[17:16]: c2hType
  - 0: Exit code
  - 1: Print character
  - 2: Print low 16 bits
  - 3: Print high 16 bits
    typedef struct {
        CpuToHostType c2hType;
        Bit#(16) data;
    } CpuToHostData deriving(Bits, Eq);
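To make the bit layout concrete, here is a small C++ sketch that packs and unpacks the value written to mtohost. The helper names (`packMtohost`, `unpackMtohost`, `packPrintInt`) are hypothetical and not part of the course infrastructure; the field positions and type codes are the ones listed above. Sending the low half before the high half is an assumption, since the slide only says that two writes are needed.

```cpp
#include <cassert>
#include <cstdint>

// Type codes as listed on the slide (rs1[17:16]).
enum class CpuToHostType : uint32_t {
    ExitCode = 0, PrintChar = 1, PrintIntLow = 2, PrintIntHigh = 3
};

struct CpuToHostData {
    CpuToHostType c2hType;
    uint16_t data;
};

// Build the register value: type in bits [17:16], data in bits [15:0].
inline uint32_t packMtohost(CpuToHostType type, uint16_t data) {
    return (static_cast<uint32_t>(type) << 16) | data;
}

// Split a register value back into its two fields.
inline CpuToHostData unpackMtohost(uint32_t rs1) {
    return CpuToHostData{static_cast<CpuToHostType>((rs1 >> 16) & 0x3u),
                         static_cast<uint16_t>(rs1 & 0xFFFFu)};
}

// A 32-bit integer does not fit in 16 data bits, so it takes two writes.
struct IntWrites {
    uint32_t low;   // assumed to be written first
    uint32_t high;  // assumed to be written second
};

inline IntWrites packPrintInt(uint32_t value) {
    return IntWrites{
        packMtohost(CpuToHostType::PrintIntLow,
                    static_cast<uint16_t>(value & 0xFFFFu)),
        packMtohost(CpuToHostType::PrintIntHigh,
                    static_cast<uint16_t>(value >> 16))};
}
```

For example, printing the character 'A' (0x41) corresponds to writing 0x10041: type 1 in bits [17:16] and 0x0041 in bits [15:0].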
RISC-V Interface: Others
hostToCpu
- Tells the processor to start running from the given address
iMemInit/dMemInit
- Used to initialize iMem and dMem
- Can also be used to check when initialization is done
- Defined in MemInit.bsv
SceMi Interface
[Figure: tb (C++) connected to mkProc (BSV), which contains CSR, PC, Core, iMem and dMem.]
Load Program
[Figure: tb (C++) loads add.riscv.vmh into iMem and dMem of mkProc (BSV: CSR, PC, Core, iMem, dMem).]
- Bypass this step in simulation
Load Program
[Figure: in simulation, iMem and dMem of mkProc (BSV) are loaded from mem.vmh.]
Simulation: load with mem.vmh (fixed file name)
- Copy <test>.riscv.vmh to mem.vmh
Start Processor
[Figure: tb (C++) sends the starting PC, 0x200, to mkProc (BSV: CSR, PC, Core, iMem, dMem).]
Print & Exit
[Figure: tb (C++) gets the cpuToHost value from the CSR of mkProc (BSV: CSR, PC, Core, iMem, dMem).]
c2hType:
- 1, 2, 3: print
- 0: exit
  - Data == 0: PASSED
  - Data != 0: FAILED
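The host-side behaviour described above can be sketched as a small message handler. This is an illustrative C++ model, not the course testbench: `HostState` and `handleCpuToHost` are hypothetical names, output is collected in a string instead of being printed, and the handler assumes that the low half of an integer arrives before the high half and that the integer is shown as unsigned decimal.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical host-side state for one test run.
struct HostState {
    std::string output;       // everything the program has printed so far
    uint32_t pendingLow = 0;  // low 16 bits waiting for the high half
    bool done = false;        // set once an exit code arrives
    bool passed = false;      // true when the exit code was 0
};

// Handle one cpuToHost message (c2hType codes as listed on the slide).
void handleCpuToHost(HostState& s, unsigned c2hType, uint16_t data) {
    switch (c2hType) {
    case 1:  // print character
        s.output.push_back(static_cast<char>(data));
        break;
    case 2:  // low 16 bits of an integer: hold until the high half arrives
        s.pendingLow = data;
        break;
    case 3:  // high 16 bits: combine with the stored low half and print
        s.output += std::to_string((static_cast<uint32_t>(data) << 16) |
                                   s.pendingLow);
        break;
    case 0:  // exit code: 0 means PASSED, anything else FAILED
        s.done = true;
        s.passed = (data == 0);
        s.output += s.passed ? "PASSED\n" : "FAILED\n";
        break;
    default:
        break;  // c2hType is only two bits wide, so this is unreachable
    }
}
```

A testbench loop would call such a handler for every value it receives and stop once `done` is set.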
Final Project
Overview
- Groups of 2-3 students
- Each group is assigned to a graduate mentor in our group
- Groups meet individually with Arvind, the mentor and me
- Weekly reports are due before the meeting
  - Email to 6.375-admin@mit.edu and the mentor
Schedule
Project Considerations
- Design a complex digital system
- Choose an application that could benefit from hardware acceleration or FPGAs
- The application should be well understood
  - Find/implement reference software code
- Look at past years' projects on the website
FPGA IPs and Resources
- Many Xilinx-related IPs are available in the BSV library
  - $BLUESPECDIR/BSVSource/Xilinx
  - BRAMs, DRAM, clock generators/buffers, LED controller, HDMI controller, LCD controller
- Can wrap Verilog libraries/IPs in BSV code using importBVI
  - Tutorial: http://wiki.bluespec.com/Home/ExperiencedUsers/Import-BVI
BRAMs on FPGAs
- Fast, small, on-chip distributed RAM on the FPGA
  - 1 cycle access latency
  - 36 Kbits x 1500 (approx) = ~6.75 MB total
  - Up to 2 ports
[Figure: a BRAM with Port A and Port B; each port takes a Request and returns a Resp.]
BRAMs in BSV Library
- 2-ported BRAM server: mkBRAM2Server()
- Large FIFOs: mkSizedBRAMFIFO()
- Large sync FIFOs: mkSyncBRAMFIFO()
- Primitive BRAM: mkBRAMCore2()
    import BRAM::*;

    BRAM_Configure cfg = defaultValue;
    cfg.memorySize = 1024*32; // define custom memorySize

    // instantiate a 32K x 16 bits BRAM module
    BRAM2Port#(UInt#(15), Bit#(16)) bram <- mkBRAM2Server(cfg);

    // Write data (a Bit#(16) value defined elsewhere) to address 1 via port A.
    rule doWrite;
        bram.portA.request.put( BRAMRequest{
            write: True,
            responseOnWrite: False,
            address: 15'h01,
            datain: data } );
    endrule
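The sizes quoted for BRAMs can be checked with a little arithmetic. The C++ sketch below is only a worked calculation under stated assumptions: it takes 1 Kbit as 1000 bits, which is what reproduces the approximate 6.75 MB total, and the function names are hypothetical. It also confirms that a 1024*32-entry memory needs the 15 address bits used in `UInt#(15)`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 1 Kbit = 1000 bits, which matches the ~6.75 MB figure.
constexpr uint64_t kBitsPerBram = 36 * 1000;
constexpr uint64_t kNumBrams = 1500;  // approximate count from the slide

// Total on-chip BRAM capacity in bytes.
constexpr uint64_t totalBramBytes() {
    return kBitsPerBram * kNumBrams / 8;
}

// Smallest number of address bits that can index `entries` locations.
constexpr unsigned addrBits(uint64_t entries, unsigned bits = 0) {
    return ((1ull << bits) >= entries) ? bits : addrBits(entries, bits + 1);
}

// Storage needed by a memory of `entries` words of `width` bits, in bytes.
constexpr uint64_t memoryBytes(uint64_t entries, uint64_t width) {
    return entries * width / 8;
}

static_assert(totalBramBytes() == 6750000, "36 Kbit x 1500 is 6.75 MB");
static_assert(addrBits(1024 * 32) == 15, "32K entries need 15 address bits");
static_assert(memoryBytes(1024 * 32, 16) == 65536, "32K x 16 bits is 64 KiB");
```

So the 32K x 16-bit memory in the example occupies 64 KiB, a small fraction of the total BRAM on the chip.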
DRAM on FPGA
- Large capacity (1 GB on VC707)
- Longer access latency, especially for random access
- BSV library at $BLUESPECDIR/BSVSource/
  - Xilinx/XilinxVC707DDR3.bsv
  - Misc/DDR3.bsv
- Not officially in the documentation
- Example code will be given as part of Lab 6
[Figure: off-chip DRAM connects through DDR3_Pins to the DRAM Controller IP on the FPGA; a BSV Wrapper exposes it to user logic as DDR3_User.]
DRAM Request/Response
512-bit wide user interface
DDR Request:
- Write: write or read
- Byteen: byte enable mask; selects which of the 8-bit bytes in the 512 bits will be written
- Address: DRAM address for 512-bit words
- Data: data to be written
DDR Response:
- Bit#(512) read data
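The effect of the byte enable mask can be shown with a small software model of one 512-bit word. This C++ sketch is illustrative only: `DramWord` and `maskedWrite` are hypothetical names, and it assumes that bit i of the mask guards byte i of the word, which the slide does not spell out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 512-bit DRAM word modelled as 64 bytes.
constexpr std::size_t kWordBytes = 64;
using DramWord = std::array<uint8_t, kWordBytes>;

// Apply a write to a stored word. Only bytes whose enable bit is set are
// overwritten; the others keep their old value.
// Assumption: bit i of byteen corresponds to byte i of the word.
void maskedWrite(DramWord& stored, const DramWord& data, uint64_t byteen) {
    for (std::size_t i = 0; i < kWordBytes; ++i) {
        if ((byteen >> i) & 1u) {
            stored[i] = data[i];
        }
    }
}
```

With 64 bytes per word the mask is 64 bits wide, so a full-word write uses an all-ones mask and a 32-bit update touches four adjacent enable bits.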
Indirect Memory Access
- The host CPU uses loads and stores to move data from host DRAM to the PCIe device (FPGA)
  - Low bandwidth, consumes CPU cycles
  - Used in SceMi: ~50 MB/s
[Figure: Host CPU, Host DRAM and the FPGA (with FPGA DRAM) connected by a bus; data passes through the Host CPU.]
Direct Memory Access (DMA)
- The host CPU sets up the DMA engine, which performs the data transfer
  - High bandwidth, minimal CPU involvement: 1-4 GB/s
  - Not supported by SceMi
[Figure: Host CPU, Host DRAM and the FPGA (with DMA Eng and FPGA DRAM) connected by a bus; the DMA engine moves the data.]
Connectal
- A SceMi alternative
- Open source hardware/software codesign library
  - Generates glue logic between software and hardware
  - Supports DMA
- https://github.com/cambridgehackers/connectal
- Guest lecture on this next Wed