Experiencing MicroBlaze Hardware and Software
Zhangxi Tan, Wei Xu, David Patterson
Computer Science Division, University of California, Berkeley
RAMP Summer Retreat, June 2006

Outline
- Introduction and Background
- Advantages of MicroBlaze
- Hardware Experiences with Multicore
- Lessons Learned from Software Porting
- Q&A

Introduction and Background
- Initial prototyping of the Internet in a Box (IIAB) project
  - IIAB: a RAMP-cluster-based distributed-system testbed at O(1000) nodes
  - MicroBlaze as the first processor and basic building block
- What have we done in Internet in a Box version 0?
  - A small cluster of Xilinx XUP boards
    - Virtex-II Pro XC2VP30-7: half the size of, and the same speed grade as, the XC2VP70 on BEE2
  - 4 MicroBlazes @ 100 MHz per FPGA, stable under heavy workload
  - More detail and a demo tomorrow
- Experiences with MicroBlaze

MicroBlaze Advantages
- Easy to use with EDK
  - Linux/GCC support (with limitations)
- High-"performance" softcore processor
  - Most instructions complete in 1 cycle
  - Shorter pipeline, higher working frequency (>100 MHz on Virtex-II)
    - LEON3: 7-stage pipeline, 5000 LUTs @ 90 MHz on Virtex-II
  - FPGA-optimized implementation
    - Fast-carry-chain MUXes, hardware multiplier
    - RLOC placement constraints

Poor-quality IP cores consume most of the development time. What else can I say here?
- Most IP cores are not multicore compatible (e.g. bus arbitration problems)
  - A long bug list: opb_ddr, mch_opb_ddr, opb_ethernet
  - Poorly written documentation makes things worse
- Open-source/commercial IP core bugs vs. open-source software bugs
  - More time to find the problem
  - More difficult to fix (fewer updates, smaller community)

Scaling difficulty inside a large FPGA (tussle with softcores)
- Timing becomes the second time killer!
  - Is 100 MHz the upper bound?
    - It took quite a while to get the 4-core design working
    - The 6-core design appears unstable under heavy load
    - 16 cores per FPGA on BEE2 might be ambitious!
  - Shared resources (e.g. the memory controller) become the critical components
    - Routing delay dominates (60%-70%)
    - Floorplanning highly connected components is hard
    - Too many fast-carry-chain-style MUXes in IP cores
      - One-level logic, so faster? No! Without RLOC constraints they make things even worse!
    - Be careful with signal naming in RTL code!
      - OPB0 and OPB1 will be treated as signals from the same bus, which affects register mapping
  - Place and route takes very long: O(hours)

Timing summary
- Adding a new IP core is not only a resource and functionality problem
- Embedding physical information in the RTL code is preferred
  - Can't write the code without timing/placement constraints (argument to other high-level synthesis tools)
- Advanced physical synthesis is preferred
  - EDK is easy to use, but not friendly to physical synthesis software (e.g. Synplify Premier, Precision Physical)
    - Tool compatibility issues
    - RTL information is hidden by non-standard netlist files (Xilinx NGC files)
    - Can't cross-probe between the RTL code and the mapped design
- For QoR and full control, EDK is not the best choice
  - The effort spent on timing tuning exceeds the signal-connection effort that EDK saves
  - What about RDL?
- An easy solution:
  - Lower the frequency!

Architecture limitations of MicroBlaze
- No MMU support
  - No protection among processes
  - Can't run a full version of Linux
- No double-precision floating point
  - No full floating-point library support in libc
- No atomic instructions
  - Hard to implement locks
  - Non-blocking FIFO instruction problem
- No cache coherence support
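
To illustrate why missing atomic instructions make locks hard, here is a minimal sketch of Peterson's algorithm, a classic way to get two-party mutual exclusion from plain loads and stores only. It is not taken from the slides, and the names `PetersonLock` and `run_counter_demo` are hypothetical. It also assumes both parties see each other's writes promptly and in order, which the CPython interpreter happens to provide here; on a multicore design without cache coherence the flags would presumably have to live in uncached shared memory.

```python
import threading
import time


class PetersonLock:
    """Two-party mutual exclusion built from plain reads and writes only.

    Assumption: writes become visible to the other party in program order.
    """

    def __init__(self):
        self.flag = [False, False]  # flag[i] is True while party i wants the lock
        self.turn = 0               # which party must wait when both want the lock

    def acquire(self, me):
        other = 1 - me
        self.flag[me] = True        # announce interest
        self.turn = other           # give priority to the other party
        while self.flag[other] and self.turn == other:
            time.sleep(0)           # spin, yielding so the other thread can run

    def release(self, me):
        self.flag[me] = False       # withdraw interest


def run_counter_demo(iterations=500):
    """Two threads increment a shared counter under the Peterson lock."""
    lock = PetersonLock()
    counter = [0]

    def worker(me):
        for _ in range(iterations):
            lock.acquire(me)
            counter[0] += 1         # critical section
            lock.release(me)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

The sketch only covers two parties and busy-waits, which hints at the cost: generalizing to four or more cores needs a more elaborate protocol, whereas a single atomic test-and-set instruction would make the lock a few lines long.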

The abuse of BRAM
- Most BRAM is used for caches
  - Why not use external SRAM?
- High power consumption
- High chip cost
- Unbearable place/route time

Different design goals

| Goal | Embedded system | Full-version OSes |
|---|---|---|
| Application support | Dedicated to certain apps | Generic, multiple apps |
| Types of application | Usually simple | From simple to complex |
| User model | Single user | Multiple users |
| Functionality | Good enough | The more the better |
| Multitasking | None, or cooperating processes | Competing processes |
| Resource efficiency | Extremely important | Less and less important |
| Major functionality of OS | A programming interface (an extended virtual machine) | Both a programming interface and resource management |
| Compatibility | Not important | Quite important |
| Ease of debugging | Not really a goal. Debugging usually requires special hardware. | As easy as possible |
| Programmers | Embedded system developers (mostly people from a hardware background) | Generic computer programmers |
| Changing software | "Firmware" | Easy |

The Missing 5%...
- No protection between processes
  - A nightmare for software debugging
- Lack of fork()
  - vfork() does not have the same semantics
  - pthreads sometimes work, at the cost of rewriting the application
- No shared library support
  - Applications suffer from jumbo file sizes
    - "Simple" i3 applications: 25 KB vs. 1.8 MB
  - Some applications will not run without shared libraries
    - Ruby interpreter (libdl)
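
The fork() semantics that ported applications tend to rely on can be sketched as follows. This is an illustration written for this transcript, not code from the project, and `fork_copy_demo` is a hypothetical name. It needs a Unix-like host with an MMU: after fork() the child works on a private copy of the parent's memory, so its writes are invisible to the parent. With vfork() the child borrows the parent's address space and the parent is suspended until the child calls exec or _exit, so code written in this style breaks.

```python
import os


def fork_copy_demo():
    """Return (value seen by the parent, value reported by the child)."""
    data = {"value": 1}
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: modify its private copy and report it through the pipe.
        os.close(read_fd)
        data["value"] = 2
        os.write(write_fd, str(data["value"]).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately, skipping the parent's cleanup handlers

    # Parent: its own copy of data is untouched by the child's write.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data["value"], child_value
```

Applications that keep running ordinary program logic in the child after fork() are the ones that need rewriting, for example into a thread-based structure, which matches the pthreads remark on the slide.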

"Auto"config
- Makefiles/configuration files cannot recognize the MicroBlaze target
- Too much architecture-dependent code in existing applications
- Running Java is hard
- Reconfigurable hardware confuses common build tools
  - Some exception handlers are crucial (e.g. unaligned access)

50 MIPS vs. 1000 MIPS
- Many applications are designed to run on CPUs faster than 1000 MIPS
  - Porting them is not straightforward
- Talking to the real world
  - i3 pings all fingers every "second"
    - How to dilate the "second"?
  - Many places to change in the code
  - Not done in this project
- Time dilation is our future work
  - Emulate machines over 1000 MIPS
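
The slide leaves time dilation as future work, so the following is only a sketch of the basic arithmetic, with a hypothetical `DilatedClock` class and an assumed dilation factor of 1000 / 50 = 20 taken from the two MIPS figures in the slide title. The idea is that a 50 MIPS core emulating a 1000 MIPS machine should see one "second" of target time for every 20 seconds of wall-clock time.

```python
import time


class DilatedClock:
    """Maps real (wall-clock) time to slower virtual time for the emulated target."""

    def __init__(self, dilation_factor, time_source=time.monotonic):
        if dilation_factor <= 0:
            raise ValueError("dilation_factor must be positive")
        self._factor = dilation_factor
        self._source = time_source      # injectable so the clock can be tested
        self._start = time_source()

    def now(self):
        """Virtual seconds elapsed since the clock was created."""
        return (self._source() - self._start) / self._factor

    def real_interval(self, virtual_seconds):
        """Real seconds to wait so that virtual_seconds pass on the target."""
        return virtual_seconds * self._factor


# Assumed factor: emulated speed / host speed = 1000 MIPS / 50 MIPS.
DILATION_FACTOR = 1000 / 50
```

With this factor, a periodic ping meant to fire every virtual second would be scheduled every `real_interval(1.0)` = 20 real seconds. The hard part noted on the slide remains: every place in the application that reads the clock or sets a timeout has to go through such a layer.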

Non-technical challenges
- Maturity level of tools
- The software community for MicroBlaze is too small
- Research code is even worse than general software
  - Portability
  - Conventions

Questions?