Building Multi-Processor FPGA Systems
Hands-on Tutorial to Using FPGAs and Linux
Chris Martin <cmartin@altera.com>
Member Technical Staff, Embedded Applications

Agenda
- Introduction
- Problem: How to Integrate Multi-Processor Subsystems
- Why...
  - Why would you do this?
  - Why use FPGAs?
- Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS
- Lab 2: Inter-Processor Communication and Shared Peripherals
- Lab 3: Locking and Tetris
- Building Hardware: FPGA Hardware Tools & Build Flow
- Building/Debugging Software: Software Tools & Build Flow
- References
- Q&A throughout

The Problem: Integrating Multi-Processor Subsystems
Given a system with multiple processor subsystems, these architecture decisions must be considered:
- Inter-processor communication
- Partitioning/sharing peripherals (locking required)
- Bandwidth and latency requirements
[Diagram: Processor Subsystem 1 (Periph 1, Periph 2, Periph 3) and Processor Subsystem 2 (Periph 1, Periph 3), with the links between them shown as open questions]

Why Do We Need to Integrate Multi-Processor Subsystems?
- May have inherited a processor subsystem from another development team or a 3rd party
  - Risk mitigation by reducing change
- Fulfill latency and bandwidth requirements
  - Real-time considerations
  - If the main processor is not real-time enabled, a real-time processor subsystem can be added
- Design partition / sandboxing
  - Break the system into smaller subsystems, each servicing a task
  - Smaller tasks are easier to design
- Leverage software resources
  - Sometimes a problem is solved in less time with a processor and software than with hardware design
  - Sequencers, state machines

Why Do We Want to Integrate with an FPGA? (Or Rather, HOW Can FPGAs Help?)
- Bandwidth and latency can be tailored
  - Addresses real-time aspects of the system solution
  - FPGA logic has flexible interconnect
  - Trade data width against clock frequency against latency
- Experimentation
  - Many processor subsystems can be implemented
  - Allows you to experiment with changing microprocessor subsystem hardware designs
  - Altera FPGA under the hood
  - However: generic Linux interfaces are used, so the techniques can be applied in any Linux system
[Diagram: Simple Multiprocessor System: ARM with Peripheral A, NIOS with Peripheral N, and a Shared Peripheral and Mailbox between them]
And why is Altera involved with Embedded Linux...

Why Is Altera Involved with Embedded Linux?
- More than 50% of FPGA designs include an embedded processor, and the share is growing (Source: Gartner, September 2010)
- Many embedded designs use Linux
- Open-source re-use
  - The Altera Linux Development Team actively contributes to the Linux kernel
[Chart: Design starts with an embedded processor vs. without an embedded processor]

SoCKit Board Architecture Overview
Lab focus:
- UART
- DDR3
- LEDs
- Buttons

SoC/FPGA Hardware Architecture Overview
- ARM-to-FPGA bridges
  - Data width configurable
- FPGA
  - 42K logic macros
  - Using no more than 14%
[Diagram: HPS side: two Cortex-A9 cores (each with I$ and D$), L2, DDR EMIF, DMA, ROM, UART, RAM, SD/MMC. AXI bridges: HPS2FPGA (32/64/128 bit), LWHPS2FPGA (32 bit), FPGA2HPS (32/64/128 bit). FPGA fabric ("soft logic"): SYS ID, RAM, GPIO, NIOS (32 bit)]

Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS
Topics covered:
- Configuring the FPGA from SD/MMC and U-Boot
- Booting Linux on the ARM Cortex-A9
- Configuring the Device Tree
- Resetting and booting the NIOS processor
- Building and compiling a simple Linux application
Key example code provided:
- C code for downloading NIOS code and resetting NIOS from ARM
- Using U-Boot to set ARM peripheral security bits
Full step-by-step instructions are included in the lab manual.

Lab 1: Hardware Design Overview
NIOS subsystem:
- 1 NIOS Gen 2 processor
- 64k combined instruction/data RAM (on-chip RAM)
- GPIO peripheral
ARM subsystem:
- 2 Cortex-A9 (only using 1)
- DDR3 external memory
- SD/MMC peripheral
- UART peripheral
[Diagram: Subsystem 1: Cortex-A9 with SD/MMC, EMIF, UART. Subsystem 2: NIOS 0 with RAM and GPIO. Legend: shared peripherals, dedicated peripherals]

Lab 1: Programmer View - Processor Address Maps

NIOS
  Address Base   Peripheral
  0xFFC0_2000    ARM UART
  0x0003_0000    GPIO (LEDs)
  0x0002_0000    System ID
  0x0000_0000    On-chip RAM

ARM Cortex-A9
  Address Base   Peripheral
  0xFFC0_2000    UART
  0xC003_0000    GPIO (LEDs)
  0xC002_0000    System ID
  0xC000_0000    On-chip RAM

Lab 1: Peripheral Registers

  Peripheral  Offset  Access  Bit Definitions
  Sys ID      0x0     RO      [31:0] System ID. Lab default = 0x00001ab1
  GPIO        0x0     R/W     [31:0] Drive GPIO output. The lab uses it for LED control,
                              push button status and NIOS processor resets (from ARM).
                              [3:0] LED 0-3 control. '0' = LED off, '1' = LED on
                              [4] NIOS 0 reset
                              [5] NIOS 1 reset
                              [1:0] Push button status
  UART        0x14    RO      Line Status Register
                              [5] TX FIFO empty
                              [0] Data ready (RX FIFO not empty)
  UART        0x30    R/W     Shadow Receive Buffer Register
                              [7:0] RX character from serial input
  UART        0x34    R/W     Shadow Transmit Register
                              [7:0] TX character to serial output
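
The bit definitions in the register table can be captured as named masks so that LED and reset updates do not disturb neighbouring bits. This is a minimal sketch: the bit positions come from the table, but every macro and function name here is a hypothetical choice for illustration, not taken from the lab sources, and the reset polarity is left to the caller because the slide does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from the Lab 1 register table. All names below are
 * illustrative assumptions, not identifiers from the lab code. */
#define GPIO_LED_MASK        0x0000000Fu  /* [3:0] LED 0-3, '1' = LED on */
#define GPIO_NIOS0_RESET     (1u << 4)    /* [4] NIOS 0 reset */
#define GPIO_NIOS1_RESET     (1u << 5)    /* [5] NIOS 1 reset */
#define UART_LSR_DATA_READY  (1u << 0)    /* [0] RX FIFO not empty */
#define UART_LSR_TX_EMPTY    (1u << 5)    /* [5] TX FIFO empty */

/* Return 'reg' with the LED field replaced by 'leds'; other bits are kept. */
static uint32_t gpio_set_leds(uint32_t reg, uint32_t leds)
{
    assert(leds <= GPIO_LED_MASK);
    return (reg & ~GPIO_LED_MASK) | leds;
}

/* Return 'reg' with the reset bit of NIOS 'nios' (0 or 1) set to 'value'.
 * Whether 1 or 0 holds the processor in reset is not stated on the slide. */
static uint32_t gpio_write_reset_bit(uint32_t reg, unsigned nios, int value)
{
    assert(nios <= 1);
    uint32_t bit = (nios == 0) ? GPIO_NIOS0_RESET : GPIO_NIOS1_RESET;
    return value ? (reg | bit) : (reg & ~bit);
}
```

A caller would read the GPIO register, pass the value through one of these helpers, and write the result back, which keeps the read-modify-write in one place.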

Lab 1: Processor Resets Via Standard Linux GPIO Interface
- NIOS resets are connected to GPIO
- The GPIO driver uses the /sys/class/gpio interface

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define MAX_BUF 64

    int main(int argc, char** argv)
    {
        int fd, gpio = 168;
        char buf[MAX_BUF];

        /* Export: echo ### > /sys/class/gpio/export */
        fd = open("/sys/class/gpio/export", O_WRONLY);
        sprintf(buf, "%d", gpio);
        write(fd, buf, strlen(buf));
        close(fd);

        /* Set direction to out: */
        /* echo "out" > /sys/class/gpio/gpio###/direction */
        sprintf(buf, "/sys/class/gpio/gpio%d/direction", gpio);
        fd = open(buf, O_WRONLY);
        write(fd, "out", 3);   /* write(fd, "in", 2); */
        close(fd);

        /* Set GPIO output high or low */
        /* echo 1 > /sys/class/gpio/gpio###/value */
        sprintf(buf, "/sys/class/gpio/gpio%d/value", gpio);
        fd = open(buf, O_WRONLY);
        write(fd, "1", 1);     /* write(fd, "0", 1); */
        close(fd);

        /* Unexport: echo ### > /sys/class/gpio/unexport */
        fd = open("/sys/class/gpio/unexport", O_WRONLY);
        sprintf(buf, "%d", gpio);
        write(fd, buf, strlen(buf));
        close(fd);

        return 0;
    }

Lab 1: Loading External Processor Code Via Standard Linux Shared Memory (mmap)
- NIOS RAM address is accessed via mmap()
- Can be shared with other processes
- R/W during load
- Read-only protection after load

    /* Map physical address of NIOS RAM to a virtual address segment
       with read/write access */
    fd = open("/dev/mem", O_RDWR);
    load_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE,
                        MAP_SHARED, fd, 0xc0000000);

    /* Set size of code to load */
    load_size = sizeof(nios_code)/sizeof(nios_code[0]);

    /* Load NIOS code */
    for (i = 0; i < load_size; i++) {
        *(load_address + i) = nios_code[i];
    }

    /* Set load address segment to read-only */
    mprotect(load_address, 0x10000, PROT_READ);

    /* Un-map load address segment */
    munmap(load_address, 0x10000);
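
The load sequence above omits error handling for brevity. A hedged sketch of the same steps with return-value checks might look like the following. The function name `load_code` and its parameters are hypothetical; on the lab board the path would be "/dev/mem" with base 0xc0000000 and span 0x10000, as in the slide, while any regular file of sufficient size can stand in for testing on a PC.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy 'nwords' 32-bit words into the region [base, base + span) of the
 * file or device at 'mem_path'. Returns 0 on success, -1 on any failure.
 * Hypothetical helper for illustration; not part of the lab sources. */
static int load_code(const char *mem_path, off_t base, size_t span,
                     const uint32_t *code, size_t nwords)
{
    assert(mem_path != NULL && code != NULL);

    /* Refuse images that do not fit in the mapped window. */
    if (nwords > span / sizeof(uint32_t))
        return -1;

    int fd = open(mem_path, O_RDWR);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, span, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    /* Word-by-word copy through a volatile pointer, as in the slide's loop. */
    volatile uint32_t *dst = map;
    for (size_t i = 0; i < nwords; i++)
        dst[i] = code[i];

    /* Drop write permission after the load, then release the mapping. */
    int rc = mprotect(map, span, PROT_READ);
    if (munmap(map, span) != 0)
        rc = -1;
    return rc;
}
```

Checking the image size before mapping avoids writing past the 64k window if a larger NIOS binary is built later.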

Lab 2: Mailboxes - NIOS/ARM Communication
Topics covered:
- Altera Mailbox hardware IP
Key example code provided:
- C code for sending/receiving messages via the hardware Mailbox IP (NIOS and ARM C code)
- Simple message protocol
- Simple command parser
Full step-by-step instructions are included in the lab manual.
- User to add second NIOS processor mailbox control.

Lab 2: Hardware Design Overview
NIOS 0 & 1 subsystems:
- NIOS Gen 2 processor
- 64k combined instruction/data RAM
- GPIO (4 out, LED)
- GPIO (2 in, buttons)
- Mailbox
ARM subsystem:
- 2 Cortex-A9 (only using 1)
- DDR3 external memory
- SD/MMC peripheral
- UART peripheral
[Diagram: Subsystem 1: Cortex-A9 with SD/MMC, EMIF, UART. Subsystems 2 and 3: NIOS 0 and NIOS 1, each with RAM and GPIO. GPIO and MBox sit between the subsystems. Legend: shared peripherals, dedicated peripherals]

Lab 2: Programmer View - Processor Address Maps

NIOS 0 & 1
  Address Base   Peripheral
  0xFFC0_2000    ARM UART
  0x0007_8000    Mailbox (from ARM)
  0x0007_0000    Mailbox (to ARM)
  0x0005_0000    GPIO (In Buttons)
  0x0003_0000    GPIO (Out LEDs)
  0x0002_0000    System ID
  0x0000_0000    On-chip RAM

ARM Cortex-A9
  Address Base   Peripheral
  0xFFC0_2000    UART
  0x0007_8000    Mailbox (to NIOS 1)
  0x0007_0000    Mailbox (from NIOS 1)
  0x0006_8000    Mailbox (to NIOS 0)
  0x0006_0000    Mailbox (from NIOS 0)
  0xC003_0000    GPIO (LEDs)
  0xC002_0000    System ID
  0xC001_0000    NIOS 1 RAM
  0xC000_0000    NIOS 0 RAM

Lab 2: Additional Peripheral (Mailbox) Registers

  Peripheral  Offset  Access  Bit Definitions
  Mailbox     0x0     R/W     [31:0] RX/TX data
  Mailbox     0x8     R/W     [1] RX message queue has data
                              [0] TX message queue empty

Lab 2: Designing a Simple Message Protocol
Design decisions:
- Short length: a single 32-bit word
- Human readable
- Message transactions are closed-loop (includes ACK/NACK)
Format:
- Message length: four bytes
- First byte is an ASCII character denoting the message type
- Second byte is an ASCII char from 0-9 denoting the processor number
- Third byte is an ASCII char from 0-9 denoting message data, except for ACK/NACK
- Fourth byte is always the null character to terminate the string (human readable)
Message types:
- "G00": Give access to UART (push)
- "A1A": ACK
- "N1A": NACK
Can be extended:
- "L00": LED set/ready
- "B00": Button pressed
- "R00": Request UART access (pull)
[Table: byte layout, Byte 0 to Byte 3, with example rows beginning 'L' '0' and 'A' '0']
[Diagram: Cortex-A9 sends "G00" to NIOS 0; NIOS 0 replies "A0A" or "N0N"]
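
The four-byte format described above can be packed into and unpacked from one 32-bit mailbox word. The sketch below is illustrative only: the function names are assumptions, and it uses memcpy rather than a pointer cast so the conversion does not depend on the alignment of the string.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a message word: type char, processor digit, data char, NUL.
 * Hypothetical helpers; the lab code may structure this differently. */
static uint32_t msg_encode(char type, unsigned proc, char data)
{
    assert(proc <= 9);                       /* single ASCII digit */
    char bytes[4] = { type, (char)('0' + proc), data, '\0' };
    uint32_t word;
    memcpy(&word, bytes, sizeof word);       /* byte 0 lands at the lowest address */
    return word;
}

/* Unpack a message word into a printable, NUL-terminated string. */
static void msg_decode(uint32_t word, char out[4])
{
    memcpy(out, &word, 4);
    out[3] = '\0';                           /* enforce termination */
}

/* True if 'word' is the ACK for processor 'proc', e.g. "A0A" for NIOS 0. */
static int msg_is_ack(uint32_t word, unsigned proc)
{
    return word == msg_encode('A', proc, 'A');
}
```

For example, `msg_encode('G', 0, '0')` produces the word for "G00", and `msg_decode` turns a received word back into a string that can be printed directly.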

Lab 2: Inter-Processor Communication with Mailbox HW Via Standard Linux Shared Memory (mmap)
- Wait for the Mailbox hardware message-empty flag
- Send message (4 bytes)
- Disable ARM/Linux access to UART
- Wait for the RX message-received flag
- Re-enable ARM/Linux UART access

    /* Map physical address of Mailbox to a virtual address segment
       with read/write access */
    fd = open("/dev/mem", O_RDWR);
    mbox0_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE,
                         MAP_SHARED, fd, 0xff260000);
    <snip>
    /* Offsets below assume mbox0_address points to 32-bit words:
       +0x2000 is byte offset 0x8000, +2 is the status register at 0x8 */

    /* Wait for message queue to empty */
    while ((*(volatile int *)(mbox0_address + 0x2000 + 2) & 1) != 0) {}

    /* Send Granted/Go message to NIOS */
    send_message = "G00";
    *(mbox0_address + 0x2000) = *(int *)send_message;

    /* Disable ARM/Linux access to UART (be careful here) */
    config.c_cflag &= ~CREAD;
    if (tcsetattr(fd, TCSAFLUSH, &config) < 0) { }

    /* Wait for received message */
    while ((*(volatile int *)(mbox0_address + 2) & 2) == 0) {}

    /* Re-enable UART access */
    config.c_cflag |= CREAD;
    tcsetattr(fd, TCSAFLUSH, &config);

    /* Read received message */
    printf(" - Message Received. DATA = '%s'.\n", (char *)(mbox0_address));
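
The busy-wait loops above spin forever if the other processor never answers. One possible refinement, sketched here under stated assumptions, is to bound the polling. The helper names and the `max_polls` parameter are hypothetical; the register layout (data at word 0, status at word 2) and the polling polarity are copied from the slide code, not from the Mailbox IP documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_DATA_WORD    0          /* byte offset 0x0: RX/TX data */
#define MBOX_STATUS_WORD  2          /* byte offset 0x8: status */
#define MBOX_STATUS_BIT0  (1u << 0)  /* slide code waits while this bit is set */
#define MBOX_STATUS_RX    (1u << 1)  /* RX message queue has data */

/* Write 'msg' once the TX side is free. Returns 0 on success, or -1 if
 * the status bit is still set after 'max_polls' additional reads. */
static int mbox_send(volatile uint32_t *mbox, uint32_t msg, unsigned max_polls)
{
    assert(mbox != NULL);
    while (mbox[MBOX_STATUS_WORD] & MBOX_STATUS_BIT0) {
        if (max_polls-- == 0)
            return -1;
    }
    mbox[MBOX_DATA_WORD] = msg;
    return 0;
}

/* Read one message into '*msg'. Returns 0 on success, or -1 if no
 * message arrives within the polling budget. */
static int mbox_recv(volatile uint32_t *mbox, uint32_t *msg, unsigned max_polls)
{
    assert(mbox != NULL && msg != NULL);
    while ((mbox[MBOX_STATUS_WORD] & MBOX_STATUS_RX) == 0) {
        if (max_polls-- == 0)
            return -1;
    }
    *msg = mbox[MBOX_DATA_WORD];
    return 0;
}
```

Because the helpers take a plain pointer, they can be exercised against an ordinary array on a PC before being pointed at the mmap()ed mailbox on the board.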

Lab 3: Putting It All Together - Tetris! Combining Locking and Communication
Topics covered:
- Linux mutex
Key example code provided:
- C code showing mutexes used to lock shared peripheral access
- C code for multiple processor subsystem bring-up and shutdown
Full step-by-step instructions are included in the lab manual.
- User to add code for second NIOS processor bring-up, shutdown and locking/control.

Lab 3: Hardware Design Overview (Same As Lab 2)

Lab 3: Programmer View - Processor Address Maps (same as Lab 2)

Available Linux Locking/Synchronization Mechanisms
Need to share peripherals: choose a locking mechanism.
Available in Linux:
- Mutex (chosen for this lab)
- Completions
- Spinlocks
- Semaphores
- Read-copy-update (decent for multiple readers, single writer)
- Seqlocks (decent for multiple readers, single writer)
Available for Linux:
- MCAPI: openmcapi.org

Tetris Message Protocol - Extended from Lab 2
NIOS control flow:
- Wait for button press
- Send button press message
- Wait for ACK (free to write to LED GPIO)
- Write to LED GPIO
- Send LED ready message
- Wait for ACK
ARM control flow:
- Wait for button press message
- Lock LED GPIO peripheral
- Send ACK (free to write to LED GPIO)
- Wait for LED ready message
- Send ACK
- Read LED value
- Release lock/mutex
[Diagram: message exchange with the Cortex-A9. NIOS 0: "B00", "A0A", "L00", "A0A". NIOS 1: "B10", "A1A", "L10", "A1A"]

Lab 3: Locking Hardware Peripheral Access Via Linux Mutex
- In this example, the LED GPIO is accessed by multiple processors
- Wrap the LED critical section (LED status reads) with:
  - pthread_mutex_lock()
  - pthread_mutex_unlock()
- Also need mutex init/destroy:
  - pthread_mutex_init()
  - pthread_mutex_destroy()

    pthread_mutex_t lock;

    <snip - Initialize/create/start>
    /* Initialize mutex */
    err = pthread_mutex_init(&lock, NULL);

    /* Create 2 threads (one per NIOS processor) */
    i = 0;
    while (i < 2) {
        err = pthread_create(&(tid[i]), NULL, &nios_buttons_get, &(nios_num[i]));
        i++;
    }

    <snip - Critical Section>
    pthread_mutex_lock(&lock);
    /* Critical section */
    pthread_mutex_unlock(&lock);

    <snip - Stop/Destroy>
    /* Wait for threads to complete */
    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);

    /* Destroy/remove lock */
    pthread_mutex_destroy(&lock);
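
The snippets above can be assembled into one complete unit. The sketch below is not the lab's Tetris code: a plain counter stands in for the shared LED GPIO, and the names `worker`, `run_workers` and `led_accesses` are assumptions made for illustration. It only shows the init, create, lock/unlock, join, destroy sequence end to end.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 2                 /* one thread per NIOS processor */

static pthread_mutex_t led_lock;
static unsigned long led_accesses;    /* stand-in for the shared LED GPIO */

struct worker_arg {
    unsigned long iterations;
};

/* Each worker touches the shared resource only while holding the lock. */
static void *worker(void *arg)
{
    const struct worker_arg *w = arg;
    assert(w != NULL);
    for (unsigned long i = 0; i < w->iterations; i++) {
        pthread_mutex_lock(&led_lock);
        led_accesses++;               /* critical section */
        pthread_mutex_unlock(&led_lock);
    }
    return NULL;
}

/* Run NUM_WORKERS threads; return the total number of protected accesses,
 * or -1 if the mutex or any thread could not be created. */
static long run_workers(unsigned long iterations)
{
    pthread_t tid[NUM_WORKERS];
    struct worker_arg arg = { iterations };
    int started = 0;

    led_accesses = 0;
    if (pthread_mutex_init(&led_lock, NULL) != 0)
        return -1;

    for (int i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tid[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&led_lock);
    return (started == NUM_WORKERS) ? (long)led_accesses : -1;
}
```

Joining only the threads that were actually created avoids calling pthread_join() on an uninitialized thread ID if pthread_create() fails part-way through. On older toolchains the program may need to be built with `-pthread`.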
References
Altera References
System design tutorials:
- http://www.alterawiki.com/wiki/Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab__Creating_Your_AXI3_Component
- Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab
- Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop
- http://www.alterawiki.com/wiki/Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop_-_LAB2
Multiprocessor NIOS-only tutorial:
- http://www.altera.com/literature/tt/tt_nios2_multiprocessor_tutorial.pdf
Quartus Handbook:
- https://www.altera.com/en_US/pdfs/literature/hb/qts/quartusii_handbook.pdf
Qsys:
- System Design with Qsys (PDF) section in the Handbook
- Qsys Tutorial: step-by-step procedures and design example files to create and verify a system in Qsys
- Qsys 2-day instructor-led class: System Integration with Qsys
- Qsys webcasts and demonstration videos
SoC Embedded Design Suite User Guide:
- https://www.altera.com/en_US/pdfs/literature/ug/ug_soc_eds.pdf

Related Articles
Performance Analysis of Inter-Processor Communication Methods:
- http://www.design-reuse.com/articles/24254/inter-processor-communication-multi-core-processors-reconfigurable-device.html
Communicating Efficiently between QorIQ Cores in Medical Applications:
- https://cache.freescale.com/files/32bit/doc/brochure/PWRARBYNDBITSCE.pdf
Linux Inter-Process Communication:
- http://www.tldp.org/LDP/tlk/ipc.html
Linux locking mechanisms (from ARM):
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0425/ch04s07s03.html
OpenMCAPI:
- https://bitbucket.org/hollisb/openmcapi/wiki/Home
Mutex examples:
- http://www.thegeekstuff.com/2012/05/c-mutex-examples/

Thank You
Full tutorial resources online:
- Project wiki page: http://rocketboards.org/foswiki/Projects/BuildingMultiProcessorSystems
Includes:
- Source code
- Hardware source
- Hardware Quartus projects
- Software Eclipse projects
BACKUP SLIDES
Post-Lab 1 Additional Topics Hardware Design Flow and FPGA Boot with U-boot and SD/MMC
Building Hardware: Qsys (Hardware System Design Tool) User Interface
[Screenshot: Qsys window showing interfaces exported in/out of the system and connections between cores]

Hardware and Software Work Flow Overview
Tools: Quartus & Qsys, Preloader & U-Boot, Device Tree, Eclipse DS-5 & debug tools
Inputs:
- Hardware design (Qsys or RTL or both)
Outputs (to load on boot media):
- Preloader and U-Boot images
- FPGA programming file: Raw Binary Format (RBF)
- Device Tree Blob

SD Card Layout
Partition 1: FAT
- U-Boot scripts
- FPGA HW designs (RBF)
- Device Tree Blobs
- zImage
- Lab material
Partition 2: EXT3
- Rootfs
Partition 3: Raw
- U-Boot/preloader
Partition 4: EXT3
- Kernel src

Updating SD Cards
File: zImage, soc_system.rbf, soc_system.dtb, u-boot.scr
Update procedure: mount DOS SD card partition 1 and replace the file with the new one:
    $ sudo mkdir sdcard
    $ sudo mount /dev/sdx1 sdcard/
    $ sudo cp <file_name> sdcard/
    $ sudo umount sdcard
File: preloader-mkpimage.bin
    $ sudo dd if=preloader-mkpimage.bin of=/dev/sdx3 bs=64k seek=0
File: u-boot-socfpga_cyclone5.img
    $ sudo dd if=u-boot-socfpga_cyclone5.img of=/dev/sdx3 bs=64k seek=4
File: root filesystem
    $ sudo dd if=altera-gsrd-image-socfpga_cyclone5.ext3 of=/dev/sdx2
More info on Rocketboards.org:
- http://www.rocketboards.org/foswiki/Documentation/GSRD141SdCard
Automated Python script to build SD cards:
- make_sdimage.py
Post-Lab 2 Additional Topic Using Eclipse to Debug: NIOS Software Build Tools
Altera NIOS Software Design and Debug Tools
Nios II SBT for Eclipse key features:
- New project wizards and software templates
- Compiler for C and C++ (GNU)
- Source navigator, editor, and debugger
- Eclipse project-based tools
- Download code to hardware

Key Multi-Processor System Design Points
Startup/shutdown (covered in Lab 1):
- Processor
- Peripheral
Communication between processors (covered in Lab 2):
- What is the physical link?
- What is the protocol and messaging method?
- Message bandwidth and latency
Partitioning peripherals (covered in Lab 3):
- Declare dedicated peripherals: only connected to/controlled by one processor
- Declare shared peripherals: connected to/controlled by multiple processors
- Decide upon a locking mechanism

Post-Lab 3 Additional Topic: Altera SoC Embedded Design Suite

Altera Software Development Tools
Eclipse:
- For ARM Cortex-A9 (ARM Development Studio 5 - Altera Edition)
- For NIOS
Pre-loader/U-Boot Generator
Device Tree Generator
Bare-metal libraries
Compilers:
- GCC (for ARM and NIOS)
- ARMCC (for ARM, with license)
Linux specific:
- Kernel sources
- Yocto & Angstrom recipes: http://rocketboards.org/foswiki/Documentation/AngstromOnSoCFPGA_1
- Buildroot: http://rocketboards.org/foswiki/Documentation/BuildrootForSoCFPGA

System Development Flow
FPGA design flow (hardware development):
- Design: Quartus II design software; Qsys system integration tool; standard RTL flow; Altera and partner IP
- Simulate: ModelSim, VCS, NCSim, etc.; AMBA-AXI and Avalon bus functional models (BFMs)
- Debug: SignalTap II logic analyzer; System Console
- Release: Quartus II Programmer; in-system update
Software design flow (software development):
- Design: Eclipse; GNU toolchain; OS/BSP: Linux, VxWorks; Hardware Libraries; design examples
- Debug: GDB, Lauterbach, Eclipse
- Release: Flash Programmer

Inside the Golden System Reference Design
Complete system example design with Linux software support.
Target boards:
- Altera SoC Development Kits
- Arrow SoC Development Kits
- Macnica SoC Development Kits
Hardware design:
- Simple custom logic design in FPGA
- All source code and Quartus II / Qsys design files for reference
Software design:
- Includes Linux kernel and application source code
- Includes all compiled binaries

Topics - Back Up
Introductions: Altera and SoC FPGAs
Development tools:
- How to build hardware: FPGA hardware tools & build flow
- How to build software: software tools & build flow
- How to debug all of the above: debug tools
Key multi-processor system design points
Hardware design:
- Shared peripherals
- Available hardware IP
Software design:
- Message protocols
- Linux tools/mechanisms available today
Quartus – Hardware Development Tool
Quartus II User Interface
The Quartus II main window provides a high level of visibility into each stage of the design flow:
- Project Navigator provides direct visual access to most of the key project information
- Tasks window lets you use the tools and features of the Quartus II software and monitor their progress from a flow-based layout
- Tool View window shows various tools and design files
- Messages window outputs messages from each process of the run
[Screenshot: Project Navigator, Tool View window, Tasks window, Messages window]

Typical Hardware Design Flow
- Design entry/RTL coding and early pin planning
  - Behavioral or structural description of the design
  - Early pin planning allows board development in parallel
- Functional verification
  - Verify design behavior
- Synthesis (mapping)
  - Translate design into device-specific primitives
  - Optimization to meet required area and performance constraints
- Placement and routing (fitting)
  - Place design in specific device resources with reference to area and performance constraints
  - Connect resources with routing lines
- Timing analysis
  - Verify performance specifications were met
  - Static timing analysis
- Verify the design will work in the target technology
- PC board simulation and test
  - Simulate board design
  - Program and test device on board
  - On-chip tools for debugging
[Flow sidebar: Project definition, Project creation, Design creation, Functional verification, Design compilation (Logic, Memory, I/O), Functional verification, In-system debug]

Quartus II Feature Overview
Fully integrated development tool:
- Multiple design entry methods
  - Includes intellectual property (IP) based system design
- Up-front I/O assignment and validation
  - Enables printed circuit board (PCB) layout early in the design process
- Incremental compilation
  - Reduces design compilation time and improves timing closure
- Logic synthesis
  - Includes a comprehensive integrated synthesis solution
  - Advanced integration with third-party EDA synthesis software
- Timing-driven placement and routing
- Physical synthesis
  - Improves performance without user intervention
- Verification solution
  - TimeQuest timing analyzer
  - PowerPlay power analysis and optimization
  - Functional simulation
- On-chip debug and verification suite

Quartus II Feature Overview (1/2)

  Feature                           Quartus II Software
  Project creation                  New project wizard
  Design entry                      HDL editor; schematic editor; state machine editor;
                                    MegaWizard Plug-In Manager (customization and generation of IP);
                                    Qsys system integration tool
  Design constraint assignments     Assignment editor; pin planner;
                                    Synopsys Design Constraint (SDC) editor
  Synthesis                         Quartus II Integrated Synthesis (QIS);
                                    third-party EDA synthesis; design assistant
  Fitting and placing design into   Fitter (including physical synthesis)
  FPGA to meet user requirements
  Design analysis and debug         Netlist viewers; advisors
  Power analysis                    PowerPlay power analyzer

Quartus II Feature Overview (2/2)

  Feature                           Quartus II Software
  Static timing analysis on         TimeQuest timing analyzer
  post-fitted design
  Viewing and editing design        Chip Planner
  placement
  Functional verification           ModelSim-Altera edition; third-party EDA simulation tools
  Generation of device              Assembler
  programming file
  On-chip debug and verification    SignalTap II (embedded logic analyzer);
                                    in-system memory content editor;
                                    logic analyzer interface editor;
                                    in-system sources and probes editor;
                                    SignalProbe pins; Transceiver Toolkit;
                                    external memory interface toolkit
  Techniques to optimize design     Quartus II incremental compilation;
  and improve productivity          physical synthesis optimization;
                                    Design Space Explorer (DSE)

Quartus II Subscription Edition vs. Web Edition

Devices supported
- Subscription Edition:
  - MAX series devices: all (excluding MAX 7000 / 3000)
  - Cyclone III/IV/V FPGAs: all
  - Arria II/V FPGAs: all
  - Stratix III, IV, V FPGAs: all
  - Cyclone V SoCs: all
- Web Edition:
  - MAX series devices: all (excluding MAX 7000 / 3000)
  - Cyclone V FPGAs: all (excluding 5CEA9, 5CGXC9, and 5CGTD9)
  - Cyclone III/IV FPGAs: all
  - Arria II GX FPGA: EP2AGX45
  - Cyclone V SoCs: all

  Feature                                Subscription Edition        Web Edition
  Incremental compilation and            Yes                         No
  team-based design; SSN Analyzer;
  Transceiver Toolkit
  SignalTap II, SignalProbe              Yes                         If TalkBack feature is enabled
  Multi-processor support                Yes                         If TalkBack feature is enabled
  IP Base Suite MegaCore functions       Yes                         No license required for OpenCore Plus
                                                                     hardware evaluation; license fee
                                                                     required for production use
  Platform support                       Windows 32/64-bit, Linux 32/64-bit (both editions)
  License and maintenance terms          Perpetual (continues to     No license required except
                                         work after expiration)      for IP core
  Price                                  $                           Free

How to Get Started Using Quartus II Software
Download Quartus II software today and start designing with Altera programmable logic devices.
Quartus II Handbook: http://www.altera.com/literature/lit-qts.jsp
- Guides you through the programmable logic design cycle from design to verification
- Also covers third-party EDA vendor tool interfaces
Online demonstrations: http://www.altera.com/quartusdemos
- Easiest way to learn about the latest Quartus II software features and design flows
Training classes: https://mysupport.altera.com/etraining
- Offers online training classes and live presentations coupled with hands-on exercises to learn about Quartus II features and design flows
Qsys – System Integration Platform
Qsys System Integration Platform
Qsys is Altera's design environment for:
- Deployment of IP, with hierarchical support
- Development platform for Altera custom solutions
- Design platform for customers to quickly create system designs
[Diagram labels: High-performance interconnect based on Network-on-a-Chip (NoC) architecture; design reuse (design system, package as IP, add to library); hierarchy; real-time system debug; automated testbench generation; industry-standard interfaces (AMBA AXI, APB, AHB; Avalon interfaces)]

Qsys User Interface
[Screenshot: Qsys window with callouts: interfaces exported for hierarchy, toolbar, improved validation display]

Qsys Benefits
- Raises the level of design abstraction
  - System-level design and system visualization
- Simplifies complex hierarchical system development
  - Automated interconnect generation
- Provides a standard platform
  - IP integration, custom IP authoring, IP verification
- Enables design re-use
- Reduces time to market
  - System-level design reduces development time
  - Facilitates verification
Qsys improves productivity.

Network-on-Chip Architecture
- Transaction layer (master side): converts transactions to command packets, and response packets to responses
- Transport layer: transfers packets to their destination
- Transaction layer (slave side): converts command packets to transactions, and responses to response packets
[Diagram: Master Interface (Avalon-MM, AXI-MM) -> Master Network Interface -> Avalon-ST Network (Command) -> Slave Network Interface -> Slave Interface (Avalon-MM, AXI-MM); responses return through the Avalon-ST Network (Response)]

Benefits of Network-on-Chip Approach
See white paper: Applying the Benefits of NoC Architecture to FPGA System Design
- Independent implementation of transaction/transport layers
  - Different transport layer network topologies can be implemented without transaction layer modification, e.g. high-performance components on a wide, high-frequency crossbar network
- Supports standard interface interoperability
  - Mix and match interface types on the transaction layer without transport layer modification
- Scalability
  - Segment the network into sub-networks using bridges and clock crossing logic

Industry-Standard Interfaces
Standard interface protocols (by developer):
- Avalon interfaces
- AMBA AXI, AMBA APB, and AMBA AHB
Qsys supports mixing of different interfaces.

Target Qsys Applications
Qsys can be used in almost every FPGA design. Designs fall into two categories:
- Control plane
  - Memory mapped
  - Reading and writing to control and status registers
- Data plane
  - Streaming
  - Data switching (muxing, demuxing), aggregation, bridges

"Packets... I Care About Latency!"
- The Qsys packet format is wide
  - The packet format contains a complete transaction in a single clock cycle
  - Supports writes with 0 cycles of latency and reads with a round-trip latency of 1 cycle
  - You can control latency via Qsys configuration
- Separate command and response networks
  - Increases concurrency: command traffic and response traffic don't compete for resources

Qsys: Wide Range of Compliant IP
Wide range of plug-and-play intellectual property (IP):
- Interface protocol IP, e.g. PCIe, Ethernet 10/1000 Mbps (Triple-Speed Ethernet), Interlaken, JTAG, UART, SPI
- External memory interface IP, e.g. DDR/DDR2/DDR3
- Video and imaging processing (VIP) IP, e.g. VIP Suite including scaler, switch, deinterlacer, and alpha blending mixer
- Embedded processor IP, e.g. hardened ARM processor system, Nios II processor
- Verification IP, e.g. Avalon-MM/-ST, AXI4, APB
More than 100 Qsys-compliant IP cores available.

Qsys as a Platform for System Integration
Library of available IP:
- Interface protocols
- Memory
- DSP
- Embedded
- Bridges
- PLL
- Custom systems
[Diagram: IP 1, IP 2, IP 3, Custom 1 and Custom 2 blocks connected into a system and emitted as HDL. Callouts: connect IP and systems; accelerate development; simplify integration; automate error-prone integration tasks]

Additional Resources
- Watch online demos (3-5 min): www.altera.com/qsys
- Complete the Qsys tutorial (2-3 hrs): www.altera.com/qsys
- Watch free webcasts (10-15 min): www.altera.com/qsys
- Sign up for Qsys training: www.altera.com/training
In-system Verification
Debug Challenges
- Accessing and viewing internal signals
- Not enough pins to use as test points
- Creating trigger conditions that correctly capture data
- Verification of standard or proprietary protocol interfaces
- Overall design process bottleneck
Debug can be costly.

On-chip Debug
- Access and view internal signals
- Store captured data in FPGA embedded memory
- Use the JTAG interface as the debug port
- Incrementally add internal signals to view
Reduce debug cycles by using on-chip debug tools.

On-chip Debug Technology
- Debug tools communicate with the FPGA via the standard JTAG interface
- Multiple debug functions can share the JTAG interface simultaneously
  - Altera's system-level debugging (SLD) hub technology makes this possible
  - All Altera tools and some third-party tools support the SLD hub
[Diagram: Download cable -> JTAG interface -> FPGA: JTAG TAP controller -> SLD hub -> Node 1, Node 2, ... Node N-1, alongside the user's design (core logic)]

On-chip Debug Tools in Quartus II Software
- SignalTap II logic analyzer
  - Captures and displays hardware events, fast turnaround times
  - Incrementally creates trigger conditions and adds signals to view
  - Uses captured data stored in on-chip RAM and the JTAG interface for communication
- In-system memory content editor
  - Displays content of on-chip memory
  - Enables modification of memory content in a running system
- External logic analyzer interface
  - Uses an external logic analyzer to view internal signals
  - Dynamically switches internal signals to output
- In-system sources and probes
  - Stimulate and monitor internal signals without using on-chip RAM
- Exception: the SignalProbe incremental routing feature does not use the JTAG interface (i.e. SLD hub technology)
  - Quickly routes an internal node to a pin for observation

SignalTap II Logic Analyzer
- Provides the most advanced triggering capabilities available in an FPGA-embedded logic analyzer
- Proven to be invaluable in the lab
  - Captures bugs that would take weeks of simulation to uncover
- Has broad customer adoption
Features and benefits:
- An embedded logic analyzer
  - Uses available internal memory
- Probes the state of internal signals without using external equipment or extra I/O pins
- Incremental compilation support
  - Fast turnaround time when adding signals to view
- Advanced triggering for capturing difficult events/transactions
- Power-up trigger support
  - Debug the initialization code
- Megafunction support
  - Optionally, instantiate in HDL

In-system Memory Content Editor
Enables FPGA memory content and design constants to be updated in-system, via the JTAG interface, without recompiling a design or reconfiguring the rest of the FPGA:
- Fault injection into the system
- Update memory while the system is running
- Change the value of coefficients in DSP applications
- Easily perform "what if?" experiments in-system in just seconds
Supports MIF and HEX formats for data interchange.
Megafunctions supported:
- LPM_CONSTANT, LPM_ROM, LPM_RAM_DQ, ALTSYNCRAM (ROM and single-port RAM mode)
[Screenshot: option to enable the memory content editor]

In-system Memory Content Editor
[Screenshot: In-system Memory Content Editor, found under the Tools menu]
Altera So. C Embedded Design Suite
Included in So. C Embedded Design Suite (EDS) Development Studio 5 Altera Edition – Awesome debugger, especially when combined with USB Blaster II Altera So. C FPGA System Trace Macrocells – Application development environment – Streamline system analyzer Hardware Libraries GNU-based bare-metal (EABI) compiler tools U-Boot Root file system to jump start software development Pre-built Linux kernel – http: //www. rocketboards. org for source trees and community access 74
System Development Flow FPGA Design Flow (Hardware Development): Design • Quartus II design software • Qsys system integration tool • Standard RTL flow • Altera and partner IP Simulate • ModelSim, VCS, NCSim, etc. • AMBA-AXI and Avalon bus functional models (BFMs) Debug • SignalTap™ II logic analyzer • System Console Release • Quartus II Programmer • In-system Update Software Design Flow (Software Development): • ARM Development Studio 5 • GNU toolchain • OS/BSP: Linux, VxWorks • Hardware Libraries • Design Examples • GNU, Lauterbach, DS-5 • Flash Programmer
Altera SoC Embedded Design Suite FPGA Design Flow (Hardware Development): Design • Quartus II design software • Qsys system integration tool • Standard RTL flow • Altera and partner IP Simulate • ModelSim, VCS, NCSim, etc. • AMBA-AXI and Avalon bus functional models (BFMs) Debug • SignalTap™ II logic analyzer • System Console Release • Quartus II Programmer • In-system Update HW/SW Handoff Software Design Flow (Software Development), stages Design, Simulate, Debug, Release: • ARM Development Studio 5 • GNU toolchain • OS/BSP: Linux, VxWorks • Hardware Libraries • Design Examples • Virtual Target • FPGA-Adaptive Debugging • GNU, Lauterbach, DS-5 • Flash Programmer
Altera SoC Embedded Design Suite Comprehensive Suite SW Dev Tools Hardware / software handoff tools Linux application development – Yocto Linux build environment – Pre-built binaries for Linux / U-Boot – Work in conjunction with the Community Portal Bare-metal application development – SoC Hardware Libraries – Bare-metal compiler tools FPGA-adaptive debugging – ARM DS-5 Altera Edition Toolkit Design examples Editions: ✓ Free Web Edition ✓ Subscription Edition ✓ Free 30-day Eval [Diagram labels: Hardware-to-Software Handoff, Firmware Development, Linux Application Development, FPGA-Adaptive Debugging]
Hardware-to-Software Handoff Hardware: Qsys system info, SDRAM calibration files, ID / timestamp, HPS IOCSR data Handoff files: system.iswinfo → Preloader Generator → .c & .h source files; system.sopcinfo → Device Tree Generator → Linux Device Tree Software
Hardware / Software Handoff Tools • Allow hardware and software teams to work independently and follow their familiar design flows • Take Altera Quartus® II / Qsys output files and generate handoff files for the software design flow • Device Tree standard specifies hardware connectivity so that Linux kernel can boot up correctly
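To make the device tree idea concrete, here is a minimal Python sketch that renders one memory-mapped peripheral as a device tree source node. It is not the Device Tree Generator from the handoff tools. The helper name `render_dts_node`, the node name, the address and the `compatible` string are all hypothetical, and the `reg` property assumes a parent bus with one address cell and one size cell.

```python
def render_dts_node(name, base, span, compatible, indent="\t"):
    """Return device tree source text for one memory-mapped peripheral node.

    Assumes the parent node uses #address-cells = <1> and #size-cells = <1>.
    """
    lines = [
        f"{name}@{base:x} {{",                        # unit address is lowercase hex
        f'{indent}compatible = "{compatible}";',      # string the kernel matches a driver on
        f"{indent}reg = <0x{base:08x} 0x{span:x}>;",  # base address and size in bytes
        "};",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical FPGA mailbox peripheral, for illustration only.
    print(render_dts_node("mailbox", 0xFF200000, 0x10, "example,mailbox-1.0"))
```

The point of the sketch is the division of labor: the hardware flow knows the base address and span, and the device tree carries that information to the kernel so drivers do not hard-code it.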
Linux Application Development Yocto build support for Linux – Yocto standard enables open, versatile, and cost-effective embedded software development – Allows a smooth transition to commercial Linux distributions Pre-built Linux kernel, U-Boot, and root file system to jump start software development – Link to community portal for source trees and community access
Bare-metal Application Development Hardware Libraries – Software interface to all system registers – Functions to configure some basic system operations (e.g. clock speed settings, cache settings, FPGA configuration, etc.) – Support board bring-up and diagnostics development – Can be used by bare-metal application, device drivers, or RTOS GNU-based bare-metal (EABI) compiler tools [Diagram labels: Application, Operating System, Bare-metal App, BSP, Hardware Libraries, BMAL, HAL, PAL, SoC FPGA]
Golden System Reference Design Complete system design with Linux software support – Simple custom logic design in FPGA – All source code and Quartus II / Qsys design files for reference – Includes all compiled binaries; the example can run on an Altera SoC Development Kit to jumpstart development
DS-5 Altera Edition: One Tool, Three Usages 1. JTAG-Based Debugging • Board Bring-up • OS porting, Drivers Dev, Kernel Debug 2. Application Debugging • Linux User Space Code • RTOS App Code 3. FPGA-Adaptive Debugging • System Integration • System Debug
One Device, Two Debugging Tools? ARM® DS-5™ Toolkit (JTAG via DSTREAM™) • Dedicated JTAG connection • Visualize & control CPU subsystem Altera Quartus™ II Software (JTAG) • Dedicated JTAG connection • Visualize & control FPGA
One Device, Two Debugging Tools? ARM® DS-5™ Toolkit (JTAG via DSTREAM™) • Dedicated JTAG connection • Visualize & control CPU subsystem Altera Quartus™ II Software (JTAG) • Dedicated JTAG connection • Visualize & control FPGA Debugging Barrier • No single tool/cable to visualize and control both CPU and FPGA domains • No way for CPU and FPGA to cross trigger and correlate hardware and software events • No “fixed” debugger can address the needs of “changing” FPGA hardware
Industry First: FPGA-Adaptive Debugging Altera USB-Blaster™ II Connection ARM® Development Studio 5 (DS-5™) Altera® Edition Toolkit Removes debugging barrier between CPUs and FPGA Exclusive OEM agreement between Altera and ARM Result of innovation in silicon, software, and business model
FPGA-Adaptive Debugging Features Single USB-Blaster II cable for simultaneous SW and HW debug Automatic discovery of FPGA peripherals and creation of register views Hardware cross-triggering between the CPU and FPGA domains Correlation of CPU software instructions and FPGA hardware events Simultaneous debug and trace for Cortex-A9 cores and CoreSight™-compliant cores in FPGA Statistical analysis of software load and bus traffic spanning the CPUs and FPGA
DS-5 Altera Edition Productivity-Boosting Features Industry’s most advanced multicore debugger for ARM JTAG based system-level debugging, gdbserver-based application debugging in one package Yocto plugin to enable Linux based application development Integrated OS-aware analysis and debug capability
Visualization of SoC Peripherals Register views assist the debug of FPGA peripherals – File generated by FPGA tool flow – Automatically imported in DS-5 Debugger Debug views for debug of software drivers – Self-documenting – Grouped by peripheral, register and bit-field CMSIS Peripheral register descriptions
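A register view grouped by bit-field amounts to decoding a raw register word against a field description. The Python below is a minimal sketch of that decoding step only. It is not how the DS-5 debugger implements its views, and the field names and layout are hypothetical.

```python
from collections import namedtuple

# One bit-field: its name, the position of its least significant bit, and its width.
BitField = namedtuple("BitField", "name lsb width")

# Hypothetical layout of a mailbox status register, for illustration only.
STATUS_FIELDS = [
    BitField("full", 0, 1),
    BitField("pending", 1, 1),
    BitField("sender_id", 4, 4),
]


def decode_register(value, fields):
    """Split a raw register value into named bit-field values."""
    return {
        field.name: (value >> field.lsb) & ((1 << field.width) - 1)
        for field in fields
    }


if __name__ == "__main__":
    for name, field_value in decode_register(0x32, STATUS_FIELDS).items():
        print(f"{name} = {field_value}")
```

Keeping the field description as data, separate from the decoding function, mirrors the slide's point that the description file comes from the FPGA tool flow and the debugger only consumes it.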
FPGA-Adaptive, Unified Debugging FPGA connected to debug and trace buses for nonintrusive capture and visualization of signal events Simultaneous debug and trace connection to CPU cores and compatible IP Correlate FPGA signal events with software events and CPU instruction trace using triggers and timestamps
Cross-Domain Debug 1 Trigger from software world to FPGA world SOFTWARE TRIGGER HARDWARE TRIGGER!
Cross-Domain Debug 2 Trigger from FPGA world to software world HARDWARE TRIGGER EXECUTION STOP OR HW TRACE TRIGGER EXECUTION STOP OR SW TRACE TRIGGER
Correlate HW and SW Events Debug event trigger point set from either: SignalTap™ II Logic Analyzer or DS-5 debugger Captured trace can then be analyzed using timestamp-correlated events [Figure labels: ARM® DS-5™ Toolkit, Timestamp Correlated, SignalTap II Logic Analyzer]
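Timestamp correlation can be pictured as merging two event streams onto one timeline. The Python below is a minimal sketch under the assumption that both streams are already sorted and share a common time base. The event descriptions and the function name `correlate` are invented for illustration, and this is not a description of how DS-5 or SignalTap II perform the correlation.

```python
import heapq


def correlate(sw_events, hw_events):
    """Merge two timestamp-sorted (timestamp, description) lists into one timeline.

    Each output entry is (timestamp, domain, description), with domain "SW" or "HW".
    """
    tagged_sw = ((stamp, "SW", text) for stamp, text in sw_events)
    tagged_hw = ((stamp, "HW", text) for stamp, text in hw_events)
    # heapq.merge keeps the combined stream ordered by timestamp.
    return list(heapq.merge(tagged_sw, tagged_hw))


if __name__ == "__main__":
    # Hypothetical trace data, for illustration only.
    software = [(100, "enter irq handler"), (250, "write mailbox")]
    hardware = [(90, "irq asserted"), (260, "mailbox full flag set")]
    for stamp, domain, text in correlate(software, hardware):
        print(f"{stamp:6d} {domain} {text}")
```

Reading the merged list top to bottom shows which hardware event preceded which software event, which is the kind of question the correlated trace view is meant to answer.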
System-Level Performance Analysis Performance bottlenecks in SoCs often come from the CPU interaction with the rest of the SoC Streamline visualizes software activity with performance counters from the SoC and FPGA to enable full system-level analysis Streamline only requires a TCP/IP connection to the SoC [Figure: ARM® DS-5™ Streamline views: Linux OS Counters; Processor Counters, Aggregated or Per Core; Power Consumption; FPGA Block Counters; Process/Thread Heat Map; Application Events]
Altera SoC EDS: Key Benefits One-stop shop from Altera All the tools and examples for rapid starts Familiar tools interface, easy to use Share tools and knowledge to increase team productivity Best multicore debugger tools for ARM architecture Unprecedented visibility and control across processor cores and across CPU, FPGA domains Faster time to market, lower development costs!
Target Users and Usages Web Edition Board Bring-up Yes Device Drivers Dev Yes OS Porting Yes Bare-metal Programming Yes RTOS Based App Dev Yes Linux Based App Dev Subscription Edition Yes Multicore App Debugging Yes System Debugging Yes
SoC EDS Editions Summary Component Hardware/Software Handoff Tools ARM DS-5 Altera Edition Web Edition Subscription Edition 30-Day Evaluation Preloader Image Generator x x x Flash Image Creator x x x Device Tree Generator (Linux) x x x Eclipse IDE x x x Key Feature ARM Compiler* Debugging over Ethernet (Linux) x x x Debugging over USB-Blaster II JTAG x x Automatic FPGA Register Views x x Hardware Cross-triggering x x CPU/FPGA Event Correlation x x x CodeBench Lite EABI (Bare-metal) x x x Hardware Libraries Bare-metal programming Support x x x SoC Programming Examples Golden System Reference Design x x x Compiler Tool Chains Linaro Tool Chain (Linux) x *ARM Compiler is available in DS-5 Professional Edition, available directly from ARM
Coordinated Multi-Channel Delivery Altera.com Quartus II Programmer SignalTap II Altera.com RocketBoards.org Pre-built Binaries • Kernel • U-Boot • Yocto • Minimal RFS • Tool chains • Handoff tools • HW Libraries • Examples • Documentation Frequent Updates • Kernel source • U-Boot source • Yocto source • RFS source • Toolchain source • Public git • Wiki • Mailman Partners BSPs Middleware 3rd Party Tools
Altera NIOS Software Design Tools Nios II SBT for Eclipse key features: – New project wizards and software templates – Compiler for C and C++ (GNU) – Source navigator, editor, and debugger – Eclipse project-based tools
- Slides: 99