ECE 448: Lab 5
Using FPro SoC with Hardware Accelerators: Fast Sorting (cont.)
SORTING: Design Based on Block Memory with Synchronous Read
Basic MMIO I/O Core Construction
• Design the custom digital circuit
• Determine the I/O register map for the slot interface
• Derive the wrapping circuit
• Develop the hardware
• Develop the software driver
• Develop an application based on the driver
Pseudocode
    wait for s=1
    for i=0 to k-2 do
        for j=i+1 to k-1 do
            if Mi > Mj then
                Swap Mi with Mj
            end if
        end for
    end for
    Done
    wait for s=0
    go to the beginning
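The pseudocode above can be mirrored in software, which is also one way to obtain the "simple software function" needed as a reference during testing. The following is a minimal sketch in C, assuming 8-bit unsigned memory elements; the function name sort_model is hypothetical and is not part of the lab files.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of the sorting pseudocode (hypothetical helper).
   For every i, M[i] is compared with each later element M[j] and the
   two are swapped when M[i] > M[j], giving ascending order. */
void sort_model(uint8_t *m, size_t k)
{
    if (k < 2) {
        return;                      /* nothing to sort; also avoids k-2 underflow */
    }
    for (size_t i = 0; i <= k - 2; i++) {
        for (size_t j = i + 1; j <= k - 1; j++) {
            if (m[i] > m[j]) {       /* Mi > Mj: swap Mi with Mj */
                uint8_t tmp = m[i];
                m[i] = m[j];
                m[j] = tmp;
            }
        }
    }
}
```

The loop bounds are kept identical to the pseudocode (i from 0 to k-2, j from i+1 to k-1) so that the software model and the hardware perform the same sequence of comparisons.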
Block Diagram (figure; signals shown: Li, Ei, Lj, Ej, DataOut)
ASM Chart (figure; states S0 to S5, reset by Resetn; labels shown: i=0, s, j=i+1, Read Mi, Read Mj, Mi>Mj, Swap Mi with Mj, j++, i++, j=k-1, i=k-2, Done)
Reg 2, Reg 3 (register-map figure)
N=8 when the number of memory elements to sort is 2^L, where L=4..8
N=16 when the number of memory elements to sort is 2^L, where L=9..16
Suggested memory sizes:
• Debugging: 2^4 x 8
• Demo: 2^8 x 8
Entity Declaration of the MMIO Sorting Core
MMIO Slot Interface Specification
A slot is a 32-word (2^5-word) memory module. The slot interface is defined as follows:
• addr (bus to core). A 5-bit address signal used to identify the 32-bit destination I/O register within the core.
• rd_data (core to bus). A 32-bit signal carrying the read data.
• wr_data (bus to core). A 32-bit signal carrying the write data.
• read (bus to core). A 1-bit control signal activated with the read operation.
• write (bus to core). A 1-bit control signal to enable the register write.
• cs (bus to core). A 1-bit enable (i.e., "chip select") signal to select and activate the core.
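Because a slot holds 2^5 registers of 32 bits each, the address of any register follows from the slot number and the 5-bit register offset. The sketch below illustrates that arithmetic in C. It assumes a byte-addressed processor bus and slots placed back to back starting at an MMIO base address; the helper names and the base value used in the usage note are hypothetical, not taken from the lab files.

```c
#include <stdint.h>

#define REGS_PER_SLOT 32u   /* 2^5 registers, selected by the 5-bit addr signal */
#define BYTES_PER_REG 4u    /* each register is 32 bits wide */

/* Byte address of register 0 of the given slot (hypothetical helper).
   Assumption: slots are contiguous, starting at mmio_base. */
static inline uint32_t slot_base(uint32_t mmio_base, uint32_t slot)
{
    return mmio_base + slot * REGS_PER_SLOT * BYTES_PER_REG;
}

/* Byte address of one register inside a slot (hypothetical helper).
   Only the five low bits of offset are meaningful, matching addr. */
static inline uint32_t reg_addr(uint32_t slot_base_addr, uint32_t offset)
{
    return slot_base_addr + (offset & 0x1Fu) * BYTES_PER_REG;
}
```

For example, with an assumed base of 0xC0000000, slot 4 would start at 0xC0000200, and register 3 of that slot would sit at 0xC000020C.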
Datapath (figure; labels shown: wr_data[1..0], wr_data[15..0], En_datain, En_dataout = cs · read · (not addr(0)), num_values, Reg 0, Reg 1, Reg 3, ctrl_reg(0) = go, ctrl_reg(1) = clr, counteri, countera, counterj, i, j, WrInit, comparisons with k-2 and k-1)
Interface with the Division into the Datapath and Controller (figure; ports: wr_data (32 bits), addr (5 bits), cs, read, write, clk, reset, rd_data (32 bits); signals between Datapath and Controller: ctrl_reg(0) = go, zi, zj, AgtB, Wr, Li, Ei, Lj, Ej, done)
Testing Routine (1)
1. Initialize 256 locations of two arrays, e.g., sw_data and hw_data, with identical contents composed of random 8-bit unsigned integers
2. Sort sw_data using a simple software function
3. Set the clr flag of Reg 1 of the Sorting unit to 1
4. Set the clr flag of Reg 1 of the Sorting unit to 0
5. Transfer data from the array hw_data to the Sorting unit
Testing Routine (2)
6. Set the go flag of Reg 1 of the Sorting unit to 1
7. Wait until the done flag of Reg 3 is equal to 1
8. Set the go flag of Reg 1 of the Sorting unit to 0
9. Set the clr flag of Reg 1 of the Sorting unit to 1
10. Set the clr flag of Reg 1 of the Sorting unit to 0
11. Transfer data from the Sorting unit to the array hw_data
12. Compare the contents of hw_data and sw_data, and count the number of mismatches
Testing Routine (3)
13. Display the result of the comparison (1 = equal, 0 = not equal) on the first seven-segment display. Display the number of mismatches on the second seven-segment display.
14. Exit or wait for a button to be pressed.
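Step 12 of the routine above can be sketched in C as follows. The function name count_mismatches is hypothetical; the array types follow the slides (8-bit sw_data, 32-bit hw_data). The sketch assumes that the Sorting unit returns zeros in the unused upper bits of each 32-bit word, so the full word is compared.

```c
#include <stddef.h>
#include <stdint.h>

/* Count positions where the hardware result differs from the software
   reference (hypothetical helper for step 12).
   Assumption: the upper 24 bits of each hw_data word read back as 0. */
uint32_t count_mismatches(const uint8_t *sw_data,
                          const uint32_t *hw_data,
                          size_t num_elements)
{
    uint32_t mismatches = 0;
    for (size_t i = 0; i < num_elements; i++) {
        if ((uint32_t)sw_data[i] != hw_data[i]) {
            mismatches++;
        }
    }
    return mismatches;
}
```

The value for the first seven-segment display in step 13 would then be 1 when the returned count is 0, and 0 otherwise.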
Example of C Code (1)
1. Initialize 256 locations of two arrays, e.g., sw_data and hw_data, with identical contents composed of random 8-bit unsigned integers

    #include <stdio.h>
    #include <stdlib.h>
    #include <inttypes.h>

    #define MAX_SIZE 65536

    uint8_t  sw_data[MAX_SIZE];
    uint32_t hw_data[MAX_SIZE];

    int main()
    {
        uint32_t num_elements = 256;
        …
        for (i=0; i<num_elements; i++) {
            sw_data[i] = (uint8_t)(rand()%256);   /* random 8-bit value */
            hw_data[i] = (uint32_t) sw_data[i];   /* same value, widened to 32 bits */
        }
        …
    }
Data Types in inttypes.h
• int8_t: signed 8-bit integer
• uint8_t: unsigned 8-bit integer
• int16_t: signed 16-bit integer
• uint16_t: unsigned 16-bit integer
• int32_t: signed 32-bit integer
• uint32_t: unsigned 32-bit integer
• int64_t: signed 64-bit integer
• uint64_t: unsigned 64-bit integer
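The choice between signed and unsigned types matters when an 8-bit value is widened to a 32-bit bus word. The sketch below, with hypothetical helper names, shows the difference: an unsigned value is zero-extended, while a negative signed value fills the upper bits with ones.

```c
#include <stdint.h>

/* Widening an unsigned 8-bit value: the upper 24 bits become 0. */
uint32_t widen_unsigned(uint8_t v)
{
    return (uint32_t)v;
}

/* Widening a signed 8-bit value: a negative input is converted
   modulo 2^32, so the upper 24 bits become 1. */
uint32_t widen_signed(int8_t v)
{
    return (uint32_t)v;
}
```

With these helpers, widen_unsigned(200) gives 200, while widen_signed(-56), which has the same 8-bit pattern 0xC8, gives 0xFFFFFFC8. Declaring the data as uint8_t avoids this sign extension when values are copied into the 32-bit hw_data array.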
Example of C Code (2a)
3. Set the clr flag of Reg 1 of the Sorting unit to 1
4. Set the clr flag of Reg 1 of the Sorting unit to 0

    #include "chu_io_rw.h"
    #include "chu_io_map.h"

    #define ctrl_clr_go_offset 1   /* Reg 1: control register */
    #define CLR_FIELD 2            /* bit 1 of Reg 1: clr */

    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S4_USER);

    int main()
    {
        …
        io_write(sort_slot_address, ctrl_clr_go_offset, CLR_FIELD);   /* clr = 1 */
        io_write(sort_slot_address, ctrl_clr_go_offset, 0);           /* clr = 0 */
        …
    }
I/O Macros in chu_io_rw.h
Instantiating the sorting core in mmio_sys_[vanilla or user specified].vhd
Using an existing offset for the sorting core in chu_io_map.vhd
Instantiating the sorting core in mmio_sys_[vanilla or user specified].vhd
Adding a new slot offset for the sorting core in chu_io_map.vhd
Slot and constant definitions in chu_io_map.h
Example of C Code (2b)
3. Set the clr flag of Reg 1 of the Sorting unit to 1
4. Set the clr flag of Reg 1 of the Sorting unit to 0

    #include "chu_io_rw.h"
    #include "chu_io_map.h"

    #define ctrl_clr_go_offset 1   /* Reg 1: control register */
    #define CLR_FIELD 2            /* bit 1 of Reg 1: clr */

    /* same as (2a), but using the newly added slot S14_SORT */
    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S14_SORT);

    int main()
    {
        …
        io_write(sort_slot_address, ctrl_clr_go_offset, CLR_FIELD);   /* clr = 1 */
        io_write(sort_slot_address, ctrl_clr_go_offset, 0);           /* clr = 0 */
        …
    }
Example of C Code (3)
5. Transfer data from the array hw_data to the Sorting unit

    #include "chu_io_rw.h"
    #include "chu_io_map.h"

    #define data_in_offset 0   /* Reg 0: data input */
    #define MAX_SIZE 65536

    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S4_USER);
    uint32_t hw_data[MAX_SIZE];

    int main()
    {
        uint32_t num_elements = 256;
        …
        for (i=0; i<num_elements; i++) {
            /* each write to Reg 0 sends one element to the Sorting unit */
            io_write(sort_slot_address, data_in_offset, hw_data[i]);
        }
        …
    }
Example of C Code (4)
6. Set the go flag of Reg 1 of the Sorting unit to 1
7. Wait until the done flag of Reg 3 is equal to 1
8. Set the go flag of Reg 1 of the Sorting unit to 0

    #include "chu_io_rw.h"
    #include "chu_io_map.h"

    #define ctrl_clr_go_offset 1   /* Reg 1: control register */
    #define done_offset 3          /* Reg 3: status register */
    #define GO_FIELD 1             /* bit 0 of Reg 1: go */
    #define DONE_FIELD 1           /* bit 0 of Reg 3: done */

    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S4_USER);

    int main()
    {
        …
        io_write(sort_slot_address, ctrl_clr_go_offset, GO_FIELD);       /* go = 1 */
        while (io_read(sort_slot_address, done_offset) != DONE_FIELD)
            ;                                                            /* poll until done = 1 */
        io_write(sort_slot_address, ctrl_clr_go_offset, 0);              /* go = 0 */
        …
    }
Example of C Code (5)
11. Transfer data from the Sorting unit to the array hw_data

    #include "chu_io_rw.h"
    #include "chu_io_map.h"

    #define data_out_offset 2   /* Reg 2: data output */
    #define MAX_SIZE 65536

    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S4_USER);
    uint32_t hw_data[MAX_SIZE];

    int main()
    {
        uint32_t num_elements = 256;
        …
        for (i=0; i<num_elements; i++) {
            /* each read of Reg 2 returns one element from the Sorting unit */
            hw_data[i] = io_read(sort_slot_address, data_out_offset);
        }
        …
    }
Example of C Code (6)
Reading the log2 of the number of elements to sort using switches

    #include "chu_io_rw.h"
    #include "chu_io_map.h"
    #include "gpio_core.h"

    #define MAX_SIZE 65536

    uint32_t sort_slot_address = get_slot_addr(BRIDGE_BASE, S4_USER);
    uint32_t hw_data[MAX_SIZE];

    int main()
    {
        uint32_t num_elements;
        uint16_t lognum;
        …
        GpiCore sw(get_slot_addr(BRIDGE_BASE, S3_SW));   /* switch input core */
        lognum = sw.read();                              /* L, set on the switches */
        num_elements = 1 << lognum;                      /* number of elements = 2^L */
        …
    }
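Since the switch value is used directly as a shift amount, an out-of-range setting would produce an element count that does not fit the arrays. The following is a minimal sketch of a range check in C, assuming L is encoded on the five lowest switches and that valid values are L=4..16 (the range given for the memory sizes, with 2^16 matching MAX_SIZE). The function name and the choice to return 0 for invalid settings are hypothetical.

```c
#include <stdint.h>

#define MIN_LOG 4u    /* smallest L listed for the memory sizes */
#define MAX_LOG 16u   /* 2^16 = 65536 = MAX_SIZE */

/* Convert a raw switch reading to the number of elements to sort
   (hypothetical helper). Returns 0 when L is outside MIN_LOG..MAX_LOG.
   Assumption: L is set on the five lowest switches; others are ignored. */
uint32_t elements_from_switches(uint16_t sw_value)
{
    uint32_t lognum = sw_value & 0x1Fu;
    if (lognum < MIN_LOG || lognum > MAX_LOG) {
        return 0;
    }
    return (uint32_t)1 << lognum;   /* number of elements = 2^L */
}
```

In the application, the raw value returned by sw.read() could be passed to this helper, and a result of 0 treated as a request to set the switches again.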
GpiCore class definition in gpio_core.h
GpiCore class implementation in gpio_core.cpp
TimerCore class definition in timer_core.h
TimerCore class implementation in timer_core.cpp
Timer Core
The processor interacts with the counter as follows:
• retrieve (i.e., read) the 48-bit counter value
• set or reset (i.e., write) the go signal to resume or pause the counting
• generate (i.e., write) a pulse to clear the counter to 0
Register Map
• offset 0 (lower word of the counter): bits 31 to 0 hold the 32 LSBs of the counter
• offset 1 (upper word of the counter): bits 15 to 0 hold the 16 MSBs of the counter
• offset 2 (control register): bit 0 is the go signal of the counter; bit 1 is the clear signal of the counter
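The register map above implies that the 48-bit counter value has to be assembled from two 32-bit reads. A minimal sketch in C follows; the function and macro names are hypothetical and are not taken from timer_core.h. The two arguments stand for the words read from offset 0 and offset 1.

```c
#include <stdint.h>

#define TIMER_GO_FIELD  0x1u   /* control register, bit 0: go */
#define TIMER_CLR_FIELD 0x2u   /* control register, bit 1: clear */

/* Combine the two counter words into one 48-bit value (hypothetical helper).
   lower_word: value read from offset 0 (32 LSBs of the counter)
   upper_word: value read from offset 1 (only bits 15 to 0 are used) */
uint64_t timer_combine(uint32_t lower_word, uint32_t upper_word)
{
    return ((uint64_t)(upper_word & 0xFFFFu) << 32) | lower_word;
}
```

Because the two words are read at different times, the counter may advance between the reads. A driver could pause the counter by clearing the go bit before reading, or accept the small error. Which approach the actual TimerCore class takes is not stated here.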