ECE 448: Lab 6 Using PicoBlaze – Fast Sorting
Agenda for today
Part 1: Introduction to Lab 6
Part 2: Instruction Set of PicoBlaze-6
Part 3: Hands-on Session: FIDEx IDE
Part 4: Lab 6 Exercise 1
Part 5: Lab 5 Demos
Part 1 Introduction to Lab 6 ECE 448 – FPGA and ASIC Design with VHDL
Sources
• P. Chu, FPGA Prototyping by VHDL Examples
  Chapter 14, PicoBlaze Overview
  Chapter 15, PicoBlaze Assembly Code Development
  Chapter 16, PicoBlaze I/O Interface
  Chapter 17, PicoBlaze Interrupt Interface
• K. Chapman, PicoBlaze for Spartan-6, Virtex-6, 7-Series, Zynq and UltraScale Devices (KCPSM6)
[Block diagram. Labeled blocks: PICOBLAZE (instruction, address, in_port, out_port, port_id, read_strobe, write_strobe, interrupt, interrupt_ack), INSTRUCTION RAM, DATA RAM (A[7..0], DI[7..0], DO[7..0], RAM_wen), INPUT_INTERFACE (Buttons, Switches; registers BUTTON, SWITCH), PRNG_STATUS and PRNG_CTRL registers, CYCLE COUNTER & OUTPUT_INTERFACE (CCOUNT, SSD3, SSD2, SSD1, SSD0, LED; four 7-segment displays; Switch S7), MEM_BANK (A[8], DO[0]), and ADDR_DECODER (A[8..0]) generating SSD3_en, SSD2_en, SSD1_en, SSD0_en, CCOUNT_en, LED_en, MEM_BANK_en, PRNG_CTRL_en, RAM_wen]
Memory map
Addresses shown: 000, 001, 002, ..., 0FE, 0FF, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, ..., 1FE, 1FF
Starting at 000: 255 x 8 DATA RAM
Registers, in the order listed: MEM_BANK, BUTTON, SSD3, SSD2, SSD1, SSD0, LED, PRNG_STATUS, PRNG_CTRL, SWITCH, CCOUNT, MEM_BANK

Register formats (bits 7..0)
MEM_BANK: field A8
  A8 – current memory bank number = the most significant bit of the address
BUTTON: fields A, BS, BR, BL, BU, BD
  A – button active (bit cleared by reading register BUTTON or by interrupt_ack)
  BS – Select, BR – Right, BL – Left, BU – Up, BD – Down
PRNG_STATUS: field D
  D – done: bit cleared by writing to register PRNG_CTRL, set after PRNG generates 255 8-bit numbers
PRNG_CTRL: field I
  I – initialize: after 1 is written to this bit, PRNG generates 255 8-bit numbers, and the corresponding address (index) of each number
SWITCH: fields S7, S6, S5, S4, S3, S2, S1, S0
  S7-S0 – bits corresponding to the state of each switch
CCOUNT: fields D, S, R
  R – reset the 64-bit Cycle Counter, and start counting clock cycles
  S – stop the Cycle Counter
  D – display the Cycle Counter (Switch S7 chooses between displaying the Least Significant and the Most Significant Word)
LED: fields L7, L6, L5, L4, L3, L2, L1, L0
  L7-L0 – bits corresponding to the status of each LED
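The bank bit A8 described in the memory map can be pictured with a small Python model. This is only a sketch under the assumption that the 9-bit address A[8..0] is formed by placing A8 above the 8-bit port_id; the function name full_address is hypothetical and not part of the lab materials.

```python
def full_address(mem_bank: int, port_id: int) -> int:
    """Form a 9-bit address from the bank bit (A8) and an 8-bit port_id.

    Assumption: A8 is simply concatenated above port_id, as suggested by
    "A8 = the most significant bit of the address".
    """
    if mem_bank not in (0, 1):
        raise ValueError("mem_bank must be 0 or 1")
    if not 0 <= port_id <= 0xFF:
        raise ValueError("port_id must fit in 8 bits")
    return (mem_bank << 8) | port_id


if __name__ == "__main__":
    # Same port_id, different bank bit, different 9-bit address.
    print(f"{full_address(0, 0x05):03X}")
    print(f"{full_address(1, 0x05):03X}")
```

Under this assumption, the same port_id value reaches two different locations depending on the bank bit.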
Task 1 – Browsing Mode (default mode)
• SSD3-SSD2 (two 7-segment displays, in hexadecimal notation): Current Address
• SSD1-SSD0 (two 7-segment displays, in hexadecimal notation): Value at Current Address
• Button Up = Increment Address
• Button Down = Decrement Address
• 255 x 8 RAM, addresses 00 01 02 03 04 05 … FA FB FC FD FE
Task 2 – Initialize
Button Left = Initialize with Pseudorandom Values. Then, return to the browsing mode.
Example contents of the 255 x 8 RAM:
Address: 00 01 02 03 04 05 … FA FB FC FD FE
Data:    25 87 94 26 B5 C6 … 7A 5B 34 43 89
8-bit LCG (Linear Congruential Generator) with the period of $2^8 - 1$
$$R_{n+1} = a \cdot R_n + c \pmod{m}$$
where R is the sequence of pseudorandom values, a is the multiplier, c is the increment, and m is the modulus. $R_0$ is the initial seed value. The LCG generates one output per clock cycle.
Task 3 – Sorting
Sorting signed numbers in the descending order.
Example contents of the 255 x 8 RAM after sorting:
Address: 00 01 02 03 04 05 … FA FB FC FD FE
Data:    7F 67 53 44 38 2D … B1 AA 91 80
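The ordering required in Task 3 can be checked against a short Python reference model. This is a sketch, not the lab solution: it treats each byte as a two's complement value, and the helper names to_signed, signed_key, and sort_signed_descending are hypothetical.

```python
def to_signed(byte: int) -> int:
    """Interpret an 8-bit value as a two's complement signed number."""
    return byte - 256 if byte >= 0x80 else byte


def signed_key(byte: int) -> int:
    """Flip bit 7 so that an unsigned comparison yields the signed order."""
    return byte ^ 0x80


def sort_signed_descending(data):
    """Return the bytes sorted as signed numbers, largest first."""
    return sorted(data, key=to_signed, reverse=True)


if __name__ == "__main__":
    sample = [0x80, 0x2D, 0xB1, 0x7F, 0x91, 0x44, 0xAA, 0x67, 0x38, 0x53]
    print(" ".join(f"{b:02X}" for b in sort_signed_descending(sample)))
```

Because COMPARE performs an unsigned subtraction, one possible approach in assembly is to flip bit 7 of both operands before comparing, which is what signed_key models.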
Task 4 – Cycle Count Display Mode
During Sorting display: “----” on the Seven Segment Displays.
After Sorting display: Number of clock cycles used (in the hexadecimal notation)
  #Cycles 15..0 – 16 least significant bits
  #Cycles 31..16 – 16 most significant bits
Switch between these two values using switch S7:
  S7=0 : 16 least significant bits
  S7=1 : 16 most significant bits
Pressing any button (other than Select) after sorting brings the display back to the browsing mode.
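The word selection described for Task 4 can be sketched in Python as follows. The function name display_word is hypothetical; the model only shows how switch S7 picks one of the two 16-bit words and how that word would read as four hexadecimal digits.

```python
def display_word(cycles: int, s7: int) -> str:
    """Return the four hex digits shown on the displays for a cycle count.

    s7 = 0 selects #Cycles 15..0, s7 = 1 selects #Cycles 31..16.
    """
    word = (cycles >> 16) & 0xFFFF if s7 else cycles & 0xFFFF
    return f"{word:04X}"


if __name__ == "__main__":
    count = 0x0001ABCD
    print(display_word(count, 0))
    print(display_word(count, 1))
```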
Task 5 (Bonus) – Interrupts
• Modify your circuit in such a way that it generates an interrupt each time any button is pressed
• Modify your assembly language program accordingly, by replacing polling with an interrupt service routine
• Consider using Register Bank switching in your interrupt service routine (if appropriate)
Contest for the Fastest Implementation of Sorting
Bonus points will be awarded to students who perform sorting (correctly) using the smallest number of clock cycles.
Possible optimizations:
• Faster sorting algorithms in software
• Efficient assembly language implementation
• Faster sorting algorithms in hardware
• Efficient hardware implementation
Part 2 Instruction Set of PicoBlaze-6
PicoBlaze-3 Programming Model
PicoBlaze-6 Programming Model
[Figure labels: Bank A, Bank B; addresses FFC, FFD, FFE, FFF]
Syntax and Terminology
Syntax        Example        Definition
sX            s7             Value at register 7
KK            ab             Value ab (in hex)
PORT(KK)      PORT(2)        Input value from port 2
PORT((sX))    PORT((sa))     Input value from port specified by register a
RAM(KK)       RAM(4)         Value from RAM location 4
Addressing modes
Immediate mode
  SUB s7, 07          s7 – 07 => s7
  ADDCY s2, 08        s2 + 08 + C => s2
Direct mode
  ADD sa, sf          sa + sf => sa
  INPUT s5, 2a        PORT(2a) => s5
Indirect mode
  STORE s3, (sa)      s3 => RAM((sa))
  INPUT s9, (s2)      PORT((s2)) => s9
Arithmetic Instructions (1)
Addition (flags affected: C, Z; addressing modes: IMM, DIR)
  ADD sX, sY          sX + sY => sX
  ADD sX, KK          sX + KK => sX
  ADDCY sX, sY        sX + sY + CARRY => sX
  ADDCY sX, KK        sX + KK + CARRY => sX
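The reason for having both ADD and ADDCY is multi-byte arithmetic. The Python sketch below models only the 8-bit result and the carry flag, and the helper names add8 and add16 are hypothetical: ADD is applied to the low bytes and ADDCY to the high bytes.

```python
def add8(x: int, y: int, carry_in: int = 0):
    """Model an 8-bit add; returns (8-bit result, carry out)."""
    total = (x & 0xFF) + (y & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add16(a: int, b: int):
    """16-bit addition built from two 8-bit steps."""
    low, carry = add8(a & 0xFF, b & 0xFF)            # ADD   on the low bytes
    high, carry = add8(a >> 8, b >> 8, carry)        # ADDCY on the high bytes
    return (high << 8) | low, carry


if __name__ == "__main__":
    result, carry = add16(0x12FF, 0x0001)
    print(f"{result:04X} carry={carry}")
```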
Arithmetic Instructions (2)
Subtraction (flags affected: C, Z; addressing modes: IMM, DIR)
  SUB sX, sY          sX – sY => sX
  SUB sX, KK          sX – KK => sX
  SUBCY sX, sY        sX – sY – CARRY => sX
  SUBCY sX, KK        sX – KK – CARRY => sX
Test and Compare Instructions
TEST (flags affected: C, Z; addressing modes: IMM, DIR)
  TEST sX, sY         sX and sY => none
  TEST sX, KK         sX and KK => none
  C = odd parity of the result
COMPARE (flags affected: C, Z; addressing modes: IMM, DIR)
  COMPARE sX, sY      sX – sY => none
  COMPARE sX, KK      sX – KK => none
Data Movement Instructions (1)
LOAD (flags affected: none; addressing modes: IMM, DIR)
  LOAD sX, sY         sY => sX
  LOAD sX, KK         KK => sX
Data Movement Instructions (2)
STORE (flags affected: none; addressing modes: DIR, IND)
  STORE sX, KK        sX => RAM(KK)
  STORE sX, (sY)      sX => RAM((sY))
FETCH (flags affected: none; addressing modes: DIR, IND)
  FETCH sX, KK        RAM(KK) => sX
  FETCH sX, (sY)      RAM((sY)) => sX
Example 1: Clear Data RAM
; =============================
; routine: clr_data_mem
; function: clear data ram
; temp register: s0 (data), s2 (loop index)
; =============================
clr_data_mem:
    load s2, 40             ; initialize loop index to 64
    load s0, 00
clr_mem_loop:
    store s0, (s2)
    sub s2, 01              ; dec loop index
    jump nz, clr_mem_loop   ; repeat until s2=0
    return
Data Movement Instructions (3)
INPUT (flags affected: none; addressing modes: DIR, IND)
  INPUT sX, KK        sX <= PORT(KK)
  INPUT sX, (sY)      sX <= PORT((sY))
OUTPUT (flags affected: none; addressing modes: DIR, IND)
  OUTPUT sX, KK       PORT(KK) <= sX
  OUTPUT sX, (sY)     PORT((sY)) <= sX
Edit instructions - Shifts *All shift instructions affect Zero and Carry flags
Edit instructions - Rotations *All rotate instructions affect Zero and Carry flags
Program Flow Control Instructions (1)
  JUMP AAA            PC <= AAA
  JUMP C, AAA         if C=1 then PC <= AAA else PC <= PC + 1
  JUMP NC, AAA        if C=0 then PC <= AAA else PC <= PC + 1
  JUMP Z, AAA         if Z=1 then PC <= AAA else PC <= PC + 1
  JUMP NZ, AAA        if Z=0 then PC <= AAA else PC <= PC + 1
Program Flow Control Instructions (2)
  CALL AAA            TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA
  CALL C | Z, AAA     if C | Z = 1 then TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA else PC <= PC + 1
  CALL NC | NZ, AAA   if C | Z = 0 then TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA else PC <= PC + 1
Program Flow Control Instructions (3)
  RETURN              PC <= STACK[TOS] + 1; TOS <= TOS - 1
  RETURN C | Z        if C | Z = 1 then PC <= STACK[TOS] + 1; TOS <= TOS - 1 else PC <= PC + 1
  RETURN NC | NZ      if C | Z = 0 then PC <= STACK[TOS] + 1; TOS <= TOS - 1 else PC <= PC + 1
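The unconditional CALL and RETURN transfers listed above can be traced with a minimal Python model. It is a sketch that follows the register-transfer notation literally (CALL saves the current PC, RETURN resumes at the saved PC plus one); the class name FlowModel is hypothetical, and the finite depth of the hardware stack is not modeled.

```python
class FlowModel:
    """Program counter plus call/return stack, following the slide notation."""

    def __init__(self, pc: int = 0):
        self.pc = pc
        self.stack = []

    def call(self, target: int) -> None:
        # TOS <= TOS+1; STACK[TOS] <= PC; PC <= AAA
        self.stack.append(self.pc)
        self.pc = target

    def ret(self) -> None:
        # PC <= STACK[TOS] + 1; TOS <= TOS - 1
        self.pc = self.stack.pop() + 1


if __name__ == "__main__":
    cpu = FlowModel(pc=0x010)
    cpu.call(0x100)
    print(f"{cpu.pc:03X}")
    cpu.ret()
    print(f"{cpu.pc:03X}")
```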
Subroutine Call Flow
Part 3 Hands-on Session: FIDEx IDE
KCPSM6 Assembler (book, Xilinx download): KCPSM6.EXE
Differences between Mnemonics of Instructions (book, FIDEx IDE)
Differences between Mnemonics of Instructions
Numeric Formats (FIDEx IDE)
Hexadecimal: 0x3A
Binary: B00111010
Octal: 72
Decimal: 58
Assembler Directives (FIDEx IDE)
#EQU yourConstant, 0x3A    ; defines your constant
#EQU yourRegName, s0       ; renames a PicoBlaze register
#ORG ADDR, n               ; sets the memory address of the following instruction to n
#DEFINE case0              ; chooses among multiple program variants
#IFDEF case0
…
#ELSEIF case1 | case2
...
#ENDIF
Example & Demo of Tools
Part 4 Lab 6 Exercise 1
Linear Congruential Generator (LCG)
• Develop an assembly language implementation of a Linear Congruential Generator (LCG) producing a sequence of 8-bit pseudo-random numbers.
• Use FIDEx IDE to debug and simulate your program.
• Recurrence relation: $R_{n+1} = a \cdot R_n + c \pmod{m}$, where
  Ø $m = 2^8$
  Ø a = 0x11; c = 0x9D; $R_0$ = 0xD7
• Additionally, assume that * represents an unsigned multiplication
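A software reference model is handy for checking the assembly program in the simulator. The Python sketch below evaluates the recurrence with the constants given above; the names lcg_next and lcg_sequence are hypothetical, and listing R0 as the first element of the sequence is an assumption about how the outputs are counted.

```python
A = 0x11       # multiplier a
C = 0x9D       # increment c
M = 1 << 8     # modulus m = 2^8
SEED = 0xD7    # initial value R0


def lcg_next(r: int) -> int:
    """One step of R(n+1) = a * R(n) + c (mod m), unsigned arithmetic."""
    return (A * r + C) % M


def lcg_sequence(count: int, seed: int = SEED):
    """Return [R0, R1, ..., R(count-1)]."""
    values = []
    r = seed
    for _ in range(count):
        values.append(r)
        r = lcg_next(r)
    return values


if __name__ == "__main__":
    print(" ".join(f"{v:02X}" for v in lcg_sequence(8)))
```

Since m = 2^8, the reduction modulo m amounts to keeping only the lower byte of the 16-bit product before adding c and again after the addition.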
Notation
a    Multiplicand        $a_{k-1} a_{k-2} \dots a_1 a_0$
x    Multiplier          $x_{k-1} x_{k-2} \dots x_1 x_0$
p    Product (a * x)     $p_{2k-1} p_{2k-2} \dots p_2 p_1 p_0$
Multiplication of two 4-bit unsigned binary numbers
[Figure: Partial Product 0, Partial Product 1, Partial Product 2, Partial Product 3]
Unsigned Multiplication – Basic Equations
$$p = a \cdot x, \qquad x = \sum_{i=0}^{k-1} x_i \cdot 2^i$$
$$p = a \cdot x = \sum_{i=0}^{k-1} a \cdot x_i \cdot 2^i = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_{k-1} a 2^{k-1}$$
Iterative Algorithm for Unsigned Multiplication: Shift/Add Algorithm
$$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_{k-1} a 2^{k-1} = \underbrace{(\dots((0 + x_0 a 2^k)/2 + x_1 a 2^k)/2 + \dots + x_{k-1} a 2^k)/2}_{k \text{ times}}$$
$$p^{(0)} = 0$$
$$p^{(j+1)} = (p^{(j)} + x_j a 2^k) / 2, \qquad j = 0 \dots k-1$$
$$p = p^{(k)}$$
Iterative Algorithm for Unsigned Multiplication: Shift/Add Algorithm (k = 8)
$$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \dots + x_7 a 2^7 = \underbrace{(\dots((0 + x_0 a 2^8)/2 + x_1 a 2^8)/2 + \dots + x_7 a 2^8)/2}_{8 \text{ times}}$$
$$p^{(0)} = 0$$
$$p^{(j+1)} = (p^{(j)} + x_j a 2^8) / 2, \qquad j = 0 \dots 7$$
$$p = p^{(8)}$$
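The shift/add recurrence can be checked numerically with a few lines of Python. This is a sketch that follows the formula directly, with no attention to register widths; the function name shift_add_multiply is hypothetical.

```python
def shift_add_multiply(a: int, x: int, k: int = 8) -> int:
    """Evaluate p(j+1) = (p(j) + x_j * a * 2^k) / 2 for j = 0..k-1."""
    p = 0
    for j in range(k):
        x_j = (x >> j) & 1
        # The sum is always even for k-bit operands, so // is exact.
        p = (p + x_j * a * (1 << k)) // 2
    return p


if __name__ == "__main__":
    print(hex(shift_add_multiply(0x11, 0xD7)), hex(0x11 * 0xD7))
```

Adding the partial product at weight 2^k and halving after every step keeps the addition in the same bit positions each time, which is what allows a fixed 8-bit adder on the upper byte.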
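Unsigned Multiplication – Computation
[Figure: 8-bit registers pH and pL hold p(j); x_j a is added to pH to form 2 p(j+1) = p(j) + x_j a 2^8, with the carry in C; then C, pH, pL are shifted right by 1 (>> 1) to give p(j+1)]
PicoBlaze Registers: a = s3, x = s4, pH = s5, pL = s6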
Unsigned Multiplication Subroutine (1)
; =========================
; routine: mult_soft
; function: 8-bit unsigned multiplier using
;           shift-and-add algorithm
; input register:
;   s3: multiplicand
;   s4: multiplier
; output register:
;   s5: upper byte of product
;   s6: lower byte of product
; temporary register:
;   s2: index j
; =========================
Unsigned Multiplication Subroutine (2)
mult_soft:
    load s5, 00             ; clear pH
    load s2, 08             ; initialize loop index
mult_loop:
    sr0 s4                  ; shift lsb of x to carry
    jump nc, shift_prod     ; x_j is 0
    add s5, s3              ; x_j is 1, pH=pH+a
shift_prod:
    sra s5                  ; shift upper byte pH right,
                            ; carry to MSB, LSB to carry
    sra s6                  ; shift lower byte pL right,
                            ; lsb of pH to MSB of pL
    sub s2, 01              ; dec loop index
    jump nz, mult_loop      ; repeat until s2=0
    return
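To see how the carry flag links the two shifts, the mult_soft listing can be mirrored step by step in Python. This is a sketch: the local variable names follow the register assignment of the subroutine, and the carry is modeled as a plain integer rather than a processor flag.

```python
def mult_soft(s3: int, s4: int):
    """Mirror of mult_soft: returns (s5, s6) = (upper, lower) product byte."""
    s5 = 0                          # load s5, 00
    s6 = 0                          # pL start value; it is shifted out anyway
    for _ in range(8):              # s2 counts 8 iterations
        carry = s4 & 1              # sr0 s4: lsb of x goes to carry
        s4 >>= 1
        if carry:                   # jump nc, shift_prod is not taken
            total = s5 + s3         # add s5, s3
            s5 = total & 0xFF
            carry = total >> 8
        lsb = s5 & 1                # sra s5: carry to MSB, LSB to carry
        s5 = (carry << 7) | (s5 >> 1)
        carry = lsb
        s6 = (carry << 7) | (s6 >> 1)   # sra s6: lsb of pH to MSB of pL
    return s5, s6


if __name__ == "__main__":
    high, low = mult_soft(0x11, 0xD7)
    print(f"{high:02X}{low:02X}")
```

For the LCG exercise only the lower byte (s6) of a * R is needed, because the modulus is 2^8.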
Part 5 Lab 5 Demos