Hardware Description Language: Logic Design using Verilog

Tsung-Chu Huang
Dept. of Electronic Eng., National Changhua University of Education
Email: tch@cc.ncue.edu.tw
2015/10/15
HDL, T.-C. Huang / NCUE, Fall 2015

Outline
  • RTL or Behavioral
  • Two Major Phases of HDL
  • Applicative Tools Related to HDL
  • ASM-Based Synthesis using an Example

Which is Better: RTL or Behavior?
Rethink using a binary counter and a Gray counter.

    module Binary(Clk, Q);
      parameter n = 6;               // width assumed from the Q5..Q0 figure
      input          Clk;
      output [n-1:0] Q;
      reg    [n-1:0] Q;
      always @(posedge Clk) Q <= Q + 1;
    endmodule

    module Gray(Clk, Q);
      parameter n = 6;
      input          Clk;
      output [n-1:0] Q;
      wire   [n-1:0] Qb;
      binary_counter U1 (Clk, Qb);   // binary counter, e.g. the Binary module above
      xor_array      U2 (Q, Qb);     // XOR array that turns the binary count into Gray code
    endmodule

(Figure: outputs Q5 ... Q0 driving a low-power channel.)
Regular structure or regular behavior?
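The XOR array in the Gray counter can be modeled in a few lines of C. This is a minimal sketch, assuming the standard reflected Gray code; the function names bin_to_gray and gray_to_bin are hypothetical and do not come from the slides.

```c
#include <stdint.h>

/* Binary count -> reflected Gray code: each output bit is the XOR of two
   adjacent binary bits, which is what an XOR array after a binary
   counter computes. */
uint32_t bin_to_gray(uint32_t b)
{
    return b ^ (b >> 1);
}

/* Gray code -> binary count: fold the higher bits back down with XOR. */
uint32_t gray_to_bin(uint32_t g)
{
    uint32_t b = g;
    while (g >>= 1)
        b ^= g;
    return b;
}
```

Consecutive counts map to code words that differ in exactly one bit (for example 3 -> 2 and 4 -> 6), which is why a Gray counter toggles fewer wires on a low-power channel.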

Two Major Phases
  • Simulation
    1. Usually simulated along the time axis (initial)
    2. With stimuli
  • Synthesis
    1. Continuously exists
       • Gate-level (structural view)
       • RTL ( assign L = f(R); )
    2. Always procedural (behavioral model)

Sequential Circuits
  • Huffman model: combinational circuits with a set of DFFs
    1. The combinational network is a DAG (directed acyclic graph).
    2. It is a 1-cycle FSM with an additional Reset cycle.
    3. It can be expanded into an iterative logic array (ILA), which is a combinational circuit, i.e. a trivial 0-cycle FSM.
    4. Long critical paths can be split by inserting DFFs.

(Figure: relations among sequential circuits, the Huffman model, pure FSM, ASM, datapath, pipelined datapath obtained by critical-path splitting, ILA, and combinational circuits.)

Considering Subtractors

(Figure: subtractor block with inputs A, B, Bin and outputs D, Bout.)

    module Sub(A, B, Bi, D, Bo);
      parameter n = 8;               // width is an assumption; the slide leaves n open
      input  [n-1:0] A, B;
      input          Bi;
      output [n-1:0] D;
      output         Bo;
      assign {Bo, D} = A - B - Bi;   // Bo is the borrow out
    endmodule

  • Boolean-function-based design: too complex
  • Half-subtractor → full-subtractor → n-bit subtractor
  • Parallel design, modified from the adder:
    • S = A + B if Sub==0
    • S = A - B = A + (-B) = A + (~B+1) if Sub==1
    → D = S = A + (Sub ? ~B : B) + Sub
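The adder-based rule D = A + (Sub ? ~B : B) + Sub can be checked with a short software model. The sketch below is only illustrative: add_sub is a hypothetical name, n is assumed to be between 1 and 31, and the return value packs the carry out above the n result bits.

```c
#include <stdint.h>

/* n-bit adder/subtractor built from one adder.
   Returns an (n+1)-bit value: bit n is the adder's carry out and
   bits n-1..0 are D. For subtraction, carry out == 1 means "no borrow". */
uint32_t add_sub(uint32_t a, uint32_t b, unsigned sub, unsigned n)
{
    uint32_t mask = (1u << n) - 1u;
    uint32_t operand = sub ? (~b & mask) : (b & mask);  /* Sub ? ~B : B */
    return (a & mask) + operand + (sub ? 1u : 0u);      /* + Sub as carry in */
}
```

For example, with n = 8 the call add_sub(9, 5, 1, 8) yields D = 4 with carry out 1, while add_sub(5, 9, 1, 8) yields D = 252 (the two's complement of 4) with carry out 0, signalling a borrow.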

6-Bit Parallel Multiplier

Decimal example:

          2 3 4
    ×)    1 2 3
    -----------
          7 0 2
        4 6 8
      2 3 4
    -----------
      2 8 7 8 2

Binary example:

            1 0 1 1    = B
    ×)      1 1 0 1    = A
    ---------------
            1 0 1 1    ← 1
          0 0 0 0      ← 0
        1 0 1 1        ← 1
      1 0 1 1          ← 1
    ---------------
    1 0 0 0 1 1 1 1    Product P = B × A

Multiplier Cell

    module MULij(Ai, Bj, Ci, Co, Pi, Po);
      input  Ai, Bj, Ci, Pi;
      output Co, Po;
      wire   AB;
      and       (AB, Ai, Bj);             // partial-product bit
      FullAdder FA (Pi, Ci, AB, Co, Po);  // adds the incoming sum and carry
    endmodule

(Figure: cell with inputs Ai and Bj, an adder, carry in Ci(j-1), carry out Cij, and sum Pij.)
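A bit-level C model makes the cell's behavior easy to check. This is a sketch under the assumption that the cell is an AND gate followed by a full adder, as in the module above; the name mcell and the packed return value are hypothetical.

```c
/* One multiplier cell: AND the two operand bits, then full-add the result
   with the incoming carry (ci) and the incoming partial sum (si).
   Returns a 2-bit value: bit 1 is the carry out, bit 0 is the sum out. */
unsigned mcell(unsigned a, unsigned b, unsigned ci, unsigned si)
{
    unsigned d  = a & b;                           /* partial-product bit */
    unsigned so = d ^ ci ^ si;                     /* full-adder sum */
    unsigned co = (d & ci) | (ci & si) | (si & d); /* full-adder carry */
    return (co << 1) | so;
}
```

The returned value always equals (a & b) + ci + si, so mcell(1, 1, 1, 1) is 3 and mcell(1, 0, 1, 1) is 2.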

6-Bit Parallel Multiplier (figure)

Coding by V95

Verilog-1995 has no generate construct, so a C program prints the cell instances:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int i, j, N = 6;

        for (i = 0; i < N; i++) {
            for (j = 0; j < N; j++) {
                printf("MCell M%d(A[%d], B[%d], ", i*N+j, i, j);
                if (j == 0) printf("0, ");                        /* carry in  */
                else        printf("C[%d], ", i*N+j-1);
                printf("C[%d], ", i*N+j);                         /* carry out */
                if (i == 0)        printf("0, ");                 /* sum in    */
                else if (j == N-1) printf("C[%d], ", (i-1)*N+j);
                else               printf("S[%d], ", (i-1)*N+j+1);
                printf("S[%d]);\n", i*N+j);                       /* sum out   */
            }
        }
        return 0;
    }

Generated instances (the slide shows M0 through M26):

    MCell M0(A[0], B[0], 0, C[0], 0, S[0]);
    MCell M1(A[0], B[1], C[0], C[1], 0, S[1]);
    MCell M2(A[0], B[2], C[1], C[2], 0, S[2]);
    MCell M3(A[0], B[3], C[2], C[3], 0, S[3]);
    MCell M4(A[0], B[4], C[3], C[4], 0, S[4]);
    MCell M5(A[0], B[5], C[4], C[5], 0, S[5]);
    MCell M6(A[1], B[0], 0, C[6], S[1], S[6]);
    MCell M7(A[1], B[1], C[6], C[7], S[2], S[7]);
    MCell M8(A[1], B[2], C[7], C[8], S[3], S[8]);
    MCell M9(A[1], B[3], C[8], C[9], S[4], S[9]);
    MCell M10(A[1], B[4], C[9], C[10], S[5], S[10]);
    MCell M11(A[1], B[5], C[10], C[11], C[5], S[11]);
    MCell M12(A[2], B[0], 0, C[12], S[7], S[12]);
    MCell M13(A[2], B[1], C[12], C[13], S[8], S[13]);
    MCell M14(A[2], B[2], C[13], C[14], S[9], S[14]);
    MCell M15(A[2], B[3], C[14], C[15], S[10], S[15]);
    MCell M16(A[2], B[4], C[15], C[16], S[11], S[16]);
    MCell M17(A[2], B[5], C[16], C[17], C[11], S[17]);
    MCell M18(A[3], B[0], 0, C[18], S[13], S[18]);
    MCell M19(A[3], B[1], C[18], C[19], S[14], S[19]);
    MCell M20(A[3], B[2], C[19], C[20], S[15], S[20]);
    MCell M21(A[3], B[3], C[20], C[21], S[16], S[21]);
    MCell M22(A[3], B[4], C[21], C[22], S[17], S[22]);
    MCell M23(A[3], B[5], C[22], C[23], C[17], S[23]);
    MCell M24(A[4], B[0], 0, C[24], S[19], S[24]);
    MCell M25(A[4], B[1], C[24], C[25], S[20], S[25]);
    MCell M26(A[4], B[2], C[25], C[26], S[21], S[26]);

Enhancements in IEEE 1364-2001 (V2K)
  1. Design management: Verilog configurations
  2. Scalable models: Verilog generate
  3. Constant functions
  4. Indexed vector part selects
  5. Multidimensional arrays ( reg [7:0] a[1:4][0:255]; )
  6. Bit and part selects within arrays
  7. Signed arithmetic extensions (e.g. reg signed [15:0] x; and the >>> operator)
  8. Power operator ** (note that ^ is XOR)
  9. Re-entrant tasks and recursive functions
  10. Combinational logic sensitivity token ( always@* )
  11. Comma-separated sensitivity lists
  12. Automatic width extension beyond 32 bits
  13. Enhanced file I/O
  14. In-line parameter passing by name
  15. Combined port and data type declarations
  16. ANSI-style input and output declarations
  17. reg declaration initial assignments

Coding by V2K

    // NxN Parallel Multiplier
    // Based on IEEE Standard 1364-2001
    module MUL(A, B, P);
      parameter N = 8;
      input  [N-1:0]   A, B;
      output [2*N-1:0] P;
      wire Ci[0:N-1][0:N-1], Co[0:N-1][0:N-1];
      wire Si[0:N-1][0:N-1], So[0:N-1][0:N-1];
      genvar i, j;
      generate
        for(i=0; i<N; i=i+1) begin : Row
          for(j=0; j<N; j=j+1) begin : Col
            MCell U(A[i], B[j], Ci[i][j], Co[i][j], Si[i][j], So[i][j]);
            if(j==0) begin
              assign Ci[i][j] = 1'b0;
              assign P[i] = So[i][j];
            end
            else begin
              assign Ci[i][j] = Co[i][j-1];
              if(i==N-1) assign P[N+j-1] = So[i][j];
            end
            if(i==0) assign Si[i][j] = 1'b0;
            else if(j==N-1) assign Si[i][j] = Co[i-1][j];
            else assign Si[i][j] = So[i-1][j+1];
          end
        end
      endgenerate
      assign P[2*N-1] = Co[N-1][N-1];
    endmodule

    module MCell(A, B, Ci, Co, Si, So);
      input  A, B, Ci, Si;
      output Co, So;
      wire   D, DC, CS, SD;
      and g1(D, A, B);
      xor g2(So, D, Ci, Si);
      and g3(DC, D, Ci);
      and g4(CS, Ci, Si);
      and g5(SD, Si, D);
      or  g6(Co, DC, CS, SD);
    endmodule
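Before simulating the Verilog, the wiring of the cell array can be sanity-checked with a software model. The following sketch mirrors the generate loop above row by row; array_mul, cell, and the fixed width N = 8 are assumptions made for illustration and are not part of the slides.

```c
#include <stdint.h>

#define N 8   /* operand width, matching the default parameter N = 8 */

/* One cell: returns (carry out << 1) | sum out. */
static unsigned cell(unsigned a, unsigned b, unsigned ci, unsigned si)
{
    unsigned d = a & b;
    return (((d & ci) | (ci & si) | (si & d)) << 1) | (d ^ ci ^ si);
}

/* Software model of the N x N array; returns the 2N-bit product. */
uint32_t array_mul(uint32_t a, uint32_t b)
{
    unsigned so_prev[N] = {0};  /* So[i-1][*] from the previous row */
    unsigned co_prev = 0;       /* Co[i-1][N-1] from the previous row */
    uint32_t p = 0;

    for (int i = 0; i < N; i++) {
        unsigned so[N];
        unsigned carry = 0;                       /* Ci[i][0] = 0 */
        for (int j = 0; j < N; j++) {
            unsigned si;
            if (i == 0)          si = 0;          /* first row has no sum in */
            else if (j == N - 1) si = co_prev;    /* Co[i-1][j] */
            else                 si = so_prev[j + 1];
            unsigned r = cell((a >> i) & 1u, (b >> j) & 1u, carry, si);
            so[j] = r & 1u;
            carry = r >> 1;                       /* ripples to column j+1 */
        }
        p |= (uint32_t)so[0] << i;                /* P[i] = So[i][0] */
        if (i == N - 1)
            for (int j = 1; j < N; j++)
                p |= (uint32_t)so[j] << (N + j - 1);
        for (int j = 0; j < N; j++) so_prev[j] = so[j];
        co_prev = carry;
    }
    p |= (uint32_t)co_prev << (2 * N - 1);        /* P[2N-1] = Co[N-1][N-1] */
    return p;
}
```

Each row adds one partial product to the previous row's result shifted right by one bit, so the lowest sum bit of every row is already a final product bit.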

Coding by V2K (the generate block, part by part)

Cell instantiation:

    genvar i, j;
    for(i=0; i<N; i=i+1) begin : Row
      for(j=0; j<N; j=j+1) begin : Col
        MCell U(A[i], B[j], Ci[i][j], Co[i][j], Si[i][j], So[i][j]);
      end
    end

Column 0: no carry in, and the sum out is a product bit:

        if(j==0) begin
          assign Ci[i][j] = 1'b0;
          assign P[i] = So[i][j];
        end

Other columns: the carry ripples from the left neighbor, and the last row drives the upper product bits:

        else begin
          assign Ci[i][j] = Co[i][j-1];
          if(i==N-1) assign P[N+j-1] = So[i][j];
        end

Sum inputs and the final carry:

        if(i==0) assign Si[i][j] = 1'b0;
        else if(j==N-1) assign Si[i][j] = Co[i-1][j];
        else assign Si[i][j] = So[i-1][j+1];
      end
    end
    assign P[2*N-1] = Co[N-1][N-1];

Big-Data Verification

    `timescale 1ns/10ps
    module Test_MUL;
      parameter N = 8;
      reg  [N-1:0]   A, B;
      wire [2*N-1:0] P;
      MUL U1(A, B, P);
      integer i, j, ok;
      initial begin
        $monitor($time, ", ", A, ", ", B, ", ", P);
        ok = 1;
        for(i=0; i < (1<<N); i=i+1)
          for(j=0; j < (1<<N); j=j+1) begin
            A = i; B = j; #10;
            if(P != i*j) ok = 0;
          end
        $display("Verification: %s.", ok ? "Ok" : "Failed");
        $stop;
      end
    endmodule

Simulator transcript:

    VSIM 3> run
    # 0, 0, 0, 0
    :
    # 655340, 255, 254, 64770
    # 655350, 255, 255, 65025
    # Verification: Ok.
    # Break in Module Test_MUL at D:/work/MulArray/Test_MUL.v line 20
    VSIM 4>

Layout View: 6-Bit Parallel Multiplier (figure)

N-bit Shifter

(Figure: stages controlled by Shift_Left1, Shift_Left2, and Shift_Left3, each selecting between a 0 input and a 1 input; labels: Constant Shifter, Shift Register.)

Algorithmic State Machine
  • A special design style of general finite state machines.
  • It can be designed by mapping from the flowchart.
  • It can be synthesized manually.
  • High area overhead and low performance due to the multiplexer stack.

  1. Describe your system in pseudocode;
  2. Draw the flowchart;
  3. Map the flowchart to an ASM chart;
  4. Map the ASM chart to HDL or to a circuit.

Pseudo Code
  • A high-level, almost-executable description.
  • Used for a leader's instructions, explanation of algorithms, etc.
  • Pascal-like or C-like code with some comprehensible English is preferable.
  • Example: (figure)

Basic Flowchart Elements
  • Flow and direction: (arrows)
  • Decision: condition, with exits Y and N
  • Process: process
  • Terminals: begin, end
  • (Card) input: card in
  • Specific outputs: tapeout, report, display

Sequence Control: if-then-else

    if ( C ) { Y } else { N }

(Figure: a Condition diamond branching to process Y on Y and to process N on N.)

Sequence Control: switch

    if (expr==const1) {P1}
    else if (expr==const2) {P2}
    else if (expr==const3) {P3}
    else if (expr==const4) {P4}
    else {Pdefault}

    switch (expr) {
      case const1: {P1} break;
      case const2: {P2} break;
      case const3: {P3} break;
      case const4: {P4} break;
      ...
      default: {Pdefault}
    }

(Figure: a multi-way decision on c with exits 1, 2, 3, ..., N.)

Sequence Control: for-loop

    for ( I ; C ; N ) { P }

(Figure: Initialization, then Condition; on Y run Process and Next and return to Condition; on N leave the loop.)

    for(i=1; i<=9; i++) {
        for(j=1; j<=9; j++)
            printf("%dX%d=%2d\t", i, j, i*j);
        printf("\n");
    }

Sequence Control: do-while loop and while-do loop

    do { P } while ( C );

    while ( C ) { P }

(Figure: in the do-while loop, Process runs before Condition is tested and repeats on Y; in the while-do loop, Condition is tested first and Process runs on Y; both exit on N.)

Map Flowchart to ASM Chart
  • Split the processes in the flowchart into states such that each state can be executed in 1 clock cycle.
  • Note that a WR-dependent process should be split into two states.

Write first, then read (two states):

    a = b + c;        State 1: a = b + c;
    x = a * d;        State 2: x = a * d;

Read first, then write (one state):

    x = a + d;        State 1: x = a + d; a = b * c;
    a = b * c;
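The reason for the split is that, within one state, every right-hand side sees the register values from before the clock edge. A small C model can make this visible. It is a sketch only: the Regs struct and the three function names are hypothetical, and each function stands for one clock cycle.

```c
/* Registers of the toy example; names follow the slide. */
typedef struct { int a, b, c, d, x; } Regs;

/* Read first, then write: x = a + d; a = b * c;
   One state is enough, because x reads the OLD value of a. */
Regs state_read_then_write(Regs cur)
{
    Regs nxt = cur;
    nxt.x = cur.a + cur.d;
    nxt.a = cur.b * cur.c;
    return nxt;
}

/* Write first, then read: a = b + c; x = a * d;
   State 1 writes a ... */
Regs state1_write(Regs cur)
{
    Regs nxt = cur;
    nxt.a = cur.b + cur.c;
    return nxt;
}

/* ... and State 2, one clock later, reads the NEW value of a. */
Regs state2_read(Regs cur)
{
    Regs nxt = cur;
    nxt.x = cur.a * cur.d;
    return nxt;
}
```

Starting from a=1, b=2, c=3, d=4, the single read-then-write state gives x=5 and a=6, while the two-state sequence gives a=5 and then x=20; merging the two states would compute x from the stale a instead.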

Map Decision to Demultiplexer
  • If the condition is a 1-bit variable, map the decision directly to a demultiplexer, which is a combinational gate (i.e., its delay is less than the clock cycle).
  • The clock state of a decision belongs to the preceding process; therefore that process should be split into 2 states if there is a WR dependency.

(Figure: a decision on C with exits Y and N maps to a demultiplexer controlled by C with outputs 1 and 0. For the process C = A & B; followed by a decision on C, a dummy state is inserted between the process and the demultiplexer.)

Map Decision with Expression to Diamond
  • If the condition is an expression instead of a direct input, map the decision to a diamond.
  • A diamond is composed of a demultiplexer and an extra process.
  • When the delay of the combinational circuit f( ) is too large, it is better to split it into two states with a demultiplexer.

(Figure: a decision on C = f( ) with exits Y and N maps to the logic f( ) driving a demultiplexer with outputs 1 and 0. In the split version, a state C = f( ); is followed by a dummy state and a demultiplexer controlled by C.)

Example: 8-bit Multiplier

(Figure: block with inputs X, Y, Rst, Clk and registers N, C, A, B.)

For Instance

Flowchart:

    begin
    N←0; A←0; B←Y; C←0
    B[0]?   Yes: {C, A}←A + X      No: skip
    {C, A, B}←{C, A, B}>>1
    N←N+1
    N=0?    No: back to the B[0] test    Yes: end

Worked example with X = 151 (10010111) and Y = 165 (10100101); "+" marks an accumulate step and "→" a shift step:

    Step   N    C  A         B
    init   000  0  00000000  10100101
    +      000  0  10010111  10100101
    →      001  0  01001011  11010010
    →      010  0  00100101  11101001
    +      010  0  10111100  11101001
    →      011  0  01011110  01110100
    →      100  0  00101111  00111010
    →      101  0  00010111  10011101
    +      101  0  10101110  10011101
    →      110  0  01010111  01001110
    →      111  0  00101011  10100111
    +      111  0  11000010  10100111
    →      000  0  01100001  01010011

Result: {A, B} = 0110000101010011 = 24915 = 151 × 165.

Example of Verification using C

    #include <stdio.h>

    int main(void)
    {
        unsigned char X, Y, C, A, B, N, Verified;
        int i, P, Q, Acc, S;

        Verified = 1;
        for(i=0; i<(1<<16); i++) {
            X = i/(1<<8);
            Y = i%(1<<8);
            P = X*Y;                          // Golden circuit
            // Algorithm starts here
            N=0; A=0; B=Y; C=0;               // Initialization
            do {
                // Accumulate
                if(B%2) { Acc = A + C*256 + X; A = Acc%256; C = (Acc>>8)%2; }
                // Shift CAB
                S = C*65536 + A*256 + B;
                S = S >> 1;
                B = S%256; A = (S>>8)%256; C = (S>>16)%2;
                N++;
            } while(N!=8);
            Q = (A<<8) + B;
            if(Q!=P) {
                Verified = 0;
                printf("Error when X=%d, Y=%d, Q=%d\n", X, Y, Q);
                break;
            }
        }
        if(Verified) printf("The circuit is exhaustively functionally verified.\n");
        return 0;
    }

Example: Verified using Assembly Language

    BEGIN:  MOV  N, #States
            CLR  A C
            MOV  B, X
    LOOP:   JE   SHIFT
            ADC  Y
    SHIFT:  SHR  CAB
            DEC  N
            JNZ  LOOP
            END

C and assembly code can serve as a fast logic simulator.

Example: 8-bit Multiplier

Flowchart:

    begin
    N←0; A←0; B←Y; C←0
    B[0]?   Yes: {C, A}←A + X      No: skip
    {C, A, B}←{C, A, B}>>1
    N←N+1
    N=0?    No: back to the B[0] test    Yes: end

ASM chart:

    begin
    Initial:     N←0; A←0; B←Y; C←0
    dummy
    B[0]?        1: Accumulate      0: Shift
    Accumulate:  {C, A}←A + X
    Shift:       {C, A, B}←{C, A, B}>>1; N←N+1
    Dummy
    N=0?         0: back to the B[0] test    1: end

Map ASM Elements to Sequencer

(Figure: mapping rules. The begin terminal maps to a D flip-flop with PRE, each state to a D flip-flop with CLR, a decision on C to a demultiplexer with outputs 1 and 0, a decision on f( ) to the logic f( ) driving a demultiplexer, and the end terminal to a D flip-flop with CLR.)

Map to State Sequencer

(Figure: the ASM chart of the 8-bit multiplier next to its sequencer. The begin terminal maps to a D flip-flop with PRE; the states Initial, dummy, Accumulate, Shift, and dummy each map to a D flip-flop with CLR; the decisions B[0] and N=0 map to demultiplexers; the end terminal maps to a final D flip-flop with CLR.)

Map to ALU

(Figure: the ASM chart of the 8-bit multiplier next to its datapath. Multiplexers selected by Initial, Accumulate, and Shift choose the next values of registers C, A, B, and N; one adder computes A + X and another computes N + 1.)

Control

(Figure: the complete circuit, combining the state sequencer with the datapath. Inputs: X, Y, Rst, Clk. Registers: C, A, B, N. Outputs: P and Ready. A comparator on N drives the end-of-loop decision.)

Remapped to Coding: Gate-Level Model for Sequencer

    module Sequencer(Rst, Clk, B, E, Initial, Accumulate, Shift, Ready);
      input  Rst, Clk, B, E;
      output Initial, Accumulate, Shift, Ready;
      wire   Q1, Q2, Q3, Q4, Q5, A, B0, B1, E0, E1;
      // DFFp, DFFc, and mux are user-defined cells: presumably a DFF with
      // preset, a DFF with clear, and the 1-to-2 demultiplexer of the ASM mapping.
      DFFp U1 (Clk, Rst, 1'b0, Q1);
      DFFc U2 (Clk, Rst, Q1, Initial);
      DFFc U3 (Clk, Rst, Initial, Q2);
      or      (A, Q2, E0);
      mux  U4 (B, A, B1, B0);
      DFFc U5 (Clk, Rst, B1, Accumulate);
      or      (Q3, B0, Accumulate);
      DFFc U6 (Clk, Rst, Q3, Shift);
      DFFc U7 (Clk, Rst, Shift, Q4);
      mux  U8 (E, Q4, E1, E0);
      or      (Q5, Ready, E1);
      DFFc U9 (Clk, Rst, Q5, Ready);
    endmodule

Remapped to Coding: Minimized Counter-Based State Diagram

(Figure: state diagram with states Init, Acc, and Shift, entered at Init on PowerOn or SysRst, with the counter N controlling the loop.)

Counter-Based FSM Coding: 2-State-Cycle 2^n-Bit Sequential Multiplier

    module MUL(Rst, Clk, X, Y, P, Ready);
      parameter n = 3;
      parameter W = 1 << n;            // operand width 2**n ('^' is XOR in Verilog, not power)
      input  Rst, Clk;
      input  [W-1:0]   X, Y;
      output [2*W-1:0] P;
      output Ready;
      reg [n-1:0]   N;
      reg           C;
      reg [2*W-1:0] P;
      reg [1:0]     State;
      // The final state is named Done so that it does not clash with the Ready port.
      parameter Init=2'b00, Acc=2'b01, Shift=2'b10, Done=2'b11;
      assign Ready = (State == Done);
      always@(posedge Clk)
        if(Rst) begin N=0; State=Acc; C=0; P=Y; end
        else case(State)
          Acc:   begin
                   if(P[0]) {C, P[2*W-1:W]} = P[2*W-1:W] + X;  // add into the upper half
                   State=Shift;
                 end
          Shift: begin
                   N=N+1;
                   P={C, P[2*W-1:1]};
                   C=0;
                   State = N ? Acc : Done;   // N wraps to 0 after 2**n shifts
                 end
          Done:  State=Done;
        endcase
    endmodule

Counter-Based FSM Coding: 1-State-Cycle 2^n-Bit Sequential Multiplier

    module MUL(Rst, Clk, A, B, P, Ready);
      input  Clk, Rst;
      input  [7:0]  A, B;
      output [15:0] P;
      output Ready;
      reg [15:0] P;
      reg [2:0]  N;
      reg        Ready;
      wire [8:0]  S;
      wire [16:0] Q;
      assign S = P[0] ? (P[15:8]+A) : P[15:8];   // conditional accumulate
      assign Q = {S, P[7:0]};
      always@(posedge Clk)
        if(Rst) begin N=0; P[15:8]=0; P[7:0]=B; Ready=0; end
        else if(!Ready) begin
          P = Q >> 1;                            // shift in the same cycle
          if(N==7) Ready = 1;
          else N = N + 1;
        end
    endmodule
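In the spirit of using C as a fast logic simulator, the 1-state-cycle design can be modeled with one loop iteration per clock. This is a minimal sketch assuming 8-bit operands; seq_mul8 is a hypothetical name, and the variables s, q, and p stand for the signals S, Q, and P.

```c
#include <stdint.h>

/* Cycle-by-cycle model of the 1-state-cycle sequential multiplier:
   each iteration is one clock in which the accumulate (S), the
   concatenation (Q), and the shift (P = Q >> 1) all happen together. */
uint16_t seq_mul8(uint8_t a, uint8_t b)
{
    uint32_t p = b;                              /* reset: P[15:8]=0, P[7:0]=B */
    for (int n = 0; n < 8; n++) {
        uint32_t hi = (p >> 8) & 0xFFu;          /* P[15:8] */
        uint32_t s  = (p & 1u) ? hi + a : hi;    /* 9-bit S */
        uint32_t q  = (s << 8) | (p & 0xFFu);    /* 17-bit Q = {S, P[7:0]} */
        p = q >> 1;                              /* back to 16 bits */
    }
    return (uint16_t)p;
}
```

After eight clocks the register holds the full product, for instance seq_mul8(151, 165) returns 24915.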

N-Bit Multiplier

    assign P = A * B;

(Figure: area cost versus execution time for three implementations: parallel multiplier (combinational circuit), partially parallel, and N-bit sequential.)

Example: Greatest Common Divisor
  • Actually, you already know the algorithm!
  • So, just catch the regularities by observation.

    gcd(91, 35) = gcd(21, 35)      (91 - 70 = 21)
                = gcd(35, 21)
                = gcd(14, 21)
                = gcd(21, 14)
                = gcd( 7, 14)
                = gcd(14,  7)      (14 - 14 = 0)
                = 7

Example: Greatest Common Divisor
  • Then you can write a function in C.

    /* Can be implemented in 1 cycle in hardware! */
    void SWAP(unsigned int *A, unsigned int *B)
    {
        unsigned int C;
        C = *A; *A = *B; *B = C;
    }

    unsigned int GCD(unsigned int A, unsigned int B)
    {
        unsigned int C;
        if(A<B) SWAP(&A, &B);
        while(B>0) {
            C = A % B;
            A = B;
            B = C;
        }
        return(A);
    }

Note: / and % can be simulated in ModelSim (V2K), but some synthesizers may not implement them.
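Because % may not be synthesizable, one common workaround is to replace the modulo with repeated subtraction, which maps to a subtractor and a comparator. The sketch below shows that variant; it is an alternative to the slide's function, not the author's code, and gcd_sub is a hypothetical name.

```c
/* Subtraction-only GCD: uses just a comparator and a subtractor,
   at the price of more iterations than the modulo version. */
unsigned int gcd_sub(unsigned int a, unsigned int b)
{
    if (a == 0) return b;
    if (b == 0) return a;
    while (a != b) {
        if (a > b) a -= b;   /* shrink the larger operand */
        else       b -= a;
    }
    return a;
}
```

For the operands 91 and 35 this also arrives at 7; each loop iteration would correspond to one clock cycle in a straightforward hardware mapping.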

From C Straightforward to Verilog
Example: GCD

    module GCD(Clk, Rst, A, B, D, Ready);
      input  Clk, Rst;
      input  [31:0] A, B;
      output [31:0] D;
      output Ready;
      parameter Init=0, Mod=1, Out=2;
      reg [1:0]  PS;
      reg [31:0] X, Y, D;
      reg        Ready;
      always@(posedge Clk)
        if(Rst) begin PS <= Init; Ready <= 0; end
        else case(PS)
          Init: begin                       // order the operands: X >= Y
                  X <= (A<B) ? B : A;
                  Y <= (A<B) ? A : B;
                  PS <= Mod;
                end
          Mod:  if(Y != 0) begin            // one Euclid step per clock
                  X <= Y;
                  Y <= X % Y;
                end
                else PS <= Out;
          Out:  begin D <= X; Ready <= 1; end
        endcase
    endmodule

From C Straightforward to Verilog
Example: Bubble Sorter

    /* Swap is assumed to exchange two ints through pointers. */
    int Sort(int M[], int N)
    {
        int a, n;
        for(n=N-1; n>0; n--)
            for(a=0; a<n; a++)
                if(M[a] > M[a+1]) Swap(&M[a], &M[a+1]);   /* only when out of order */
        return(0);
    }

Memory:

    module M(Clk, WE, A, D);
      inout  [31:0] D;
      input  [11:0] A;
      input  Clk, WE;
      reg [31:0] Mem [0:4095];
      reg [31:0] Q;
      assign D = WE ? 32'bz : Q;      // drive the bus only when reading
      always@(posedge Clk)
        if(WE) Mem[A] <= D;
        else   Q <= Mem[A];
    endmodule

State chart (as drawn on the slide):

    Init → Go
    Go:  A=0; WE=0; N=4095
    R:   B=D; A=A+1; WE=0
    C:   B<D:  B=D; A=A+1; WE=0; if(A==N-1) N=N-1
         B>=D: A=A-1; WE=1
    W:   D=B; A=A+1; WE=1; if(A==N-1) N=N-1
    Ok:  Ready=1 (reached when N=0)

(Figure: the Sorter block with Start, Ready, and Clk, connected to the memory M through WE, A, and D.)

From C Straightforward to Verilog
Example: Bubble Sorter

    module Sorter(Clk, Init, WE, A, D, Ready);
      input  Clk, Init;
      inout  [31:0] D;
      output WE, Ready;
      output [11:0] A;
      parameter Ok=0, Go=1, R=2, C=3, W=4;
      reg [2:0]  S;
      reg [11:0] A, N;
      reg [31:0] B, Dout;
      reg        WE, Ready;
      assign D = WE ? Dout : 32'bz;   // drive the bus only while writing
      always@(posedge Clk)
        if(Init) begin N=4095; S=Go; Ready=0; end
        else case(S)
          Go: begin A=0; WE=0; S=R; end
          R:  begin B=D; A=A+1; S=C; end
          C:  if(B>D) begin A=A-1; WE=1; S=W; end
              else begin
                B=D; A=A+1; WE=0;
                if(A>=N-1) begin
                  if(N>1) begin N=N-1; S=C; end
                  else S=Ok;
                end
                else S=Go;
              end
          W:  begin
                Dout=B; A=A+1; WE=1;
                if(A>=N-1) begin
                  if(N>1) begin N=N-1; S=Go; end
                  else S=Ok;
                end
              end
          Ok: Ready=1;
        endcase
    endmodule

From C Straightforward to Verilog
Example: Prime Numbers

    void prime(void)
    {
        unsigned int M[1300], N=0, i, j, yes;   /* 1229 primes lie below 9999 */
        M[N++]=2;
        for(i=3; i<9999; i+=2) {
            j=0; yes=1;
            while(M[j]*M[j]<=i)                 /* trial division up to sqrt(i) */
                if(i%M[j++]==0) { yes=0; break; }
            if(yes) M[N++]=i;
        }
    }

Exercises and Rethinks
  1. List the prime numbers < 9999.
  2. Calculate the area of a triangle with edges A, B, and C.
  3. Quick sort in hardware. (How about a parallel sorter?)
  4. Sequential Fourier transform. (How about FFT?)
  5. Memory controller (Why not SW?)
  6. Traffic light controller (Why not HW?)
  7. Elevator controller (Why not HW?)
  8. FIR/IIR filters (Why not FSM? Datapath)
  9. PLC (programmable logic controller) (Why not HW?)
  10. JTAG TAPC (test access port controller) (Why not SW?)