ECE 526 Network Processing Systems Design Microengine Programming

ECE 526 – Network Processing Systems Design Microengine Programming Chapter 23: D. E. Comer

Overview
• Lab 3: packet forwarding and counting on IXP2400
  ─ Any problems with Part I and Part II?
• Microcode programming
• Lab 3 Part III
Ning Weng ECE 526

Microengine Assembler
• Assembly language matches the underlying hardware
  ─ Intel developed “microengine assembly language”
• Assembly is difficult to program directly
  ─ Assembler supports higher-level statements
• High-level mechanisms:
  ─ Assembler directives
  ─ Symbolic register names and automated register allocation
  ─ Macro preprocessor
  ─ Pre-defined macros for common control structures
• Balance between low-level and higher-level programming

Assembly Language Syntax
• Instructions: label: operator operands token
  ─ Operands and token are optional
  ─ Label: symbolic name used as the target of a branch
  ─ Operator: single microengine instruction or high-level command
  ─ Operands and token: depend on the operator
• Comments:
  ─ C-style: /* comment */
  ─ C++-style: // comment
  ─ ASM-style: ; comment
  ─ Benefit of ASM style: comments remain with the code after preprocessing
• Directives:
  ─ Start with “.”

Operand Syntax
• Example: ALU instruction
  alu[dst, src1, op, src2]
  ─ dst: destination for result
  ─ src1 and src2: source values
  ─ op: operation to be performed
• Notes:
  ─ Destination register cannot be read-only (e.g., a read transfer register)
  ─ If two source registers are used, they must come from different banks
  ─ Immediate values can be used
  ─ “--” indicates a non-existent operand (e.g., source 2 for a unary operation, or the destination)
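To make the operand roles concrete, here is a rough C model of what alu[dst, src1, op, src2] computes. It is only a sketch: the names alu_op and alu_model are invented for illustration, only four binary operations are modeled, and arithmetic is assumed to wrap modulo 2^32 as it would in a 32-bit register.

```c
#include <stdint.h>

/* Hypothetical names: alu_op and alu_model() are illustrative only
   and are not part of any Intel toolchain. */
typedef enum { OP_ADD, OP_AND, OP_OR, OP_XOR } alu_op;

/* Model of alu[dst, src1, op, src2]: returns the value written to dst.
   Unsigned 32-bit arithmetic wraps around on overflow. */
uint32_t alu_model(uint32_t src1, alu_op op, uint32_t src2)
{
    switch (op) {
    case OP_ADD: return src1 + src2;
    case OP_AND: return src1 & src2;
    case OP_OR:  return src1 | src2;
    case OP_XOR: return src1 ^ src2;
    }
    return 0; /* unreachable for valid alu_op values */
}
```

In this model, alu[tmp, c, +, 5] corresponds to tmp = alu_model(c, OP_ADD, 5); the bank restriction on the two sources is a hardware constraint that the model does not capture.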

ALU Operators (table not included in this transcript)

Other Operators
• ALU shift/rotate:
  ─ alu_shf[dst, src1, op, src2, shift]
  ─ shift specifies a right or left shift or rotate (e.g., <<12, >>rot3)
• Memory accesses:
  ─ sram[direction, xfer_reg, addr1, addr2, count]
  ─ direction is “read” or “write”
  ─ addr1 and addr2 are used for base+offset and scaling
• Immediate:
  ─ immed[dst, ival, rot]
  ─ Immediate has the upper 16 bits all 0 or all 1
  ─ Rotation is “0”, “<<8”, or “<<16”
  ─ Also direct access to individual bytes/words: immed_b2, immed_w1
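The shift and rotate specifiers can be pictured with a small C sketch. The function names are hypothetical, and the second function assumes that the shift is applied to src2 before the operation; treat that ordering as an assumption rather than a statement of the hardware's documented behavior.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is taken modulo 32).
   Corresponds to a specifier such as >>rot3. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Hypothetical model of alu_shf[dst, src1, OR, src2, <<n].
   Assumption: src2 is shifted first, then ORed with src1. */
uint32_t or_shifted_left(uint32_t src1, uint32_t src2, unsigned n)
{
    return src1 | (src2 << (n & 31u));
}
```

For example, rotr32(1, 3) moves the lowest bit to bit 29, and or_shifted_left(0xFF, 0x13, 16) places 0x13 in the third byte above an untouched low byte.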

Symbolic Register Names
• Assembler supports automatic register allocation
  ─ Either entirely manual or entirely automatic; no mixture is possible
• Symbolic register names:
  ─ .areg loopindex 5
  ─ Assigns the symbolic name “loopindex” to register 5 in bank A
• Other directives: (table not included in this transcript)

Register Types and Syntax
• Register names with relative and absolute addressing: (table not included in this transcript)
• Note: read and write transfer registers are separate
  ─ You cannot read back a value after you have written it to a transfer register
• Also: some instruction sequences are impossible:
  ─ Z <- Q + R
  ─ Y <- R + S
  ─ X <- Q + S
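The impossible sequence presumably follows from the rule on the Operand Syntax slide that two source registers must come from different banks. With only two banks (A and B), Q, R, and S would each have to sit in a different bank from the other two. The sketch below checks this by brute force; the function name and the integer register ids are invented for illustration.

```c
#include <stdbool.h>

/* pairs[i] holds the two source registers of instruction i, as small
   integer ids. Returns true if the n_regs registers can be split
   between two banks (bit value 0 = bank A, 1 = bank B) so that every
   instruction reads its two sources from different banks.
   Brute force over all 2^n_regs assignments; meant for a handful of
   registers only. */
bool bank_assignment_exists(const int pairs[][2], int n_pairs, int n_regs)
{
    for (unsigned mask = 0; mask < (1u << n_regs); mask++) {
        bool ok = true;
        for (int p = 0; p < n_pairs && ok; p++) {
            unsigned bank_first  = (mask >> pairs[p][0]) & 1u;
            unsigned bank_second = (mask >> pairs[p][1]) & 1u;
            ok = (bank_first != bank_second);
        }
        if (ok)
            return true;
    }
    return false;
}
```

With Q = 0, R = 1, S = 2, the three source pairs {Q,R}, {R,S}, {Q,S} yield false, while any two of them alone yield true: the third instruction is what closes the cycle.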

Scoping
• Scopes define regions where variable names are valid
  ─ .local directive: (example not included in this transcript)
• Outside the scope, registers can be reused
• Scopes can be nested
  ─ Names in inner scopes “shadow” those in outer scopes

Macro Preprocessor
• Preprocessor functionality:
  ─ File inclusion
  ─ Symbolic constant substitution
  ─ Conditional assembly
  ─ Parameterized macro expansion
  ─ Arithmetic expression evaluation
  ─ Iterative generation of code
• Macro definition:
  #macro name[parameter1, parameter2, …]
      lines of text
  #endm

Macro Example
• Example for a=b+c+5:
  #macro add5[a, b, c]
  .local tmp
      alu[tmp, c, +, 5]
      alu[a, b, +, tmp]
  .endlocal
  #endm
• Problems when the tmp variable is overloaded:
  ─ add5[x, tmp, y]
  ─ Why?
• One has to be careful with macros!
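The same pitfall can be reproduced with the C preprocessor, which also expands macros by textual substitution. The following is an analogy under that assumption, not assembler output: ADD5 and both function names are invented, and the block-local tmp stands in for the .local tmp register.

```c
/* C analogue of the add5 macro: the block-local tmp plays the role of
   the .local tmp register. */
#define ADD5(a, b, c)            \
    do {                         \
        int tmp = (c) + 5;       \
        (a) = (b) + tmp;         \
    } while (0)

/* Caller whose own variable is also named tmp, like add5[x, tmp, y]. */
int add5_with_clash(int tmp, int y)
{
    int x;
    ADD5(x, tmp, y);   /* expands to: int tmp = (y) + 5; (x) = (tmp) + tmp; */
    return x;          /* 2 * (y + 5), not tmp + y + 5 */
}

/* Same call with a non-clashing name gives the intended b + c + 5. */
int add5_no_clash(int b, int y)
{
    int x;
    ADD5(x, b, y);
    return x;
}
```

After substitution, the argument named tmp refers to the macro's inner tmp, which shadows the caller's variable, so the caller's value never reaches the addition. Choosing an unlikely name for the macro's temporary reduces the risk but does not remove it.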

Preprocessor Statements (table not included in this transcript)

Structured Programming Directives
• Structured directives are similar to control statements: (table not included in this transcript)

Example
• If statement with structured directives:
  .if ( conditional_expression )
      /* block of microcode */
  .else
      /* block of microcode */
  .endif
• While statement:
  .while ( conditional_expression )
      /* block of microcode */
  .endw
• Very useful and less error-prone than hand-coding
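As a C analogy for why the structured form is less error-prone, the two functions below compute the same sum, once with a loop construct and once with explicit labels and branches of the kind one would otherwise write by hand. Both function names are hypothetical, and the code does not show what the assembler actually generates.

```c
/* Structured form, comparable to .while ... .endw */
unsigned sum_structured(unsigned n)
{
    unsigned total = 0, i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}

/* Hand-coded form: explicit labels, an inverted test, and branches. */
unsigned sum_hand_coded(unsigned n)
{
    unsigned total = 0, i = 0;
loop_top:
    if (!(i < n))
        goto loop_end;
    total += i;
    i++;
    goto loop_top;
loop_end:
    return total;
}
```

In the hand-coded version, the inverted condition, the two labels, and the back branch are each a place where a mistake can slip in; the structured version leaves them to the tool.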

Conditional Expressions
• Conditional expressions may use C-language operators
  ─ Integer comparison: <, >, <=, >=, ==, !=
  ─ Shift operators: <<, >>
  ─ Logic operators: &&, ||
  ─ Parentheses: (, )
• Additional test operators (table not included in this transcript)

Context Switches
• Instructions that cause context switches:
  ─ ctx_arb instruction
  ─ Reference instruction
• ctx_arb instruction:
  ─ One argument that specifies how to handle the context switch
  ─ voluntary
  ─ signal_event – waits for signal
  ─ kill – terminates thread permanently
• Reference instruction to memory, hash, etc.
  ─ One argument
  ─ ctx_swap – thread surrenders control until the operation has completed
  ─ sig_done – thread continues and is signaled on completion

Indirect References
• Sometimes memory addresses are not known at compile time
  ─ Indirect references use the result of an ALU instruction to modify the immediately following reference
  ─ “Unlike the conventional use of the term [indirect reference], Intel’s indirect reference mechanism does not follow pointers; the terminology is confusing at best.” ☺
• Indirect reference can modify:
  ─ Microengine associated with memory reference
  ─ First transfer register in a block that will receive result
  ─ The count of words of memory to transfer
  ─ The thread ID of the hardware thread executing the instruction
• Bit patterns specifying operation and parameter must be loaded into ALU
  ─ Uses operation without destination: alu_shf[--, b, 0x13, <<16]
  ─ Reference: scratch[read, $reg0, addr1, addr2, 0], indirect_ref

Transfer Registers
• Memory transfers need contiguous registers
  ─ Specified with .xfer_order
  ─ .local $reg1 $reg2 $reg3 $reg4
    .xfer_order $reg1 $reg2 $reg3 $reg4
• Library macros for transfer register allocation
  ─ Allocation: xbuf_alloc[]
  ─ Deallocation: xbuf_free[]
  ─ Example: xbuf_alloc[$$buf, 4] allocates $$buf0, …, $$buf3
• Allocation is based on 32-bit chunks
  ─ Transfer of 2 SDRAM units requires 4 transfer registers
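The register count on this slide is simple arithmetic, sketched below in C. The helper name is invented, and the 8-byte SDRAM unit is an assumption inferred from the slide's own numbers (2 units filling 4 registers of 32 bits each).

```c
#include <stddef.h>

enum { XFER_REG_BYTES = 4 };   /* one transfer register holds 32 bits */
enum { SDRAM_UNIT_BYTES = 8 }; /* assumption: one SDRAM unit is 64 bits */

/* Number of 32-bit transfer registers needed to hold `units` memory
   units of `unit_bytes` bytes each, rounded up to whole registers. */
size_t xfer_regs_needed(size_t units, size_t unit_bytes)
{
    return (units * unit_bytes + XFER_REG_BYTES - 1) / XFER_REG_BYTES;
}
```

Here xfer_regs_needed(2, SDRAM_UNIT_BYTES) evaluates to 4, matching the slide; a transfer of 4-byte units needs one register per unit.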

Lab 3: Part III
• type = _buf_byte_extract((UINT*)p_pkt, PPP_IPV4_TCP_PORT_OFFSET, PPP_IPV4_TCP_DPORT_LEN, memType);
  // _buf_byte_extract: extract a numeric byte field from buffer.
  // in_src          pointer to the buffer data that contains the field.
  // in_field_start  byte offset of field to be extracted.
  // in_bytes_num    length of field in bytes.
  ─ #define PPP_IPV4_TCP_DPORT_LEN 2
  ─ #define PPP_IPV4_TCP_PORT_OFFSET 0x18
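For readers without the microengine toolchain, the idea of extracting a byte field can be sketched in portable C. This is a stand-in under stated assumptions, not the library routine: extract_field is an invented name, it works on an ordinary byte array rather than on-chip memory, and it assumes the field is stored in network (big-endian) byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable stand-in for a byte-field extractor:
   reads `len` bytes (1..4) starting at byte `offset` of `buf` and
   returns them as one integer, most significant byte first. */
uint32_t extract_field(const uint8_t *buf, size_t offset, size_t len)
{
    uint32_t value = 0;
    for (size_t i = 0; i < len; i++)
        value = (value << 8) | buf[offset + i];
    return value;
}
```

With offset 0x18 and length 2, the result is the 16-bit TCP destination port. The offset of 24 bytes is consistent with a 2-byte PPP protocol field, a 20-byte IPv4 header without options, and the 2-byte TCP source port, although that breakdown is an inference and is not stated on the slide.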

Lab 3: Part III
• if (type == PPP_IPV4_TCP_WEB) {
      sram_incr((volatile void __declspec(sram) *)(COUNT_IPV4_TCP_WEB_SRAM_ADDR));
      dlNextBlock = BID_IPV4;
      return;
  }
  ─ #define PPP_IPV4_TCP_WEB 0x0050
  ─ #define COUNT_IPV4_TCP_WEB_SRAM_ADDR 0x40300208
• sram_incr()
  ─ Description: this function increments the longword at address by one.
  ─ Arguments: address – address to read from.
  ─ Reference: Microengine C compiler language

Summary
• Assembly helps performance but is difficult to program directly
• High-level mechanisms:
  ─ Assembler directives
  ─ Symbolic register names and automated register allocation
  ─ Macro preprocessor
  ─ Pre-defined macros for common control structures
• Balance between low-level and higher-level programming