Architectural-Level Synthesis
Giovanni De Micheli
Integrated Systems Laboratory

This presentation can be used for non-commercial purposes as long as this note and the copyright footers are not removed.
© Giovanni De Micheli – All rights reserved

Module 1
- Objectives
  - Motivation
  - Compiling language models into abstract models
  - Behavioral-level optimization and program-level transformations

Synthesis
- Transform a behavioral view into a structural view
- Architectural-level synthesis:
  - Architectural abstraction level
  - Determine macroscopic structure
  - Example: major building blocks
- Logic-level synthesis:
  - Logic abstraction level
  - Determine microscopic structure
  - Example: logic gate interconnection

Models and flows
[Figure: language models and abstract models, organized by view (behavioral, structural) and by level (architectural, logic)]
- Language models: HDL (Verilog, VHDL, SystemC), Esterel, Statecharts, schematics, GDS2
- Compilation and translation map language models to abstract models
- Abstract models:
  - Architectural level: operations and dependencies (data-flow and sequencing graphs)
  - Logic level: FSMs and logic functions (state diagrams and logic networks); interconnected logic blocks (logic networks)
  - Physical design (mask layout)

Example
Differential equation solver

    diffeq {
        read ( x, y, u, dx, a );
        repeat {
            xl = x + dx;
            ul = u - ( 3 * x * u * dx ) - ( 3 * y * dx );
            yl = y + u * dx;
            c = x < a;
            x = xl; u = ul; y = yl;
        }
        until ( c );
        write ( y );
    }

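The loop body of the solver can be traced with a small executable model. The following Python sketch is an illustration only and is not part of the original slides: the function names `diffeq_step` and `diffeq` are hypothetical, and the loop is assumed to run at least once and then continue while x < a, which is one possible reading of the `until ( c )` test.

```python
def diffeq_step(x, y, u, dx):
    """One pass through the loop body: returns the updated (x, y, u)."""
    xl = x + dx
    ul = u - (3 * x * u * dx) - (3 * y * dx)
    yl = y + u * dx
    return xl, yl, ul


def diffeq(x, y, u, dx, a):
    """Run the loop body at least once, then while x < a (assumed exit test)."""
    while True:
        x, y, u = diffeq_step(x, y, u, dx)
        if not x < a:
            return y


print(diffeq(0.0, 1.0, 0.0, 0.5, 1.0))
```

Each pass uses five multiplications, two subtractions, two additions and one comparison if the expression for ul is evaluated as written; these are the operations that scheduling and binding later map onto multipliers and ALUs.
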
Example
[Figure: block diagrams of structural implementations built from multipliers (*), ALUs, steering and memory, and a control unit]

Example
[Figure: area versus latency plot with design points labeled (1, 1), (1, 2), (2, 1) and (2, 2)]

Architectural-level synthesis motivation
- Raise input abstraction level
  - Reduce specification of details
  - Extend designer base
  - Self-documenting design specifications
  - Ease modifications and extensions
- Reduce design time
- Explore and optimize macroscopic structure:
  - Series/parallel execution of operations

Architectural-level synthesis
- Translate HDL models into sequencing graphs
- Behavioral-level optimization:
  - Optimize abstract models independently from the implementation parameters
- Architectural synthesis and optimization:
  - Create macroscopic structure:
    - Data-path and control-unit
  - Consider area and delay information of the implementation

Compilation and behavioral optimization
- Software compilation:
  - Compile program into intermediate form
  - Optimize intermediate form
  - Generate target code for an architecture
- Hardware compilation:
  - Compile HDL model into sequencing graph
  - Optimize sequencing graph
  - Generate gate-level interconnection for a cell library

Hardware and software compilation
[Figure: software and hardware compilation flows]
- Software: front-end (lex, parse) → intermediate form → back-end (optimization, codegen)
- Hardware: front-end (lex, parse) → intermediate form → back-end (behavioral optimization, a-synthesis, l-binding)

Compilation
- Front-end:
  - Lexical and syntax analysis
  - Parse-tree generation
  - Macro-expansion
  - Expansion of meta-variables
- Semantic analysis:
  - Data-flow and control-flow analysis
  - Type checking
  - Resolve arithmetic and relational operators
[Figure: parse tree of a = p + q * r. The assignment consists of identifier a, "=", and an expression; that expression is identifier p "+" an expression, which in turn is identifier q "*" identifier r]

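The parse tree for a = p + q * r can be reproduced with any parser that applies the usual operator precedence. The sketch below uses Python's standard `ast` module as a stand-in for an HDL front-end, which is an assumption made purely for illustration; the helper names `to_prefix` and `parse_assignment` are hypothetical.

```python
import ast


def to_prefix(node):
    """Render an expression node as a nested tuple: (operator, left, right)."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp):
        return (type(node.op).__name__, to_prefix(node.left), to_prefix(node.right))
    raise ValueError(f"unsupported node: {ast.dump(node)}")


def parse_assignment(source):
    """Parse 'target = expression' and return (target, expression tree)."""
    stmt = ast.parse(source).body[0]
    if not isinstance(stmt, ast.Assign) or len(stmt.targets) != 1:
        raise ValueError("expected a single assignment")
    return stmt.targets[0].id, to_prefix(stmt.value)


print(parse_assignment("a = p + q * r"))
```

The multiplication ends up below the addition in the tree, matching the nesting of expressions shown on the slide.
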
Behavioral-level optimization
- Semantic-preserving transformations aiming at simplifying the model
- Applied to parse-trees or during their generation
- Taxonomy:
  - Data-flow based transformations
  - Control-flow based transformations

Tree-height reduction
- Applied to arithmetic expressions
- Goal:
  - Split into two-operand expressions to best exploit hardware parallelism
- Techniques:
  - Balance the expression tree
  - Exploit commutativity, associativity and distributivity

Example of tree-height reduction using commutativity and associativity
x = a + b * c + d → x = (a + d) + b * c
[Figure: expression trees before and after the transformation]

Example of tree-height reduction using distributivity
x = a * (b * c * d + e) → x = a * b * c * d + a * e
[Figure: expression trees before and after the transformation]

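The effect of both examples can be checked by measuring tree heights. This is a minimal sketch, assuming expressions are written as nested tuples (operator, left, right) and that chained operators are parsed left to right; `height` and `balanced` are hypothetical helper names, not part of the original material.

```python
def height(tree):
    """Height of an expression tree; leaves (variable names) have height 0."""
    if isinstance(tree, str):
        return 0
    _, left, right = tree
    return 1 + max(height(left), height(right))


def balanced(op, operands):
    """Combine the operands of one associative operator into a balanced tree."""
    if len(operands) == 1:
        return operands[0]
    mid = len(operands) // 2
    return (op, balanced(op, operands[:mid]), balanced(op, operands[mid:]))


# x = a + b * c + d, parsed left to right: ((a + (b * c)) + d)
before = ("+", ("+", "a", ("*", "b", "c")), "d")
# x = (a + d) + b * c, after applying commutativity and associativity
after = ("+", ("+", "a", "d"), ("*", "b", "c"))

# x = a * (b * c * d + e), with the product parsed left to right
nested = ("*", "a", ("+", ("*", ("*", "b", "c"), "d"), "e"))
# x = a * b * c * d + a * e, after applying distributivity and balancing
flattened = ("+", balanced("*", ["a", "b", "c", "d"]), ("*", "a", "e"))

print(height(before), height(after))
print(height(nested), height(flattened))
```

With one operation per level, the height is the number of sequential steps needed when enough resources are available, so a shorter tree exposes more parallelism. Note that the distributive form uses more multiplications in exchange for the lower height.
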
Examples of propagation
- Constant propagation:
  - Before: a = 0; b = a + 1; c = 2 * b;
  - After: a = 0; b = 1; c = 2;
- Variable propagation:
  - Before: a = x; b = a + 1; c = 2 * a;
  - After: a = x; b = x + 1; c = 2 * x;

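Constant propagation over straight-line code can be sketched in a few lines. The representation below is an assumption chosen for illustration: each statement is a pair (target, expression), where an expression is an integer, a variable name, or a tuple (operator, left, right). The function name `propagate_constants` is hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate_constants(stmts):
    """Replace variables with known constant values and fold constant operations."""
    known = {}  # variable name -> constant value
    out = []
    for target, expr in stmts:
        if isinstance(expr, tuple):
            op, lhs, rhs = expr
            lhs = known.get(lhs, lhs)
            rhs = known.get(rhs, rhs)
            if isinstance(lhs, int) and isinstance(rhs, int):
                expr = OPS[op](lhs, rhs)  # both operands known: fold
            else:
                expr = (op, lhs, rhs)
        else:
            expr = known.get(expr, expr)
        if isinstance(expr, int):
            known[target] = expr
        else:
            known.pop(target, None)  # target no longer holds a known constant
        out.append((target, expr))
    return out


program = [("a", 0), ("b", ("+", "a", 1)), ("c", ("*", 2, "b"))]
print(propagate_constants(program))
```

The sketch handles only a single basic block; propagation across branches and loops requires the data-flow and control-flow analysis mentioned earlier.
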
Sub-expression elimination
- Logic expressions:
  - Performed by logic optimization
  - Kernel-based methods
- Arithmetic expressions:
  - Search isomorphic patterns in the parse trees
  - Example:
    - Before: a = x + y; b = a + 1; c = x + y;
    - After: a = x + y; b = a + 1; c = a;

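For arithmetic expressions in straight-line code, the search for repeated patterns can be approximated with a table of available expressions. This is a simplified sketch, not the method used by any particular tool: statements are assumed to be pairs (target, expression) with expressions written as tuples (operator, left, right), and the function name is hypothetical.

```python
def eliminate_common_subexpressions(stmts):
    """Reuse a variable that already holds the value of an identical expression."""
    available = {}  # expression -> variable currently holding its value
    out = []
    for target, expr in stmts:
        if isinstance(expr, tuple) and expr in available:
            expr = available[expr]
        # Assigning to target invalidates expressions that read it or are held in it.
        available = {
            e: v for e, v in available.items()
            if v != target and target not in e[1:]
        }
        out.append((target, expr))
        if isinstance(expr, tuple) and target not in expr[1:]:
            available[expr] = target
    return out


program = [("a", ("+", "x", "y")), ("b", ("+", "a", 1)), ("c", ("+", "x", "y"))]
print(eliminate_common_subexpressions(program))
```

The sketch matches expressions only when they are written identically; it does not recognize x + y and y + x as the same pattern.
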
Examples of other transformations
- Dead-code elimination:
  - a = x; b = x + 1; c = 2 * x;
  - a = x; can be removed if a is not referenced
- Operator-strength reduction:
  - Before: a = x^2; b = 3 * x;
  - After: a = x * x; t = x << 1; b = x + t;
- Code motion:
  - Before: for (i = 1; i < 100; i++) { data[i] = 3 * x * y * input[i]; }
  - After: t = 3 * x * y; for (i = 1; i < 100; i++) { data[i] = t * input[i]; }

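Both operator-strength reduction and code motion must preserve the computed values. The following Python sketch checks this on small integer inputs; the function names are hypothetical and the code is an illustration rather than a synthesis algorithm.

```python
def scale_naive(inputs, x, y):
    """Recompute the loop-invariant product 3 * x * y in every iteration."""
    return [3 * x * y * value for value in inputs]


def scale_hoisted(inputs, x, y):
    """Code motion: compute the invariant once, outside the loop."""
    t = 3 * x * y
    return [t * value for value in inputs]


def triple_reduced(x):
    """Operator-strength reduction for integers: 3 * x becomes x + (x << 1)."""
    t = x << 1
    return x + t


print(scale_naive([1, 2, 3], 2, 5) == scale_hoisted([1, 2, 3], 2, 5))
print(all(triple_reduced(x) == 3 * x for x in range(-8, 9)))
```

In hardware terms, the shift costs only wiring and the addition replaces a multiplier, while hoisting the invariant removes two multiplications from every loop iteration.
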
Control-flow based transformations
- Model expansion
- Conditional expansion
- Loop expansion

Module 2
- Objectives
  - Architectural optimization
  - Scheduling, resource sharing, estimation

Architectural synthesis and optimization
- Synthesize macroscopic structure in terms of building-blocks
- Explore area/performance trade-off:
  - Maximize performance subject to area constraints
  - Minimize area subject to performance constraints
- Determine an optimal implementation
- Create logic model for data-path and control

Design space and objectives
- Design space:
  - Set of all feasible implementations
- Implementation parameters:
  - Area
  - Performance:
    - Cycle-time
    - Latency
    - Throughput (for pipelined implementations)
  - Power consumption

Design evaluation space
[Figure: design evaluation space with axes area, latency and cycle-time; bounds marked Area Max and Latency Max]

Hardware modeling
- Circuit behavior:
  - Sequencing graphs
- Building blocks:
  - Resources
- Constraints:
  - Timing and resource usage

Resources
- Functional resources:
  - Perform operations on data
  - Example: arithmetic and logic blocks
- Storage resources:
  - Store data
  - Example: memory and registers
- Interface resources:
  - Example: busses and ports

Resources and circuit families
- Resource-dominated circuits
  - Area and performance depend on few, well-characterized blocks
  - Example: DSP circuits
- Non resource-dominated circuits
  - Area and performance are strongly influenced by sparse logic, control and wiring
  - Example: some ASIC circuits

Implementation constraints
- Timing constraints:
  - Cycle-time
  - Latency of a set of operations
  - Time spacing between operation pairs
- Resource constraints:
  - Resource usage (or allocation)
  - Partial binding

Synthesis in the temporal domain
- Scheduling:
  - Associate a start-time with each operation
  - Determine latency and parallelism of the implementation
- Scheduled sequencing graph:
  - Sequencing graph with start-time annotation

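The simplest way to associate a start time with each operation is to start every operation as soon as its predecessors finish. The sketch below assumes unit-delay operations and a sequencing graph given as a map from each operation to its predecessors. The graph `PREDS` is one possible decomposition of the differential equation solver loop body (m1 = 3 * x, m2 = u * dx, m3 = m1 * m2, m4 = 3 * y, m5 = m4 * dx, s1 = u - m3, ul = s1 - m5, yl = y + m2, xl = x + dx, c = x < a); the operation names and this decomposition are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Operation -> list of predecessor operations (assumed decomposition).
PREDS = {
    "m1": [], "m2": [], "m4": [], "xl": [], "c": [],
    "m3": ["m1", "m2"], "m5": ["m4"], "yl": ["m2"],
    "s1": ["m3"], "ul": ["s1", "m5"],
}


def asap_schedule(preds, delay=None):
    """Start each operation as soon as all predecessors finish (first step is 1)."""
    delay = delay or {}  # operation -> execution delay, default 1 cycle
    start = {}
    for op in TopologicalSorter(preds).static_order():
        ready = [start[p] + delay.get(p, 1) for p in preds.get(op, ())]
        start[op] = max(ready, default=1)
    return start


def latency(start, delay=None):
    """Cycles between the earliest start and the latest finish."""
    delay = delay or {}
    finish = max(t + delay.get(op, 1) for op, t in start.items())
    return finish - min(start.values())


schedule = asap_schedule(PREDS)
print(schedule["ul"], latency(schedule))
```

This schedule ignores resource limits: three multiplications start in the first step, so it needs three multipliers. Schedules that respect a resource bound generally have a longer latency.
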
Example
[Figure: scheduled sequencing graph with operations 1 to 11 (types *, +, -, <) between source and sink NOP vertices, assigned to time steps 1 to 4]

Example 2
[Figure: a second scheduled sequencing graph for the same operations 1 to 11, assigned to time steps 1 to 4]

Synthesis in the spatial domain
- Binding:
  - Associate each operation with a resource of the same type
  - Determine the area of the implementation
- Sharing:
  - Bind a resource to more than one operation
  - Operations must not execute concurrently
- Bound sequencing graph:
  - Sequencing graph with resource annotation

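Given a schedule, a binding with sharing can be obtained by assigning each operation to the first resource instance of its type that is idle at its start time. The following Python sketch is a simple greedy heuristic shown for illustration only; the schedule, the operation names, and the type names "mul" and "alu" are assumptions, and resources are labeled (type, instance).

```python
# Assumed schedule (operation -> start time) and operation types.
START = {"m1": 1, "m2": 1, "m4": 1, "xl": 1, "c": 1,
         "m3": 2, "m5": 2, "yl": 2, "s1": 3, "ul": 4}
TYPES = {op: ("mul" if op.startswith("m") else "alu") for op in START}


def bind(start, op_type, delay=None):
    """Greedy binding: reuse an idle instance of the right type, else add one."""
    delay = delay or {}  # operation -> execution delay, default 1 cycle
    free_at = {}         # type -> list of times at which each instance is idle
    binding = {}
    for op in sorted(start, key=lambda o: (start[o], o)):
        instances = free_at.setdefault(op_type[op], [])
        for index, free_time in enumerate(instances):
            if free_time <= start[op]:
                break  # this instance is idle: share it
        else:
            instances.append(0)  # no idle instance: allocate a new one
            index = len(instances) - 1
        instances[index] = start[op] + delay.get(op, 1)
        binding[op] = (op_type[op], index + 1)
    return binding


binding = bind(START, TYPES)
print(sorted(set(binding.values())))
```

For this schedule the sketch allocates three multipliers and two ALUs. Operations that share an instance never overlap in time, which is the condition for sharing stated above.
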
Example
[Figure: bound sequencing graph; the operations of the scheduled graph are annotated with resources (1, 1), (1, 2), (1, 3), (1, 4), (2, 1) and (2, 2)]

Estimation
- Resource-dominated circuits
  - Area = sum of the area of the resources bound to the operations
    - Determined by binding
  - Latency = start time of the sink operation (minus start time of the source operation)
    - Determined by scheduling
- Non resource-dominated circuits
  - Area also affected by:
    - Registers, steering logic, wiring and control
  - Cycle-time also affected by:
    - Steering logic, wiring and (possibly) control

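For resource-dominated circuits, both estimates reduce to simple arithmetic on the binding and the schedule. The sketch below illustrates this; the area values (5 for a multiplier, 1 for an ALU), the operation names and the function names are all assumptions chosen for the example.

```python
def estimate_area(binding, area_of):
    """Sum the area of each distinct resource instance used by the binding."""
    return sum(area_of[kind] for kind, _ in set(binding.values()))


def estimate_latency(start, source="source", sink="sink"):
    """Start time of the sink operation minus start time of the source operation."""
    return start[sink] - start[source]


# Hypothetical binding: two operations share multiplier 1.
binding = {"v1": ("mul", 1), "v2": ("mul", 2), "v3": ("mul", 1), "v4": ("alu", 1)}
# Hypothetical schedule; source and sink are zero-delay NOP operations.
start = {"source": 1, "v1": 1, "v2": 1, "v3": 2, "v4": 3, "sink": 4}

print(estimate_area(binding, {"mul": 5, "alu": 1}))
print(estimate_latency(start))
```

Sharing shows up directly in the area estimate: v1 and v3 use the same multiplier, so it is counted once. For non resource-dominated circuits this sum is only a lower bound, since registers, steering logic, wiring and control are not included.
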
Approaches to architectural optimization
- Multiple-criteria optimization problem:
  - Area, latency, cycle-time
- Determine Pareto optimal points:
  - Implementations such that no other implementation has inferior (lower) values for all parameters
- Draw trade-off curves:
  - Discontinuous and highly nonlinear

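Pareto optimal points can be extracted from a set of candidate implementations by discarding every point that another point dominates. The sketch below uses the common weak-dominance definition (no worse in every parameter and strictly better in at least one), which is slightly broader than the wording above; the candidate (area, latency) pairs are made up for illustration.

```python
def dominates(p, q):
    """True if p is no worse than q in every parameter and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Keep the points that no other point dominates (lower values are better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (area, latency) pairs for candidate implementations.
candidates = [(12, 4), (8, 6), (7, 7), (13, 4), (8, 7)]
print(pareto_front(candidates))
```

The same functions work unchanged for three parameters, for example (area, latency, cycle-time) triples.
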
Area-latency trade-off
- Rationale:
  - Cycle-time dictated by system constraints
- Resource-dominated circuits:
  - Area is determined by resource usage
- Approaches:
  - Schedule for minimum latency under resource usage constraints
  - Schedule for minimum resource usage under latency constraints
    - for varying cycle-time constraints

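Scheduling under resource usage constraints is often approached with list scheduling, a greedy heuristic that fills each time step with ready operations up to the allowed number of resources per type. The sketch below is a simplified variant, not an exact minimum-latency method: operations are assumed to have unit delay, the priority is plain topological order rather than an urgency measure, and the graph, operation names and type names are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Assumed sequencing graph: operation -> list of predecessor operations.
PREDS = {
    "m1": [], "m2": [], "m4": [], "xl": [], "c": [],
    "m3": ["m1", "m2"], "m5": ["m4"], "yl": ["m2"],
    "s1": ["m3"], "ul": ["s1", "m5"],
}
TYPES = {op: ("mul" if op.startswith("m") else "alu") for op in PREDS}


def list_schedule(preds, op_type, limits):
    """Greedy list scheduling with unit delays; limits maps type -> instance count."""
    order = list(TopologicalSorter(preds).static_order())
    start = {}
    step = 1
    while len(start) < len(order):
        used = {}  # type -> instances already busy in this step
        for op in order:
            if op in start:
                continue
            # Ready only if every predecessor started in an earlier step.
            ready = all(p in start and start[p] < step for p in preds.get(op, ()))
            kind = op_type[op]
            if ready and used.get(kind, 0) < limits[kind]:
                start[op] = step
                used[kind] = used.get(kind, 0) + 1
        step += 1
    return start


fast = list_schedule(PREDS, TYPES, {"mul": 3, "alu": 2})
small = list_schedule(PREDS, TYPES, {"mul": 1, "alu": 1})
print(max(fast.values()), max(small.values()))
```

Running the scheduler for several resource limits and recording the resulting latencies gives the points of an area-latency trade-off curve for a fixed cycle-time.
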
Area/latency trade-off
[Figure: area versus latency trade-off curves for cycle-times 30 and 40, with design points labeled (1, 1), (1, 2), (2, 1), (2, 2), (3, 1) and (3, 2)]

Summary
- Behavioral optimization:
  - Create abstract models from HDL models
  - Optimize models without considering implementation parameters
- Architectural synthesis and optimization:
  - Consider resource parameters
  - Multiple-criteria optimization problem:
    - Area, latency, cycle-time