The Microprocessor is no more General Purpose

- Slides: 39
The Microprocessor is no more General Purpose
Design Gap
Problems with Fine Grained Approach (FPGAs)
• Area inefficient
  – Percentage of chip area used for wiring is far too high
• Too slow
  – Unavoidable critical paths are too long
• Routing and placement are very complex
Problems with Fine Grained FPGAs
Coarse Grained Reconfigurable Computing
• Uses reconfigurable arrays with path widths greater than 1 bit
• More area-efficient
• Massive reduction in configuration memory and configuration time
• Drastic reduction in complexity of placement and routing
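The configuration-memory claim above can be illustrated with a toy bit count. This is only a rough sketch: the per-LUT and per-PE bit budgets (`lut_inputs`, `routing_bits_per_lut`, `opcode_bits`, `routing_bits_per_pe`) are assumed values chosen for illustration and are not taken from the slides or from any specific device.

```python
def fine_grained_config_bits(datapath_width, lut_inputs=4, routing_bits_per_lut=32):
    """Toy estimate: one LUT per datapath bit, each with its own truth table and routing.

    All default bit counts are illustrative assumptions.
    """
    truth_table_bits = 2 ** lut_inputs
    return datapath_width * (truth_table_bits + routing_bits_per_lut)


def coarse_grained_config_bits(opcode_bits=4, routing_bits_per_pe=8):
    """Toy estimate: one word-wide PE configured by an opcode plus word-level routing."""
    return opcode_bits + routing_bits_per_pe


if __name__ == "__main__":
    width = 16
    fine = fine_grained_config_bits(width)
    coarse = coarse_grained_config_bits()
    print(f"{width}-bit operator: fine-grained ~{fine} bits, coarse-grained ~{coarse} bits")
```

Under these assumptions the coarse-grained operator needs far fewer configuration bits because the whole word shares one opcode and one routing choice instead of repeating both for every bit.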
Coarse Grained Architectures Classification
• Mesh-based
• Linear Arrays based
• Cross-bar based
Mesh Based Architectures
• Arranges PEs in a 2-D array
• Encourages nearest neighbor links between adjacent PEs
• E.g. KressArray, Matrix, RAW, CHESS
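The nearest-neighbor idea can be sketched in a few lines. This is a minimal illustration assuming a plain 4-neighbor (north, south, west, east) mesh without wrap-around; the architectures named above each add their own longer links, which are not modeled here.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the PEs directly adjacent to (row, col) in a rows x cols mesh.

    Assumes 4-neighbor connectivity and no wrap-around at the array edges.
    """
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(mesh_neighbors(0, 0, 4, 4))  # corner PE: two neighbors
    print(mesh_neighbors(1, 1, 4, 4))  # interior PE: four neighbors
```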
Matrix – Mesh based Architecture
Matrix – Mesh Based Architecture
Architectures based on Linear Arrays
• Aimed at mapping pipelines onto linear arrays
• If the pipeline has forks, longer lines spanning the whole array or part of it are used
• E.g. RaPiD, PipeRench
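Mapping a fork-free pipeline onto a linear array can be sketched as one stage per PE with a register between neighbors. The function name `run_linear_pipeline` and the cycle-by-cycle model are assumptions made for illustration; forks and the longer spanning lines mentioned above are not modeled.

```python
_EMPTY = object()  # marks a pipeline register that holds no valid data


def run_linear_pipeline(stages, inputs):
    """Stream inputs through stages mapped one-per-PE on a linear array.

    Each cycle, every PE applies its stage to the value held by its left
    neighbor; the last PE's register is drained as output.
    """
    regs = [_EMPTY] * len(stages)
    outputs = []
    # Feed the real inputs, then enough empty slots to flush the pipeline.
    feed = list(inputs) + [_EMPTY] * len(stages)
    for item in feed:
        if regs[-1] is not _EMPTY:
            outputs.append(regs[-1])
        # Update right-to-left so each PE reads its neighbor's previous value.
        for i in range(len(stages) - 1, 0, -1):
            regs[i] = stages[i](regs[i - 1]) if regs[i - 1] is not _EMPTY else _EMPTY
        regs[0] = stages[0](item) if item is not _EMPTY else _EMPTY
    return outputs


if __name__ == "__main__":
    stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_linear_pipeline(stages, [1, 2, 3]))
```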
PipeRench – Linear Array based architecture
PipeRench – Linear Array Based Architecture
Cross-bar based Architectures
• Communication network is easy to route
• Uses restricted cross-bars with hierarchical interconnect to save area
• E.g. PADDI-1, PADDI-2, Pleiades
PADDI-2 – Cross-bar based architecture
PADDI-2 Cross-bar based Architecture
Coarse Grained Architectures
EGRA
• Architectural template to enable design space exploration
• Executes expressions as opposed to operations
• Supports heterogeneous cells and various memory interfaces
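The difference between executing an expression and executing a single operation can be sketched as follows. The tuple encoding, the `OPS` table, and the function names are hypothetical and only illustrate the idea of one cell evaluating a small expression tree; they do not describe the actual EGRA cell design.

```python
import operator

# Hypothetical operation set for a toy cell.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "shl": operator.lshift,
}


def evaluate(expr, inputs):
    """Evaluate an expression tree of the form (op, lhs, rhs) inside one cell.

    Leaves are input names (str) or integer constants.
    """
    if isinstance(expr, str):
        return inputs[expr]
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    return OPS[op](evaluate(lhs, inputs), evaluate(rhs, inputs))


def count_ops(expr):
    """Number of primitive operations folded into the expression."""
    if isinstance(expr, (str, int)):
        return 0
    _, lhs, rhs = expr
    return 1 + count_ops(lhs) + count_ops(rhs)


if __name__ == "__main__":
    expr = ("and", ("shl", ("add", "a", "b"), 2), "mask")
    print(evaluate(expr, {"a": 3, "b": 5, "mask": 0x3C}), count_ops(expr))
```

In an operation-per-cell array the three operations in this example would each occupy a separate cell; an expression-grained cell covers them in one configuration.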
EGRA
Evolution of fine grained and coarse grained architectures
EGRA – at Cell Level
Architectural Exploration
Architectural exploration
EGRA vs CGRA vs FPGA
EGRA – at array level
• Organized as a mesh of cells of three types
  – RACs
  – Memories
  – Multipliers
• Cells are connected using both nearest-neighbor links and horizontal-vertical buses
• Each cell has an I/O interface, a context memory, and a core
Control Unit
EGRA Operation
• DMA mode
  – Used to transfer data in bursts to the EGRA
  – To program cells and to read/write from scratchpad memories
• Execution mode
  – Control unit orchestrates data flow between cells
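The two operating modes can be sketched as a small state machine. Everything in this model (the `ToyEgra` class, its method names, and the chain-of-cells execution) is a hypothetical simplification used to show the separation between DMA transfers and execution; it is not the real control unit.

```python
from enum import Enum


class Mode(Enum):
    DMA = "dma"
    EXECUTION = "execution"


class ToyEgra:
    """Hypothetical model: cells are programmed and data moved only in DMA mode."""

    def __init__(self, n_cells, scratchpad_words):
        self.mode = Mode.DMA
        self.contexts = [lambda x: x] * n_cells  # per-cell configuration
        self.scratchpad = [0] * scratchpad_words

    def _require(self, mode):
        if self.mode is not mode:
            raise RuntimeError(f"operation requires {mode.name} mode")

    def program_cell(self, index, function):
        self._require(Mode.DMA)
        self.contexts[index] = function

    def dma_write(self, base, words):
        self._require(Mode.DMA)
        if base < 0 or base + len(words) > len(self.scratchpad):
            raise IndexError("burst exceeds scratchpad")
        self.scratchpad[base:base + len(words)] = words

    def dma_read(self, base, count):
        self._require(Mode.DMA)
        return self.scratchpad[base:base + count]

    def execute(self, src, dst, count):
        """Control unit streams each word through the cells in order."""
        self._require(Mode.EXECUTION)
        for i in range(count):
            value = self.scratchpad[src + i]
            for cell in self.contexts:
                value = cell(value)
            self.scratchpad[dst + i] = value


if __name__ == "__main__":
    egra = ToyEgra(n_cells=2, scratchpad_words=8)
    egra.program_cell(0, lambda x: x + 1)
    egra.program_cell(1, lambda x: x * 2)
    egra.dma_write(0, [1, 2, 3])
    egra.mode = Mode.EXECUTION
    egra.execute(src=0, dst=4, count=3)
    egra.mode = Mode.DMA
    print(egra.dma_read(4, 3))
```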
EGRA – at array level
Experimental Results
Experimental Results
Experimental Results
EGRA Memory Interface
• Data register at the output of computational cells
• Memory cells can be scattered around in the array
• A scratchpad memory outside the reconfigurable mesh
Architectural exploration - Area
Architectural exploration - Delay
MORA
The reconfigurable Cell
Operating modes of RC
Interconnection Topology
• Hierarchical
  – Level 1 is used within a 4 x 4 quadrant to provide nearest-neighbor connectivity
  – Interleaved horizontal and vertical connectivity of length two
  – Each RC can receive data from at most two other RCs and send data to at most four other RCs
  – Data and control transfer across quadrants is guaranteed over the Level 2 interconnection
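The fan-in and fan-out limits and the two interconnect levels can be checked with a short sketch. The limits (two sources, four destinations) and the 4 x 4 quadrant size come from the slide; the representation of RCs as `(row, col)` tuples and the function names are assumptions, and the rule that any cross-quadrant link uses Level 2 is a simplification.

```python
from collections import Counter

MAX_FAN_IN = 2      # an RC receives from at most two other RCs
MAX_FAN_OUT = 4     # an RC sends to at most four other RCs
QUADRANT_SIZE = 4   # Level 1 covers a 4 x 4 quadrant


def quadrant(rc):
    """Quadrant coordinates of an RC given as (row, col)."""
    row, col = rc
    return (row // QUADRANT_SIZE, col // QUADRANT_SIZE)


def link_level(src, dst):
    """Level 1 inside a quadrant, Level 2 across quadrants (simplified)."""
    return 1 if quadrant(src) == quadrant(dst) else 2


def check_links(links):
    """Return a list of fan-in / fan-out violations for (src, dst) links."""
    unique = set(links)
    fan_in = Counter(dst for _, dst in unique)
    fan_out = Counter(src for src, _ in unique)
    problems = []
    for rc, n in sorted(fan_in.items()):
        if n > MAX_FAN_IN:
            problems.append(f"RC {rc} receives from {n} RCs (limit {MAX_FAN_IN})")
    for rc, n in sorted(fan_out.items()):
        if n > MAX_FAN_OUT:
            problems.append(f"RC {rc} sends to {n} RCs (limit {MAX_FAN_OUT})")
    return problems


if __name__ == "__main__":
    links = [((0, 0), (0, 1)), ((1, 1), (0, 1)), ((0, 2), (0, 1))]
    print(check_links(links))
    print(link_level((3, 3), (4, 3)))
```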
Interconnection Topology
Computational Strategies
• Temporal computational load balancing
• Spatial computational load balancing