Parameterized SystemsonaChip Frank Vahid Tony Givargis Roman Lysecky
- Slides: 42
Parameterized Systems-on-a-Chip Frank Vahid Tony Givargis, Roman Lysecky, Leslie Tauro, Susan Cotterell Department of Computer Science and Engineering University of California, Riverside Supported by: NSF, NEC, DAC scholarship The Dalton Project
Outline • Introduction • Parameterized Systems-on-a-Chip • Exploring Parameter Configurations • Conclusions 2
Introduction • Advent of system-on-a-chip Microproc. IC Memory IC Microprocessor core (aka “IP”) IC Peripher. IC FPGA IC Peripheral core Board Introduction 3
System-on-a-chip (SOC) Introduction 4
The Productivity Gap [ITRS 99] 5
Programmable Platforms Microprocessor Cache Memory (ITRS 99) DMA Bridge FPGA System bus Peripheral bus Programmable Peripheral Platform • Pre-fabricated IC, synthesizable HDL, or both – “reference designs” (VLSI), “silicon platforms” (Philips), “fig chips” (Vahid/Givargis 99) Introduction 6
Targeted to Embedded Systems • May drive future architecture design [Patterson 98] • Varied power/performance/size constraints – Programmable platforms must adapt Introduction 7
Adapting platforms to constraints • One solution: Architectural Parameters Application 1 Microprocessor main() while (…) { Cache Memory DMA Bridge FPGA System bus Application 2 … main() … while(…) { …… } } Cache Peripheral bus Programmable Peripheral Platform Introduction 8
Related work • Microcontrollers • VLSI’s Velocity • Pleiades project [Rabaey 97] • Microprocessor + FPGA • Philips’ Y-Chart approach Architecture Applications Mapping Analysis Our focus Introduction Numbers 9
Outline • Introduction • Parameterized Systems-on-a-Chip • Exploring Parameter Configurations • Conclusions 10
Basic parameters -- cache Microprocessor Cache Memory DMA Bridge FPGA System bus Peripheral bus Programmable Peripheral Platform Parameterized Systems-on-a-chip 11
Basic parameters -- cache Tag • Line Size V T Index D V T Offset D • Associativity • Cache Size == == Mux Data Parameterized Systems-on-a-chip 12
Basic parameters -- bus Microprocessor Cache Memory DMA Bridge FPGA System bus Peripheral bus Programmable Peripheral Platform Parameterized Systems-on-a-chip 13
Basic parameters -- Bus C 1 Change Bus Width [Givargis 98] Bus Mux Demux C 2 C 1 > C 2 Parameterized Systems-on-a-chip 14
Basic parameters -- Bus Encoder Decoder Parameterized Systems-on-a-chip 0 1 0 1 1 0 0 1 1 Bus-Invert Encoding 1 0 0 1 1 0 Hamming Dist = 3 0 1 0 1 1 Hamming Dist = 6 Binary Encoding invert_ctrl Encode data to reduce switching (Bus Invert) [Stan 95] invert_ctrl 15
Parameter definitions • Parameter – An architectural feature that can be varied, with a small set of possible values, without changing the application’s essential functionality. • Configuration – A selection of a particular value for every architecture parameter • Static vs. dynamic parameter – Static: Value is set before fabricating the IC. – Dynamic: Value is set after fabricating the IC. Parameterized Systems-on-a-chip 16
Potential tradeoffs experiment Microprocessor I-cache D-cache Memory DMA [ICCAD 99] Bridge System bus Peripheral bus Parameters Possible values Size Peripheral 32 k, 16 k, 8 k, 4 k, 2 k, 1 k, 512, 256, 128 FPGA I-cache Line 8, 16, 32 Associativity 2, 4, 8 Size 32 k, 16 k, 8 k, 4 k, 2 k, 1 k, 512, 256, 128 D-cache Line 8, 16, 32 Associativity 2, 4, 8 Data bus width 4, 8, 16, 32 Mp-c bus Data bus invert on or off Data bus width 4, 8, 16, 32 Sys. bus Data bus invert on or off Parameterized Systems-on-a-chip 17
Potential tradeoffs experiment • Cache: Dinero [Edler, Hill] C Program Instr. Set Micro. Simulator processor [ICCAD 99] • ISS: [Tiwari 96] Cache Simulator Memory Simulator Power Bus simulator Total power Parameterized Systems-on-a-chip 18
Potential tradeoffs experiment • Computed power for all 45, 568 configurations – For each of four C applications – Used microprocessor, cache, and bus simulators (1 wk CPU) Tradeoff between performance and power • X-axis: execution time (sec) • Y-axis: power (watt) Parameterized Systems-on-a-chip 19
Potential tradeoffs experiment Bus: 32 -1/32 -0 I: 16 k, 4, 4 D: 16 k, 4, 4. 086 sec, 43. 6 W, 20 k. G Bus: 16 -1/32 -1 I: 16 k, 8, 16 D: 32 k, 8, 8. 389 sec, 11. 4 W, 21 k. G Bus: 8 -1/32 -1 I: 32 k, 8, 8 D: 16 k, 8, 16. 995 sec, 3. 4 W, 30 K Narrower bus required a larger cache size Parameterized Systems-on-a-chip 20
Potential tradeoffs experiment • Performance varied by 11 x • Power varied by 13 x • Area varied by 1 x • Energy consumption varied by 2 x Parameterized Systems-on-a-chip 21
Potential tradeoffs experiment Bus: 32 -1/32 -1 I: 1 k, 4, 4 D: 512, 4, 8 2 ms, . 19 W, 15 k. G Bus: 16 -1/32 -1 I: 1 k, 4, 4 D: 512, 4, 8 3 ms, . 07 W, 17 k. G Bus: 8 -1/4 -0, I: 1 k, 2, 4 D: 512, 2, 4 5 ms, . 02 W, 18 k. G Parameterized Systems-on-a-chip 22
Potential tradeoffs experiment • Performance varied by 2. 5 x • Power varied by 9. 5 x • Area varied by 1 x • Energy consumption varied by 4 x Parameterized Systems-on-a-chip 23
Potential tradeoffs experiment • How much variation in total system power and performance can we obtain just by varying the cache and bus parameters? – 9 to 14 x improvement in power/performance • How interdependent are these two types of parameters? – fixing cache param. values, then selecting bus param. values results in non-optimal solutions Parameterized Systems-on-a-chip 24
Many more parameters possible • Some examples include: – – – Code compression Address bus encoding Multiple levels of memory hierarchy CPU parameters (e. g. , voltage scale, DP width) Peripheral core parameters (our current focus) Fertile research area • Can yield even larger tradeoffs if we: – Create parameter-aware compiler – Adapt OS? Parameterized Systems-on-a-chip 25
Outline • Introduction • Parameterized Systems-on-a-Chip • Exploring Parameter Configurations • Conclusions 26
Evaluation by gate-level simulation Reconfigure Microprocessor Cache Memory DMA Bridge FPGA System bus Peripheral bus Programmable Peripheral Platform Total power HDL simulation HDL synthesis • Capture each core in HDL, synthesize, simulate • Hours (often tens) per configuration Exploring Parameter Configurations 27
Trace Generator Micro. Instr. Set processor Simulator Cache Simulator • Minutes-per-configuration • Contrast with hours-per-conf. Memory Simulator r Exploring Parameter Configurations Bridge Simulator P o we Total power Power Bus Peripheral simulatorbus DMA Simulator Peripheral Simulator OO models C Program Power Reconfigure Evaluation by system-level simulation 28
Evaluation by trace-simulation C Program Instr. trace Address trace Cache trace Simulator • Seconds-per-configuration Memory trace Simulator Exploring Parameter Configurations Po Total power DMA trace Simulator w e r Bus trace simulator Power Bus trace Power Reconfigure – Get traces from small # of system simulation Bridge trace Simulator Instr. traces Peripheral trace Simulator OO non-fct. models Instr. trace Simulator • Note that the cache simulator is non-functional • Same approach for others Trace Generator 29
Evaluation by trace-analysis C Program Instr. stats. Cache trace analyzer DMA trace analyzer Po w e r Exploring Parameter Configurations Power Bus trace simulator Total power • Milliseconds-per-configuration Memory trace analyzer Power Bus stats. Power Reconfigure Address stats. – statistically-characterize traces – Still only small # of system simulations Bridge trace analyzer Instr. stats. Peripheral trace analyzer Equations Instr. trace analyzer • Further speedup -- Trace Generator 30
Trace-analysis approach for cache • Given a trace of memory refs • Cache parameters • Size (S) • Line/block-size (L) • Associativity (A) • Compute # of misses (N) Size (S) Exploring Parameter Configurations 31
Trace-analysis approach for cache Exploring Parameter Configurations 32
Trace-analysis approach for cache • Capture improvements obtainable by: – changing line-size at small/large values of cache-size – changing associativity at small/large values of cache-size Exploring Parameter Configurations 33
Trace-analysis approach for bus capacitance Num transfers per item Random data Exploring Parameter Configurations Bus width Items/second 34
Trace-analysis approach for bus • Bus equation: • m items/second (denotes the traffic N on the bus) • n bits/item • k bit wide bus • bus-invert encoding • random data assumption Exploring Parameter Configurations 35
Trace-analysis experiments • Cache parameters – size: 128, 256, 512, 1 k, 2 k, 4 k, 8 k, 16 k, 32 k – assoc: 2, 4, 8 – line: 8, 16, 32 • Bus Parameters – width: 4, 8, 16, 32 CPU Bus A I-Cache D-Cache Memory Bridge – code: binary/bus-invert • Analyzed 45 K sets exhaustively for each of 4 examples. Exploring Parameter Configurations Bus B Peripheral Bus Peripheral 1 Peripheral 2 Peripheral n 36
Experiment Results • Diesel application’s performance • Blue (light-gray) is system-simulation-based • Red (dark-gray) is trace-analysis-based 4% error 320 x faster Exploring Parameter Configurations 37
Experiment Results • Diesel application’s energy consumption • Blue (light-gray) is obtained using full simulation • Red (dark-gray) is obtained using our equations 2% error 420 x faster Exploring Parameter Configurations 38
Experiment Results • CKey application’s performance • Blue (light-gray) is obtained using full simulation • Red (dark-gray) is obtained using our equations 8% error 125 x faster Exploring Parameter Configurations 39
Experiment Results • CKey application’s energy consumption • Blue (light-gray) is obtained using full simulation • Red (dark-gray) is obtained using our equations 3 % error 125 x faster Exploring Parameter Configurations 40
Experiment Results • 125 - 400 x speedup • 1 -18% absolute error (power & performance) Time (hours) Power Error (%) • 2% average power error Exploring Parameter Configurations 41
Conclusions • Parameters can improve usefulness of programmable platforms – by adapting platform to particular application and to power/performance constraints • Good tradeoff range even for basic parameters • Fast and accurate evaluation seems possible • Future research – Fast evaluation techniques for general cores – More parameters for SOC’s, static and dynamic – Couple with parameter-aware compilers – Dynamic re-configuration (adapt) 42
- Tony givargis
- Oracle apex parameterized report
- Parameterized abstract data types
- Raw use of parameterized class 'linkedhashmap'
- Difference between default and parameterized constructor
- Java thread states
- Naming encapsulations
- Parameterized stream manipulator
- Parameterized abstract data types
- Parameterized curve
- Parameterized abstract data types
- Dsp1
- Vahid kazemi md
- Vahid e5
- Steiner
- Dr vahid khajoee
- Vahid akhavan
- Cyrus vahid
- Vahid tabatabaee
- Vahid hejazi
- Www.pichakgroup.com
- Sun sülaləsi
- Vahid tabatabaee
- Frank william abagnale jr. young
- Empire
- Roman republic vs roman empire
- Ontwikkelactiviteiten
- Tony roma's halal atau tidak
- Tony and sue are launching yard darts
- Tony kusalik
- University of oxford webmail
- Tony margeta
- Tony a data analyst for a major casino
- Tony ludlow
- Every student succeeds act california
- Dieter appelt autoportrait au miroir
- Tony gillman
- Tony bridgwater
- Tony buzan biography
- Tony oursler eyes
- Tony morrissey
- Tony robbins franchise
- Tony robbins myers briggs