Power-Aware Microprocessors
Emily Chan
COMP 4211 Advanced Computer Architecture, 2/24/2021
Paper
- Yu Bai and R. Iris Bahar. A Dynamically Reconfigurable Mixed In-Order/Out-of-Order Issue Queue for Power-Aware Microprocessors.
Outline
- Introduction
- Focus of the paper
- Overview of Approaches Taken
- Related Work Done
- Implementations
- Experimental Results
- Conclusion
WHY?
Two Major Issues
- Battery life: mobile phones, laptops and any other portable equipment.
- Cooling package: when Pentium "N" comes out, you may have to keep it in a freezer.
What is the problem?
- Different applications may vary widely in:
  - Degree of instruction-level parallelism (ILP)
  - Branch behavior
  - Memory access behavior
- Datapath resources are not optimally utilized by all applications. HOWEVER, they are still consuming power!
How can we solve the problem?
- Golden Rule: A good design strategy should be flexible enough to dynamically reconfigure available resources according to the program's needs.
Focus of the paper
- "Reconfigurability" of the issue queue in out-of-order superscalar processors: the issue queue is a large source of the total power dissipation
- Believe it or not: for the Alpha 21264, 46% of the total power goes to the issue logic!
Overview of Approaches Taken
- Partition the issue queue into several sets (FIFOs). Why?
- Only instructions at the head of each FIFO are visible to the request and selection/arbitration logic. Why?
- Each FIFO issues in order, though the overall issue logic is still out-of-order. What are the benefits?
Related Work Done
- Hardware dynamically monitors performance, disabling part of the integer and/or floating-point pipelines
- Varying the instruction issue width to allow disabling of a cluster of functional units
- Dynamically reducing the number of active entries in the instruction window
Drawbacks
- No way to tell whether an instruction is ready to be issued or not, and all instructions are visible to the selection and wakeup logic: power inefficient
- Dynamically adjusting the issue queue size narrows the scope of instructions available for exposing ILP
Palacharla's approach
- Uses FIFOs as well
- Simplifies the wakeup and selection logic by putting chains of dependent instructions into FIFO buffers
- Issues instructions from multiple buffers in parallel
Palacharla's Drawbacks
- Uses a single fixed-size data structure: not always beneficial for different applications
- Why is the data structure such an important issue?
Performance Analysis
- Using a 1-entry FIFO configuration as the base case, on average:
  - 2-entry FIFO: 3% drop
  - 4-entry FIFO: 14% drop
  - 8-entry FIFO: 30% drop
  - 64-entry (a single FIFO): 84% drop
- For li, performance improves up to the 4-entry FIFO configuration, which effectively avoids executing wrong-path instructions
Implementations
- Scheme #1: completely disable some under-utilized FIFOs in the issue queue according to feedback from a (hardware) performance monitor
  - Pro: when a FIFO is completely disabled, any associated signals are disabled as well: more power savings
  - Con: shrinks the overall size of the issue queue, which limits exposure to potential ILP: not suitable for floating-point execution
Implementations
- Scheme #2: vary the number and size of the FIFOs simultaneously according to feedback from the performance monitor
  - The size of the FIFOs increases while the number of FIFOs decreases
  - Retains the same number of issue queue entries at all times, but the queue appears to be smaller
  - Pro: more flexibility in exposing potential ILP
  - Con: entries are only made invisible, so associated signals are still enabled: less power savings
Implementations
- When performance is suffering, a large fraction of the issue queue is turned back on (Scheme #1) or made visible (Scheme #2) to the request and selection logic
Pipeline Organization
- Up to 6 instructions each cycle
Two Major Components
- Issue queue
  - A set of reconfigurable FIFOs
  - Insert at the tail; issue from the head of a FIFO
  - Only heads of FIFOs are visible
- Hardware performance monitors
  - Determine the optimal issue queue configuration
  - Statistics are gathered over a fixed interval of cycles called a cycle window (1024 cycles)
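A minimal software sketch of the issue queue component may help make "insert at the tail, issue from the head, only heads visible" concrete. This is not the paper's hardware design: the class name `FifoIssueQueue`, its methods, and the fixed arbitration order (lowest FIFO index first) are assumptions made for illustration only.

```python
from collections import deque


class FifoIssueQueue:
    """Issue queue partitioned into FIFOs; only FIFO heads may request issue."""

    def __init__(self, num_fifos, fifo_depth):
        self.fifo_depth = fifo_depth
        self.fifos = [deque() for _ in range(num_fifos)]

    def insert(self, fifo_index, instr):
        """Append an instruction at the tail of one FIFO; False if it is full."""
        fifo = self.fifos[fifo_index]
        if len(fifo) >= self.fifo_depth:
            return False
        fifo.append(instr)
        return True

    def visible_heads(self):
        """Return (fifo_index, instr) for the head of every non-empty FIFO."""
        return [(i, f[0]) for i, f in enumerate(self.fifos) if f]

    def issue(self, is_ready, issue_width):
        """Issue up to issue_width ready heads, at most one per FIFO per call."""
        issued = []
        for i, head in self.visible_heads():
            if len(issued) == issue_width:
                break
            if is_ready(head):
                # Assumed arbitration: lowest-numbered FIFO wins first.
                issued.append(self.fifos[i].popleft())
        return issued


queue = FifoIssueQueue(num_fifos=2, fifo_depth=2)
queue.insert(0, "a")
queue.insert(0, "b")
queue.insert(1, "c")
ready = {"b", "c"}
# "b" is ready but sits behind "a", so only "c" can issue this cycle.
print(queue.issue(lambda instr: instr in ready, issue_width=6))
```

Each FIFO issues in order, but different FIFOs can issue in any order relative to each other, which is how the overall issue logic stays out-of-order.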
Issue Queue Design
- Scheme #1
Scheme #1 Design
- When under-utilized, disable a FIFO
- A FIFO must be drained of all valid entries before being disabled
- Reduces the number of instructions bidding for an issue slot: power saving in the wakeup and selection logic!
- No need to update the ready status of the disabled instruction entries: power saving!
Issue Queue Design
- Scheme #2
Scheme #2 Design
- Vary the size and number of FIFOs simultaneously
- Assumes no cycle overhead in changing from one configuration to another, since each instruction has a set of arbiter enable signals indicating its arbiter assignment
- Arbiter signals are disabled except for the heads of FIFOs: power saving!
- Power is saved only through reduced activity in the request and selection logic
Allocation of instructions into FIFOs
- It is important that most of the ready instructions are at the heads of FIFOs: use a dependency-based strategy
- Attempt to place an instruction in the same FIFO as one or both of its source dependencies
Dependency-based Strategy
- If ready: use a new empty FIFO; if no empty FIFO, then !!!
- If one pending operand: steer to the same FIFO as the producer if possible; if that fails, try a new empty FIFO; if no empty FIFO, then !!!!
Dependency-based Strategy
- If two pending operands: implement a Last Operand Predictor (LOP) to predict which of the two operands will become available later; try the late-arriving producer first; if that fails, try the other producer; if that fails again, try a new empty FIFO; if no empty FIFO, then !!!!
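The three steering cases (ready, one pending operand, two pending operands) can be sketched as a single function. This is an illustrative sketch, not the paper's implementation: the function name `steer`, its parameters, and the test for "if possible" (here simply "the producer's FIFO still has a free slot") are assumptions. The slides leave the no-empty-FIFO case as "!!!", so the sketch only reports it by returning `None`.

```python
def steer(pending, fifo_of, fifos, depth, lop_predicts_last=None):
    """Pick a FIFO index for a new instruction, or None if none is available.

    pending: producers of the operands still outstanding (0 to 2 items).
    fifo_of: maps a producer to the index of the FIFO holding it.
    fifos: list of lists, one per FIFO; depth: capacity of each FIFO.
    lop_predicts_last: with two pending operands, the producer the Last
        Operand Predictor expects to finish later (default: the second one).
    """
    def has_room(index):
        # Assumption: "if possible" means the producer's FIFO is not full.
        return index is not None and len(fifos[index]) < depth

    candidates = list(pending)
    if len(candidates) == 2:
        last = lop_predicts_last if lop_predicts_last in candidates else candidates[1]
        candidates.sort(key=lambda p: p != last)  # predicted-last producer first

    for producer in candidates:
        index = fifo_of.get(producer)
        if has_room(index):
            return index

    for index, fifo in enumerate(fifos):
        if not fifo:  # fall back to a new empty FIFO
            return index
    return None  # no empty FIFO: the slides leave this case as "!!!"


fifos = [["p1"], ["p2", "x"], []]
fifo_of = {"p1": 0, "p2": 1}
print(steer([], fifo_of, fifos, depth=2))                  # ready instruction
print(steer(["p1"], fifo_of, fifos, depth=2))              # one pending operand
print(steer(["p1", "p2"], fifo_of, fifos, depth=2,
            lop_predicts_last="p2"))                       # two pending operands
```

A ready instruction always asks for an empty FIFO, since placing it behind an unfinished instruction would hide it from the request logic.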
Hardware Performance Monitors
- At the end of each cycle window, determine which operating mode to use next
- A combination of different monitoring techniques is used for better control
Monitoring Techniques
- Monitoring IPC
  - Low IPC: disable/hide part of the issue queue and enter low-power mode (LPM)
- Detecting variations in IPC
  - If issue and commit rates vary significantly, this indicates a high branch misprediction rate: decrease the number of FIFOs
Monitoring Techniques
- Performance degradation
  - If the drop in IPC between two cycle windows exceeds a threshold value: go back to a higher power mode
- Monitoring ready instructions
  - Too many stalls: increase the number of FIFOs
  - Very few stalls: decrease the number of FIFOs
Monitoring Techniques
- Issue queue usage
  - Low occupancy: reduce the number of FIFOs
- Non-critical instructions
  - If no instruction is placed behind a ready instruction by the time it is removed from the queue, it is a non-critical instruction
  - Delaying such a ready instruction won't hurt
  - Too many non-critical instructions: reduce the number of FIFOs
Power Estimations
- Extrapolated from available Alpha 21264 power estimates
- Different issue queue designs, but both use an out-of-order issuing scheme
- Assume issue logic = register file + register mapping + issue queue
- Issue queue = register scoreboard + request logic + arbiters
Power Estimations
- Estimates:
  - Arbitration logic: 60% of issue queue power
  - Request logic: 15% of issue queue power
  - Register scoreboard and the rest: remaining 25%
- Reminder: reducing the number of FIFOs reduces activity on the arbiter enable signals and on the request logic and signals: power savings!
Request Logic
- Only the request lines of the heads of FIFOs are enabled to be precharged!
- Use the FIFO_head signal to achieve this
- REQ_L is asserted iff FIFO_head is asserted
- Conventional out-of-order issue queue: precharges every request line each cycle!
- Execution assignment info (state_cond and Ex_cond) is updated no matter what: power is saved only by completely disabling the FIFO (Scheme #1)
Arbitration Logic
- Precharge only the grant lines of the heads of FIFOs
- Assume the power used in the arbitration logic is directly proportional to the number of active FIFOs: save more power by disabling all the grant lines associated with the unused issue slots
Register Scoreboard Logic
- Tracks data dependencies among instructions in the issue queue
- Information must be updated for each issue queue entry unless a FIFO is completely disabled: only Scheme #1 can achieve power savings here
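As a rough worked example of how the 60% / 15% / 25% split interacts with the two schemes, the sketch below computes relative issue queue power from the fraction of FIFOs that remain active. It is a first-order illustration only and does not reproduce the paper's power model: the function name and the linear scaling of every component are assumptions. In particular, the slides state proportionality only for the arbitration logic; treating the request logic the same way is a simplification.

```python
# Shares of issue queue power taken from the estimates on the slides.
ARBITRATION = 0.60
REQUEST = 0.15
SCOREBOARD_AND_REST = 0.25


def relative_issue_queue_power(active_fifos, total_fifos, scheme):
    """Issue queue power relative to the fully enabled queue (1.0 = no saving).

    Assumed model: linear scaling with the fraction of active FIFOs.
    Scheme 1 disables whole FIFOs, so every component scales.
    Scheme 2 only hides entries, so the scoreboard share stays fixed.
    """
    fraction = active_fifos / total_fifos
    if scheme == 1:
        return fraction * (ARBITRATION + REQUEST + SCOREBOARD_AND_REST)
    if scheme == 2:
        return fraction * (ARBITRATION + REQUEST) + SCOREBOARD_AND_REST
    raise ValueError("scheme must be 1 or 2")


# Half of the FIFOs active under each scheme.
print(relative_issue_queue_power(8, 16, scheme=1))
print(relative_issue_queue_power(32, 64, scheme=2))
```

Under these assumptions, halving the active FIFOs saves more under Scheme #1 than under Scheme #2, matching the Pro/Con trade-off stated for the two schemes.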
Experimental Methodology
- Uses SimpleScalar
- Original Register Update Unit (RUU) = instruction window + array of reservation stations + reorder buffer (ROB)
- RUU split into ROB and issue queue (IQ): more accurate modeling of current and next-generation processors
- ROB orders instructions according to their input dependencies before entering the queue
Complete Configuration
Specific Monitor Technique for Scheme #1
- Disable one FIFO when any of the following holds (ordered according to relative importance):
  - less than 1/4 of ready instructions are stalled;
  - less than 2/3 of the FIFOs are actually used on average;
  - more than 15% of dispatched instructions are non-critical;
  - current IQ occupancy rate is less than 1/4 of the average occupancy rate
Specific Monitor Technique for Scheme #1
- Enable one FIFO when any of the following holds (ordered according to relative importance):
  - current issue rate (IPCissue) drops by more than 10% compared to the last cycle window executed in FPM;
  - current IPCissue drops by more than 15% compared to the previous cycle window;
  - more than 1/3 of ready instructions are stalled
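The Scheme #1 thresholds can be read as a decision taken once per cycle window. The sketch below is one possible encoding, not the paper's monitor hardware: the `WindowStats` field names are hypothetical, and checking the enable conditions before the disable conditions is an assumption (the slides do not say which list wins when both match). FPM is used on the slides without definition; it is presumably the full-power counterpart of LPM.

```python
from dataclasses import dataclass, replace


@dataclass
class WindowStats:
    """Statistics for one 1024-cycle window (field names are hypothetical)."""
    ipc_issue: float           # issue rate in the current window
    prev_ipc_issue: float      # issue rate in the previous window
    fpm_ipc_issue: float       # issue rate in the last window run in FPM
    stalled_ready_frac: float  # fraction of ready instructions that stalled
    fifos_used_frac: float     # average fraction of FIFOs actually used
    non_critical_frac: float   # fraction of dispatched instructions that are non-critical
    occupancy: float           # current IQ occupancy
    avg_occupancy: float       # average IQ occupancy


def relative_drop(current, reference):
    """Fractional drop of current below reference (0.0 if reference is 0)."""
    return (reference - current) / reference if reference > 0 else 0.0


def scheme1_decision(s):
    """Return 'enable', 'disable' or 'hold' for the next cycle window."""
    # Assumption: protecting performance (enable) takes priority over saving power.
    if (relative_drop(s.ipc_issue, s.fpm_ipc_issue) > 0.10
            or relative_drop(s.ipc_issue, s.prev_ipc_issue) > 0.15
            or s.stalled_ready_frac > 1 / 3):
        return "enable"
    if (s.stalled_ready_frac < 1 / 4
            or s.fifos_used_frac < 2 / 3
            or s.non_critical_frac > 0.15
            or s.occupancy < s.avg_occupancy / 4):
        return "disable"
    return "hold"


steady = WindowStats(2.0, 2.0, 2.0, 0.30, 0.90, 0.05, 30.0, 32.0)
print(scheme1_decision(steady))
print(scheme1_decision(replace(steady, ipc_issue=1.6)))
print(scheme1_decision(replace(steady, stalled_ready_frac=0.10)))
```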
Results for Scheme #1
Comments on Scheme #1
- Only applied to integer benchmarks
- Does a reasonable job dynamically changing the 16 4-entry FIFOs
- Not as good for the non-FIFO (64 1-entry) scheme, but compress still gets a 75% power saving with only a 3.6% drop in performance
- Average best cases:
  - 16 4-entry FIFOs: 27.6% power saving with a 3.7% drop in performance
  - 64 1-entry FIFOs: 64.1% power saving but a 4.7% drop in performance (not as impressive)
Specific Monitor Techniques for Scheme #2
- Halve the number of FIFOs and double the size of each FIFO when any of the following holds (ordered according to relative importance):
  - (IPCissue - IPCcommit) > 1.0;
  - less than 3% of ready instructions are stalled;
  - IPCissue < 2.7 (threshold lowered by 0.2 for each successive reduction in the number of FIFOs);
  - current IQ occupancy rate < 20% of average;
  - (AVG_IPCissue - IPCissue) > 0.15 (threshold increased by 0.15 for each successive reduction in the number of FIFOs)
Specific Monitor Techniques for Scheme #2
- Double the number of FIFOs and halve the size of each FIFO when any of the following holds (ordered according to relative importance):
  - current IPCissue drops by > 8% compared to the last cycle window
  - current IPCissue drops by > 6% compared to the last cycle window in FPM
  - more than 15% of ready instructions are stalled
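The Scheme #2 rules, including the thresholds that shift with each successive reduction, can be sketched as a function that returns the next FIFO count. This is an illustrative reading of the slides, not the paper's implementation: the `Window` field names and the function name are hypothetical, the 64-entry total is inferred from the "64 1-entry" and "16 4-entry" configurations, and checking the "double" conditions first is an assumption.

```python
from dataclasses import dataclass, replace

TOTAL_ENTRIES = 64  # inferred from the 64 1-entry / 16 4-entry configurations


@dataclass
class Window:
    """Per-window statistics (field names are hypothetical)."""
    ipc_issue: float
    ipc_commit: float
    prev_ipc_issue: float      # IPCissue of the last cycle window
    fpm_ipc_issue: float       # IPCissue of the last cycle window in FPM
    avg_ipc_issue: float
    stalled_ready_frac: float
    occupancy: float
    avg_occupancy: float


def relative_drop(current, reference):
    """Fractional drop of current below reference (0.0 if reference is 0)."""
    return (reference - current) / reference if reference > 0 else 0.0


def scheme2_next_fifo_count(num_fifos, reductions, w):
    """Return the FIFO count for the next window.

    reductions: number of successive halvings already applied; it shifts
    two of the thresholds as described on the slide.
    """
    # Assumption: restoring performance is checked before saving power.
    if (relative_drop(w.ipc_issue, w.prev_ipc_issue) > 0.08
            or relative_drop(w.ipc_issue, w.fpm_ipc_issue) > 0.06
            or w.stalled_ready_frac > 0.15):
        return min(num_fifos * 2, TOTAL_ENTRIES)
    if (w.ipc_issue - w.ipc_commit > 1.0
            or w.stalled_ready_frac < 0.03
            or w.ipc_issue < 2.7 - 0.2 * reductions
            or w.occupancy < 0.20 * w.avg_occupancy
            or w.avg_ipc_issue - w.ipc_issue > 0.15 + 0.15 * reductions):
        return max(num_fifos // 2, 1)
    return num_fifos


busy = Window(3.0, 2.8, 3.0, 3.0, 3.0, 0.10, 30.0, 32.0)
slow = replace(busy, ipc_issue=2.6, ipc_commit=2.5, prev_ipc_issue=2.6,
               fpm_ipc_issue=2.6, avg_ipc_issue=2.6)
for count, reductions, window in [(64, 0, busy), (64, 0, slow), (32, 1, slow)]:
    new_count = scheme2_next_fifo_count(count, reductions, window)
    print(new_count, "FIFOs of", TOTAL_ENTRIES // new_count, "entries")
```

Because the number of FIFOs times the FIFO size stays at the total queue size, each halving of the FIFO count doubles the depth of every FIFO, as the slide describes.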
FIFO usage for Scheme #2
Comments on FIFO usage
- For several FP benchmarks (applu, apsi, mgrid and swim), the number of FIFOs can't be reduced: they need more flexibility in reordering instructions
- For most integer benchmarks, the number of FIFOs is cut at least in half for a significant portion of the running time
Results for Scheme #2
Comments on Scheme #2
- Easier to cut the number of FIFOs for integer benchmarks: saves at least 30% of the issue queue power
- Most FP benchmarks need 64 FIFOs for a large percentage of the running time, but Scheme #2 works reasonably well for some (fpppp, hydro2d and su2cor)
- Average: 27.3% power saving with only a 2.7% drop in performance
FINALLY!!!!!
- Programs vary in ILP
- Dynamically reconfigure the issue queue to save power
- Two approaches taken; Scheme #2 works more efficiently
- THANK YOU & BYE-BYE!
- Oops... ONE LAST THING...
References
- Yu Bai and R. Iris Bahar. A Dynamically Reconfigurable Mixed In-Order/Out-of-Order Issue Queue for Power-Aware Microprocessors.
- James A. Farrell and Timothy C. Fischer. Issue Logic for a 600-MHz Out-of-Order Execution Microprocessor.
- J. E. Smith. Advanced Computer Architecture 1 "Power Efficient Architecture" Lecture Notes.
- K. Wilcox and S. Manne. Alpha processors: A history of power issues and a look to the future.