Active Flash: Towards Energy-Efficient, In-Situ Data Analytics on Extreme-Scale Machines
Devesh Tiwari, Sudharshan S. Vazhkudai, Youngjae Kim, Xiaosong Ma, Simona Boboila, and Peter J. Desnoyers
Kilmo Choi (최길모), rlfah926@naver.com
Embedded System Lab.
Contents
- Background
- Problems and Challenges
- Active Flash Approach for In-situ Analysis
- Active Computation Feasibility
- Evaluation
- Active Flash Prototype Based on the OpenSSD Platform
- Conclusion
Background
- Scientific discovery is a two-step process: scientific simulation, followed by data analysis and visualization, leads to scientific discovery
Background
- Large-scale leadership computing applications produce big data
  - GTC produces ~30 TB of output data per hour at scale
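To put the GTC figure in perspective, the hourly volume can be converted into an average sustained output rate. The sketch below assumes decimal units (1 TB = 1000 GB), which the slides do not specify; the function name is illustrative only.

```python
def sustained_rate_gb_per_s(tb_per_hour: float) -> float:
    """Convert an hourly output volume in TB to an average rate in GB/s.

    Assumes decimal units (1 TB = 1000 GB); the slides do not state
    which convention is used.
    """
    seconds_per_hour = 3600
    return tb_per_hour * 1000 / seconds_per_hour


# ~30 TB/hour is the GTC figure quoted on the slide.
gtc_rate = sustained_rate_gb_per_s(30)
print(f"{gtc_rate:.1f} GB/s")  # prints "8.3 GB/s"
```

Under that assumption, the storage system has to absorb roughly 8 GB/s on average, before any analysis reads the data back.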
Problems and Challenges
- The offline approach suffers from both performance and energy inefficiencies
  - Redundant I/O (simulations write, analyses read)
  - Excessive data movement
  - Extra energy cost
- Energy efficiency will become the primary metric for system design, as compute power is expected to increase by 1000x in the next decade with only a 10x increase in the power envelope
- Using simulation nodes for data analysis is not acceptable
Active Flash Approach for In-situ Analysis
- SSDs are now being adopted in supercomputers (e.g. Tsubame, Gordon)
  - Higher I/O throughput and storage capability
- SSD controllers are becoming increasingly powerful
  - Multi-core, low-power processors
- Idle cycles are available at SSD controllers
- In-situ analysis
  - Analysis on in-transit output data, before it is written to the parallel file system (PFS)
  - Eliminates redundant I/O, but uses expensive compute nodes
Active Flash Approach for In-situ Analysis
- Active Flash
  - In-situ analysis on SSDs
  - Exploits idle cycles of the SSD controller for computation
  - Reduces transfer costs
  - High performance and energy savings
Active Flash Approach for In-situ Analysis
- Three approaches to data analysis
  - Offline
  - Active Flash
  - Analysis node
Active Computation Feasibility
- Modeling SSD deployment
  - Multiple constraints
    - Capacity: enough SSDs to sustain an output burst
    - Performance: high I/O bandwidth to SSD space; fast restart from application checkpoints
    - Write durability: SSD write endurance limits
Active Computation Feasibility
  - Staging ratio
    - How many simulation nodes share one common SSD?
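The deployment constraints (capacity, performance, write durability) and the staging ratio can be tied together with a rough sizing rule: provision enough SSDs to satisfy the most demanding constraint, then divide the node count by that number. This is a minimal sketch under stated assumptions, not the model from the paper; every field name and every number in the usage example is a hypothetical placeholder.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    nodes: int               # simulation nodes (hypothetical)
    burst_gb: float          # data written per output burst, whole machine
    burst_window_s: float    # time allowed to absorb one burst
    daily_write_gb: float    # total data written per day


@dataclass
class Ssd:
    capacity_gb: float
    write_bw_gb_s: float
    endurance_gb: float      # total writes the device tolerates
    lifetime_days: float     # desired service life


def ssds_required(w: Workload, s: Ssd) -> int:
    """Smallest SSD count that meets all three constraints at once."""
    by_capacity = w.burst_gb / s.capacity_gb
    by_bandwidth = w.burst_gb / (s.write_bw_gb_s * w.burst_window_s)
    by_endurance = w.daily_write_gb * s.lifetime_days / s.endurance_gb
    return max(1, math.ceil(max(by_capacity, by_bandwidth, by_endurance)))


def staging_ratio(w: Workload, s: Ssd) -> float:
    """Average number of simulation nodes sharing one SSD."""
    return w.nodes / ssds_required(w, s)


# Hypothetical numbers, chosen only to exercise the functions.
workload = Workload(nodes=1000, burst_gb=10000, burst_window_s=100,
                    daily_write_gb=240000)
ssd = Ssd(capacity_gb=250, write_bw_gb_s=0.4,
          endurance_gb=500000, lifetime_days=1000)
print(ssds_required(workload, ssd), round(staging_ratio(workload, ssd), 2))
```

With these placeholder values the endurance constraint dominates, which illustrates why write durability appears alongside capacity and bandwidth in the list of constraints.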
Active Computation Feasibility
- Modeling active computation feasibility
  - Less compute-intensive kernels (e.g. regex matching) are better suited for active computation
  - Feasibility depends on multiple factors: simulation data production rate, staging ratio, I/O bandwidth, etc.
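One way to read the feasibility condition is that a kernel suits active computation if the SSD controller can ingest and analyze one output step before the next one arrives. The sketch below is an illustration under simplifying assumptions (ingest and analysis run back to back, and all nodes sharing the SSD produce equal amounts of data); it is not the paper's model, and all names and numbers are hypothetical.

```python
def is_feasible(data_per_node_gb: float,
                staging_ratio: int,
                output_interval_s: float,
                ssd_write_bw_gb_s: float,
                kernel_throughput_gb_s: float) -> bool:
    """Return True if one SSD can absorb and analyze an output step in time.

    Assumption: the controller first ingests the data, then analyzes it,
    and both must complete within the interval between output steps.
    """
    data_on_ssd_gb = data_per_node_gb * staging_ratio
    ingest_time_s = data_on_ssd_gb / ssd_write_bw_gb_s
    analysis_time_s = data_on_ssd_gb / kernel_throughput_gb_s
    return ingest_time_s + analysis_time_s <= output_interval_s


# Hypothetical light kernel: 50 s ingest + 200 s analysis fits in 300 s.
light = is_feasible(2, 10, 300, 0.4, 0.1)

# Hypothetical heavier kernel at half the throughput: 50 s + 400 s does not fit.
heavy = is_feasible(2, 10, 300, 0.4, 0.05)

print(light, heavy)
```

The same check shows how the factors interact: raising the staging ratio or the data production rate increases both terms, so a kernel that fits at one configuration may stop fitting at another.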
Evaluation
- Cray XT5 Jaguar supercomputer
- Samsung PM830 SSD
- Intel Core i7 processors
Evaluation
- Feasibility of the analysis node approach
  - Most data analysis kernels can be placed on SSD controllers without degrading simulation performance
  - Additional SSDs are not required to support in-situ data analysis on SSDs
  - The analysis node approach is feasible at higher staging ratios, but at additional infrastructure cost
Evaluation
- Energy and cost saving analysis
  - Staging ratio = 10
  - Active Flash and offline approaches: y1; analysis node approach: y2
  - The offline model consumes more energy due to I/O wait time
Conclusion
- Extant approaches to scientific data analysis (e.g. offline and analysis nodes) are stymied by several inefficiencies in data movement and energy consumption that result in sub-optimal performance
- Active Flash is better than either approach on all of the aforementioned metrics