FroNtier Stress Tests at Tier-0: Status Report


Luis Ramos, LCG 3D Workshop, September 13, 2006

Outline
1. Test Plan
2. Test Setup
3. Main Results
4. Conclusions

Objectives
• Develop a benchmark for Frontier servers
  – DB schema independent
• Build a tool that identifies performance bottlenecks of a given setup
• Performance analysis of the software stack
  – CORAL / Frontier plugin
  – Frontier Client
  – Squid
  – Frontier Servlet

Test Plan
• How fast are the individual components?
  – Database
  – Application Server
  – Cache Server
  – Network
• Explore the performance impact of:
  – Different data (structure, content, size, storage type, compression)
  – Different caching policies
  – Different access methods
• How do DB throughput, network bandwidth, payload size, number of clients, and server CPU correlate?

Metrics and Parameters
• Metrics:
  – Individual and total throughput
  – Server errors
  – CPU consumption/load (clients, Frontier server, Squid server)
  – Memory usage and disk space needs
  – Network bandwidth usage
• Parameters:
  – # of client nodes
  – # of test clients
  – Payload sizes
  – Database structure and content
  – Caching policy

FroNtier Test Setup

Server Test Setup
• Hardware setup:
  – 1 server running Frontier & Squid:
    • Dual Intel Xeon CPU, 2.80 GHz
    • 2 GB RAM
    • 150 GB HD
    • Fast Ethernet (100 Mbps)
  – 1 backend Oracle Database 10gR2 (cooldev)
• Software setup:
  – FroNtier v3.3
  – Frontier Squid v1.0rc4

Client Test Setup
• Hardware:
  – Dedicated lxplus nodes
    • Dual Pentium III, 1 GHz
    • 500 MB RAM
    • 6 GB HD
    • Fast Ethernet (100 Mbps)
• Software:
  – CORAL_1_5_3
  – FrontierClient v2.5.1

The Test Client
• C++ CORAL/FrontierPlugin test
  – Queries the server constantly
  – Gathers results
  – Outputs measures
    • Until shutdown message received
• Python controller script
  – Starts a number of clients
    • Manages test client ramp-up
  – Gathers measures
  – Generates structured data for plotting
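The ramp-up handled by the controller script can be illustrated with a minimal sketch. This is not the actual controller: the function name `ramp_up_schedule` and its parameters (`step`, `interval_s`) are assumptions made for illustration, and the sketch only computes when each batch of clients would be started.

```python
def ramp_up_schedule(total_clients, step, interval_s):
    """Return (start_time_s, active_clients) pairs for a stepwise ramp-up.

    Hypothetical helper: starts `step` clients every `interval_s` seconds
    until `total_clients` are running.
    """
    if total_clients < 0 or step <= 0 or interval_s < 0:
        raise ValueError("total_clients >= 0, step > 0 and interval_s >= 0 required")
    schedule = []
    active = 0
    start_time = 0.0
    while active < total_clients:
        # Never start more clients than requested in total.
        active = min(active + step, total_clients)
        schedule.append((start_time, active))
        start_time += interval_s
    return schedule


if __name__ == "__main__":
    # Example values only: ramp to 150 clients in batches of 50, one batch per minute.
    for start_time, active in ramp_up_schedule(150, 50, 60.0):
        print(f"t={start_time:6.1f}s  active clients={active}")
```

A real controller would launch a test client process at each step and collect its measures; the schedule is kept separate here so the ramp-up logic can be checked on its own.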

Test Cases
• Network analysis
• Throughput analysis
  – CORAL Oracle Plugin
  – CORAL Frontier Plugin
    • Directly to FroNtier
    • Through Squid

Network Analysis
• Test tool that checks network performance between multiple given client nodes and a single server
  – Generates TCP/IP traffic between clients and server
  – Each client shows its individual throughput
• Done using the netcat utility
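The tests themselves used netcat. As a stand-in that shows the same idea (push raw TCP traffic at a server and time it), here is a rough sketch using Python's standard `socket` module over the loopback interface. The function name `measure_throughput` and its default sizes are assumptions, and loopback figures say nothing about a real 100 Mbps link.

```python
import socket
import threading
import time


def measure_throughput(payload_size=1_000_000, chunk_size=65536):
    """Send payload_size bytes over loopback TCP; return (bytes_received, MB/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def sink():
        # Server side: accept one connection and count the bytes until EOF.
        conn, _ = server.accept()
        total = 0
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:
                    break
                total += len(data)
        received.append(total)

    receiver = threading.Thread(target=sink)
    receiver.start()

    chunk = b"\0" * chunk_size
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < payload_size:
            n = min(chunk_size, payload_size - sent)
            client.sendall(chunk[:n])
            sent += n
    receiver.join()  # wait until the server has seen EOF
    elapsed = time.perf_counter() - start
    server.close()
    return received[0], received[0] / elapsed / 1e6


if __name__ == "__main__":
    nbytes, rate = measure_throughput()
    print(f"received {nbytes} bytes at {rate:.1f} MB/s")
```

Running one such sender per client node against a single receiver would give the per-client throughput the slide describes.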

Throughput analysis: Frontier Server
• Up to 150 clients running against a single server (direct FroNtier server access, no Squid involved)
• Plot: old version of FroNtier -> no compression!
• Plot: new Frontier version with compression

Throughput analysis: Oracle, FroNtier and Squid
• Oracle vs Frontier Server vs Squid cache hits

Throughput analysis: Notes on previous plots
• Direct Frontier access
  – NOZIP version: 3 MBps (bottleneck is the database)
  – ZIP version: 0.3 MBps (bottleneck is the server CPU)
  – ZIP version can be 10 times slower than the NOZIP version
    • Production setup with 3 FroNtier nodes will perform better!
• Squid access
  – NOZIP version: 8 MBps
  – ZIP version: 14 MBps
    • User-perceived throughput can be higher than the network throughput (due to compression)
  – ZIP version can be 2 times faster than the NOZIP version
• Oracle access: 1.34 MBps
  – First guess: it should be faster than FroNtier direct access in any case!
  – Second thought: each client repeatedly creates DB connections, which is quite heavy for the OraclePlugin and less so for Frontier, because the Frontier servlet reuses connections
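The "10 times slower" and "2 times faster" statements follow from the quoted throughput figures. A small sketch makes the arithmetic explicit; the dictionary layout and the function name `zip_speedup` are illustrative choices, while the numbers are the ones listed above.

```python
# Throughput figures quoted on the slide, in MBps.
THROUGHPUT_MBPS = {
    "frontier": {"nozip": 3.0, "zip": 0.3},
    "squid": {"nozip": 8.0, "zip": 14.0},
}


def zip_speedup(path):
    """Return ZIP throughput divided by NOZIP throughput for one access path."""
    figures = THROUGHPUT_MBPS[path]
    return figures["zip"] / figures["nozip"]


if __name__ == "__main__":
    for path in THROUGHPUT_MBPS:
        print(f"{path}: ZIP/NOZIP throughput ratio = {zip_speedup(path):.2f}")
    # prints:
    # frontier: ZIP/NOZIP throughput ratio = 0.10
    # squid: ZIP/NOZIP throughput ratio = 1.75
```

The Squid ratio of 1.75 is what the slide rounds to "2 times faster".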

Throughput analysis: Some predictions
• CMS predicted real-world access pattern
  – 10% direct FroNtier access
  – 90% Squid access
• Factors (from previous slide)
  – SQUID_ZIP_time = SQUID_NOZIP_time / 2
  – FRT_ZIP_time = FRT_NOZIP_time * 10
• Some calculations:
  – Real NOZIP query time = 181% of the ZIP query time
• Prediction test:
  – Tests run from FNAL nodes will produce new data in different network conditions
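One way to combine the access pattern with the two factors is a weighted average of per-query times. The sketch below is an assumption about how such a calculation could be set up, not the method behind the 181% figure: the slide does not state the base query times it used, so `t_frt_nozip` and `t_squid_nozip` are placeholder inputs and the function name `mixed_query_times` is hypothetical.

```python
def mixed_query_times(t_frt_nozip, t_squid_nozip, direct_fraction=0.10):
    """Return (nozip_time, zip_time) for a mix of direct and Squid accesses.

    Assumed model: average query time is the access-weighted sum of the
    direct FroNtier time and the Squid time. Factors are taken from the slide:
      FRT_ZIP_time   = FRT_NOZIP_time * 10
      SQUID_ZIP_time = SQUID_NOZIP_time / 2
    """
    squid_fraction = 1.0 - direct_fraction
    t_frt_zip = t_frt_nozip * 10
    t_squid_zip = t_squid_nozip / 2
    nozip_time = direct_fraction * t_frt_nozip + squid_fraction * t_squid_nozip
    zip_time = direct_fraction * t_frt_zip + squid_fraction * t_squid_zip
    return nozip_time, zip_time


if __name__ == "__main__":
    # Placeholder base times (arbitrary units), not measured values.
    nozip_time, zip_time = mixed_query_times(t_frt_nozip=1.0, t_squid_nozip=1.0)
    print(f"NOZIP/ZIP query time ratio: {nozip_time / zip_time:.2f}")
```

Under this model the ratio depends strongly on how the direct and Squid base times compare, which is one reason measurements under other network conditions are useful.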

Future work
• Run multi-client throughput tests:
  – Using experiment DB content
  – Using COOL-generated queries
  – Changing the ratio of cached queries
  – From FNAL
    • to measure the impact of a poorer network connection
  – Analyzing the Frontier server error rate
    • with the new Frontier v3.3

Conclusions
• FroNtier is ready for production
• Some performance indicators were obtained
  – More realistic performance indicators should now be obtained from the production setup
• Test scripts developed
  – Next step: make the scripts easily reusable by others (Richard Hansen is running the test suite for ATLAS)
• Tests will continue!

Questions? Ideas?