Load Testing UNIX Systems
Peter Harding
plh@pha.com.au
www.pha.com.au


System & Load Testing
Two main circumstances in which you might want to test:
– Sizing new hardware (system acquisition)
– Deploying a new application (new or upgraded software, e.g. Y2K)

Two Types of Tests
In either situation there are two types of test we are interested in:
– System testing:
  • Backups
  • End of Day/End of Month
– Load testing:
  • Capacity – throughput/response time
  • Scalability

A Load Testing Methodology
• Typically we want to run a system up with a client's application and databases in a hardware configuration reflecting the client's actual production infrastructure.
• We want to find out such things as maximum throughput and expected response times under a designated load.
• We may want to validate modelled predictions for performance (and capacity).

Sample RTE Testing Architectures
1) Monolithic (development of tests)
2) Two tier (monolithic application environment)
3) Multi-tier (client-server, PeopleSoft, SAP)

Environments
• ASCII (telnet, rlogin, ssh)
• Windows
• X Windows (X11, KDE, Gnome, …)
• Web
  – https (SSL)
• Other protocols
  – ftp, nntp
  – Tuxedo
  – Corba
  – SQL*Net
• Java
  – See: www.precise.com

Commercial Tools
• Mercury Interactive (www.mercuryinteractive.com)
  – WinRunner
  – LoadRunner
• Rational (now owned by IBM)
  – Rational Robot
• Segue (www.seague.com)
  – SilkPerformer V
• CompuWare (www.compuware.com)
  – QALoad
• Others
  – Lots, especially in the web space

Single Tier (In Situ) Benchmark Architectures
[Diagram: System Under Test (SUT) + RTE System(s) (Client - Driver System)]

Two Tier Benchmark Architectures
[Diagram: RTE System(s) (Client - Driver System) and System Under Test (SUT), joined by TTY lines or a network connection (X.25 or Ethernet)]

Three Tier Benchmark Architecture
[Diagram: RTE System (Driver System), RTE System(s) (Client System), System Under Test (SUT)]

The General Approach
• Planning Phase
• Development Phase
• Testing Phase
• Review Phase
• Reporting Phase

Planning Phase
• Conversation with the client determines the scope of the exercise.
• Consultation with client resources is required to identify WHAT is to be tested, i.e. the TASKS.
• Further conversation will hopefully yield indications of the expected mix of activities (TASKS). This is often subject to wild speculation, but it allows us to specify the load scenarios from which we build the workload files.

Development Phase
• Familiarize yourself with the application environment (probably using a development environment for the SUT).
• Set up the test harness and RTE/data server libraries on the driver(s).
• Record basic navigation through the application.
• Record and test each identified TASK. Handling error conditions will require the most effort.

Test Phase
• Setup of the System Under Test (SUT) environment will be required about now.
• Set up the data to be fed to the robots during the actual tests.
• Build workload files.
• Run tests.
• Do analysis.
• Publish report.

The Basic Components
• Recorder (record)
• RTE Library (librte.so)
• Test Harness (sim.c et al.)
• Controller (start/regulate/review)
• Data Server (dserver)
• Workload setup (wcalc)
• Post-processing scripts (postproc.pl)

The Test Harness
• The test harness consists of a directory containing the C source code for the test framework and each of the robots.
• It also contains:
  – A Makefile (which links in librte.so and libdcl.so).
  – Various text files used to configure each run (.regrc, .rterc, Parameters, SeqNo, RTE, Offset, Host, etc.).
  – A directory (or symbolic link to one) in which the tree of log files and other test output is stored.
  – Symbolic links into the log directory of the current and last run. (These are set up by the start script.)

The Controller
• The controller is a C executable, regulate, which takes a large number of arguments, many of which are passed down into the individual robots.
• It is typically invoked using the start script, which crafts these arguments automagically from various config files in the working directory (SeqNo, Offset, RTE, Host, etc.).

Controller Operation
• On startup the controller (regulate) reads the .regrc file, which controls the rate of start-up.
• It establishes the shared memory and semaphores used for communication between the robots, regulate and review (the viewer).
• It enters a loop which fork/execs the child processes, overlaying each with the executable specified in 'RTE'.
• Once all child processes are initiated it falls through into a loop where it executes wait() until all children have terminated and been cleaned up.
• The controller can optionally restart children as they terminate. This is specified in the .regrc file.
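The fork/exec/wait loop described above can be sketched roughly as follows. This is an illustration in Python rather than the C of regulate itself; the function name run_robots and the start_interval parameter are hypothetical, and the shared memory, semaphores and restart option are left out.

```python
import os
import sys
import time


def run_robots(count, argv, start_interval=0.0):
    """Fork `count` children, exec `argv` in each, then wait for them all.

    Returns a dict mapping each child pid to its exit code (-1 if the
    child was killed by a signal). Hypothetical helper, not part of the
    toolkit described in the slides.
    """
    children = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: overlay this process with the robot executable.
            try:
                os.execvp(argv[0], argv)
            finally:
                # Only reached if the exec itself failed.
                os._exit(127)
        children.append(pid)
        time.sleep(start_interval)  # crude stand-in for the start-up rate

    # Parent: block until every child has terminated and been reaped.
    statuses = {}
    for pid in children:
        _, status = os.waitpid(pid, 0)
        statuses[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return statuses


if __name__ == "__main__":
    # Three stand-in "robots" that exit immediately.
    print(run_robots(3, [sys.executable, "-c", "pass"]))
```

The sketch waits on each child pid in turn instead of calling a bare wait(), so that it never reaps processes it did not start.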

Test Monitoring
• Tests are monitored by the review program, which reads the global shared memory segment to access the status information written by each running robot.
• review runs a ps command in the background to establish the pid of the instance of regulate it should monitor.
• If multiple tests are running, the pid of the controlling regulate to be monitored may be specified on the command line.

review
• The command line interface within review mimics that used by vi.
• A text help file is provided which summarises the commands available.
• review also provides access to the log, stdout, stderr and trace files of running robots.

The Data Server
• The data server uses System V IPC (queues) to pass data from a single server to each of the robots (each transfer is initiated by the robot).
• Data sources are CSV (text) files customized for the tests.
• Typically the CSV files will be crafted from the output of SQL run against the underlying database, but they may be hand crafted (say, by client personnel using Excel).

Data Server Operation
• Starting it up:
    [plh@deneb plh]$ dserver
    [dserver] Server working directory is /u/pha/plh/DATA
    Stale dserver pid file!
    [dserver] Server pid is 2227
    Used 0 MByte of malloc space
• Shutting it down:
    [plh@deneb plh]$ dserver -T
    [dserver] Terminating server with pid, 2227
    [plh@deneb plh]$ [shutdown] dserver shut down Wed Apr 16 15:28:49 2003
• On shutdown, unused data in memory is flushed to disk.
• A Makefile is used to reset the state of the data files from master copies matched to the databases of the system under test (SUT).

Data Server Configuration
• Data server files are contained in a directory which should be defined in the environment as $DSERVER_DIR.
• Configuration is contained in dserver.ini in that directory.
• Data is stored in .dat files, used data in .used files and the master copies in .master files.
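One way to picture the relationship between the .dat, .used and .master files is the sketch below. It is an assumption-laden illustration, not the dserver implementation: the class name DataPool and its methods are hypothetical, it works directly on files instead of serving records over System V message queues, and the exact semantics of the three file types are inferred from the slide.

```python
import csv
import shutil
import tempfile
from pathlib import Path


class DataPool:
    """Hands out CSV records once each, tracking which have been used."""

    def __init__(self, directory, name):
        base = Path(directory)
        self.dat = base / f"{name}.dat"        # records still available
        self.used = base / f"{name}.used"      # records already handed out
        self.master = base / f"{name}.master"  # pristine copy for resets

    def reset(self):
        """Restore the pool from the master copy (assumed reset behaviour)."""
        shutil.copyfile(self.master, self.dat)
        self.used.write_text("")

    def take(self):
        """Return the next record as a list of fields, or None if empty."""
        lines = self.dat.read_text().splitlines()
        if not lines:
            return None
        record, rest = lines[0], lines[1:]
        self.dat.write_text("".join(line + "\n" for line in rest))
        with self.used.open("a") as handle:
            handle.write(record + "\n")
        return next(csv.reader([record]))


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    Path(workdir, "Address.master").write_text("1 High St,Melbourne\n")
    pool = DataPool(workdir, "Address")
    pool.reset()
    print(pool.take())
```

Rewriting the .dat file on every call is simple but slow; a real server would presumably hold the records in memory and flush on shutdown, as the slides describe.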

dserver.ini
• Format of file is as follows:
    [plh@deneb DATA]$ cat dserver.ini
    [Data]
    Description=Address, CSV
    #Description=Name, CSV
    #Description=Note, CSV
    #Description=UniqNo, Sequence
    #Description=WhsItem, Keyed
    #Description=Keyed, Keyed
    #Description=Indexed, Indexed
    Description=CreditorId, CSV
    Description=CredActivity, CSV
    Description=ItemId, CSV
    Description=ItemQryTimes, CSV
    Description=ProdCode, CSV
    …
    Description=FinCredATB_Out, CSV
    Description=FinGLAcctInq_Out, CSV
    Description=FinBankRecSR_Out, CSV
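A file in this format can be read with a few lines of code. The sketch below is a minimal reader, assuming that each active Description line holds a data set name followed by a type and that lines starting with # are disabled entries; the function name parse_dserver_ini is hypothetical.

```python
def parse_dserver_ini(text):
    """Return [(name, type), ...] for active Description lines in [Data]."""
    entries = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out entry
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if section == "Data" and sep and key.strip() == "Description":
            name, _, kind = value.partition(",")
            entries.append((name.strip(), kind.strip()))
    return entries


if __name__ == "__main__":
    sample = "[Data]\nDescription=Address, CSV\n#Description=UniqNo, Sequence\n"
    print(parse_dserver_ini(sample))
```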

Workload Generation
• The workload calculator (wcalc) converts the various load scenarios into workload files, which completely choreograph each run.
• wcalc is a UNIX command line tool which takes a configuration file and produces a workload file (e.g. wcalc -c BuyerLookupPPCE).
• The workload file is read by each of the robots when they start, to extract the tasks to be performed and their start times.

Workload Definition File (e.g. Scen01_3600.wcd)
    #
    # An initial mean load workload definition
    #
    # Date: Tue Jun 4 12:41:32 EST 1996

    [Config]
    NoUsers=40
    Duration=3600 # seconds
    StaggerDelay=0 # seconds
    RestartTimer=FALSE

    [Threads]
    # Name, NoSessions, NoTx, GivenCycleTime
    Thread=BuyLookupPPCE, 16, 3200, 0, 120
    Thread=BuyLookupPPCE_PF6, 5, 3000, 0, 120
    Thread=BuyLookupPPCE_PF7, 5, 3000, 0, 120
    Thread=BuyCyberSADD, 5, 1800, 0, 120
    #
    # Very quick financial reports
    #
    Thread=FinAPCredEnq, 1, 600, 0, 120
    Thread=FinChqListing, 1, 240, 0, 120
    Thread=FinCredATB, 1, 600, 0, 120
    Thread=FinGLAcctInq, 1, 120, 0, 120
    Thread=FinGLTBF1, 1, 120, 0, 120
    Thread=FinMiscCRDR, 1, 120, 0, 120
    #
    # Long reports
    #
    Thread=FinCyberGLTR, 1, 10, 0, 120
    Thread=BuyBuyerPOR, 1, 10, 120
    Thread=BuyCyberPOR, 1, 20, 120
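A definition file like this is easy to sanity-check before a run. The sketch below parses the [Config] and [Threads] sections and confirms that the per-thread session counts add up to NoUsers (in the example above, 16 + 5 + 5 + 5 + 9 × 1 = 40). The function names are hypothetical, and only the first three Thread fields are interpreted, following the header comment in the file; any remaining fields are kept as-is.

```python
def parse_wcd(text):
    """Parse a workload definition into (config dict, list of threads)."""
    config, threads, section = {}, [], None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if section == "Config":
            config[key] = value
        elif section == "Threads" and key == "Thread":
            fields = [field.strip() for field in value.split(",")]
            threads.append({
                "name": fields[0],
                "sessions": int(fields[1]),
                "tx": int(fields[2]),
                "rest": [int(field) for field in fields[3:]],  # uninterpreted
            })
    return config, threads


def sessions_match(config, threads):
    """True when the thread session counts sum to NoUsers."""
    return sum(t["sessions"] for t in threads) == int(config["NoUsers"])
```

For the file on the slide the same parse also gives a total of 12840 transactions, which agrees with the total printed in the generated workload file.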

Workload File (e.g. Scen01_3600.wld)
    #
    # Generated: wcalc 2.0 - Thu Feb 6 10:08:58 2003
    #
    # Duration: 3600 seconds
    # No Sessions: 40
    # No Working: 40
    # Scale Factor: 1
    #
    # Thread               Sessions    TPH
    # ==================   ========  =====
    # BuyLookupPPCE              16   3200
    # BuyLookupPPCE_PF6           5   3000
    # BuyLookupPPCE_PF7           5   3000
    # BuyCyberSADD                5   1800
    # FinAPCredEnq                1    600
    # FinChqListing               1    240
    # FinCredATB                  1    600
    # FinGLAcctInq                1    120
    # FinGLTBF1                   1    120
    # FinMiscCRDR                 1    120
    # FinCyberGLTR                1     10
    # BuyerPOR                    1     10
    # BuyCyberPOR                 1     20
    #                      --------  -----
    #                            40  12840
    #
[The table also carries TPR, TPU, Delays, Start, Range and Delta columns, which are only partly legible on the slide. The legible Delays values run from 5.8 to 348.0 seconds, and the Range values from 3460 to 3480 seconds. The header is followed by the task schedule, whose first entries for BuyLookupPPCE show start times of 0000, 0017, 0034 and 0052 seconds.]
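The figures on this slide are consistent with tasks being spread evenly over a scheduling window. BuyLookupPPCE has 3200 transactions over 16 sessions, or 200 per session; dividing a 3480-second window by 200 gives 17.4 seconds, which matches the first Delays value and the schedule start times 0000, 0017, 0034, 0052. The sketch below reproduces that arithmetic. It is an inference from the numbers shown, not a description of how wcalc actually works, and the function name is hypothetical.

```python
def start_offsets(tasks_per_session, window_seconds):
    """Evenly spaced start times (whole seconds) across the window."""
    return [
        i * window_seconds // tasks_per_session
        for i in range(tasks_per_session)
    ]


if __name__ == "__main__":
    per_session = 3200 // 16          # 200 tasks for each session
    print(3480 / per_session)         # mean delay between task starts
    print(start_offsets(per_session, 3480)[:4])
```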

Outputs from Robots
Each robot produces four output files:
– Log file: NNNN.log
– Stdout: NNNN.stdout
– Stderr: NNNN.stderr
– Trace file: NNNN.trc

Log File Format
The log file encapsulates:
– Benchmark start and end times
– Task elapsed times
– Query response times

Log file sample
    **** S Mon Feb 17 16:10:51 2003 [1045458651] RTE version 3.2.12
    -------> I Identifier "99_40"
    -------> I SessionIdx 9 RangeOffset 0
    0.000 U Concurrency = 10
    0.000 U Display = 0
    0.000 U DelayMode = 1
    0.000 U UseCycleTimes = 0
    0.000 U UseStaggerDelay = 0
    0.000 U LoopMax = 1000
    0.000 U MaxWait = 240
    0.000 U FixedSequence = 1
    0.000 U SendRate = 0
    0.000 U StaggerDelay = 1
    0.000 U TimeoutRestart = 0
    0.000 U TimerRestartInterval = 300
    0.000 U Tracing = 0
    0.004 U **** "170203"
    0.004 U **** Fixed sequence initialization (Seed = 9)
    0.006 U Adding "BuyLookupPPCE" (idx 12) start 32 sec
    0.006 U Adding "BuyLookupPPCE" (idx 12) start 90 sec

Log file sample (Cont)
[Column alignment is not recoverable from the slide. Timestamps run from 0.012 to 83.923 seconds; entries are typed U, S (send) or M (match), and the send/match entries carry additional timing columns. The legible entries, in order:]
    U Workload file "Scen04_3600.wld" contains 60 threads.
    U Telnet to "192.168.1.1"
    S/M "t01" "^J"
    S/M "tst1ng" "^J"
    S/M "Ush^M"
    S/M "^[[M"
    S/M "FSMENU^M"
    U %Benchmark start at 51.9 sec
    U Idle...
    U [nextTask] currentThread 12 Task 0 sleeping with delay 27
    U checkRunning() CMD_RUN

Log file sample (Cont)
[Column alignment is not recoverable from the slide. Timestamps run from 83.923 to 142.743 seconds. The legible entries, in order:]
    U %Start BuyLookupPPCE at 32.0 sec
    S/M "^[[P"
    S/M "^[[T"
    S/M "^[[O"
    S/M Looking up "21820328"^M
    S/M "21820328" "^M"
    U %Query BuyLookupPPCE LookUp 1.4 sec 21820328, 1423.0
    S/M "^[[d"
    U %Finish BuyLookupPPCE (OK) at 47.8 Proc 15.8 Elap 15.8 CSD 0.0 CPD 14
    U [nextTask] currentThread 12 Task 1 sleeping with delay 43
    U checkRunning() CMD_RUN
    U %Start BuyLookupPPCE at 90.8 sec
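The %Query and %Finish markers are what the analysis needs from each log. The sketch below pulls query response times and task elapsed times out of log lines. The regular expressions are guesses based on the two marker lines visible on the slide, so the exact field layout is an assumption, and the function name extract_timings is hypothetical.

```python
import re

# Patterns inferred from the sample lines; the real format may differ.
QUERY_RE = re.compile(r"%Query\s+(\S+)\s+(\S+)\s+([\d.]+)\s+sec")
FINISH_RE = re.compile(
    r"%Finish\s+(\S+)\s+\((\w+)\)\s+at\s+[\d.]+.*?Elap\s+([\d.]+)"
)


def extract_timings(lines):
    """Collect (task, query, seconds) and (task, status, elapsed) tuples."""
    queries, tasks = [], []
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            task, query, seconds = match.groups()
            queries.append((task, query, float(seconds)))
            continue
        match = FINISH_RE.search(line)
        if match:
            task, status, elapsed = match.groups()
            tasks.append((task, status, float(elapsed)))
    return {"queries": queries, "tasks": tasks}
```

Using search() rather than match() means whatever timestamp and type columns precede the marker are simply ignored.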

Starting a Run
• Use the script start.sh to initiate a run:
    [plh@deneb REJECTSHOP]$ ./start.sh 20
    Log files will be written to directory results/192.168.69.34/104_20
    Starting run 104 on hostname 192.168.69.34 at rate 1
    Duration of run will be 1800 seconds
    Taking 60 samples at intervals of 30 seconds
    Running "regulate -h 192.168.69.34 -e sim -r 1 -W TstThread0.wld -w results/192.168.69.34/104_20/logs -i 104_20 -D 1800 -o 0 20"
    Regulate Process PID 2305
    Logfile Generation...
• Run the review executable (in the working directory).
• Wait until all robots have paused (P).
• Type ':S' (Start) to initiate the test.
• Terminate by typing ':E' (End): robots will complete their current task and then exit gracefully.
• Abort by typing ':X' (eXterminate): robots all terminate on signal.

Review Screen
Review is a curses application whose screen looks like this:
    [V3.2.12] Controlling PID: 2535 Run Duration: 1800 User: plh
    Started ramp up at: Wed Apr 16 16:04:54 2003 Elapsed: 00:25 -
    Sessions: started 20 finished 0 failed 0 concurrent 0 of 20 Ident: 105_20
    Sess: 0 1 2 3 4 5 6 7 8 9
[Session grid: rows labelled 0 and 10, one cell per session; every legible cell reads "5 P".]
For each session (robot) the number (5) is the send/match count and the letter (P) is the status code. The status code is set by the rteSleep() call in the robot [e.g. rteSleep(15, 'C');].

review (once started)
Once running, review will display progress:
    [V3.2.12] Controlling PID: 2535 Run Duration: 1800 User: plh
    Started ramp up at: Wed Apr 16 16:04:54 2003 Elapsed: 00:04:27 S
    Sessions: started 20 finished 0 failed 0 concurrent 0 of 20 Ident: 105_20
    Sess: 0 1 2 3 4 5 6 7 8 9
[Session grid: rows labelled 0 and 10, one cell per session; legible cells read "13 A" or "5 i", and some cells are not legible.]

Post-Processing
The running robots each write elapsed and query times into a log file. These files are post-processed to produce the analysis for reports. The post-processing phase produces:
§ Histograms of query response time and task elapsed time.
§ CSV versions of the above data, which are folded into an Excel spreadsheet using a macro.
§ Various graphs (produced in PostScript using GnuPlot and converted into PDF files for distribution).

Post-processing Scripts
• Historically a combination of shell and Perl scripts has been used to rehash the log files to produce the analysis.
• Most recently Perl has been used in conjunction with GnuPlot and Acrobat.

Response Time Histogram
    ***** BuyLookupPPCE_PF6 -PF6 *******************
    No of queries:      580
    Mean:               0.23 sec
    Standard Deviation: 0.81
    Variance:           0.65
    Median:             [value not legible on the slide]
    Mode:               0.0 sec (no = 280)
    Fraction at mode:   0.483 (Mode/Total)

    Percentile Distribution:
          50%  60%  75%  80%  85%  90%  95%  99%
    Time: 0.0  0.1  0.1  0.1  0.1  0.1  0.2  0.8

    Query Response Time Distribution:
    Time: 0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7
    Freq: 280  216   29    2    3    3    3    5
    Time: 0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5
    Freq:   6    3    2    1    3    2    1    1
    Time: 3.8  3.9  4.6  5.2  5.8  6.8  8.3  8.6
    Freq:   1    1    1    1    1    1    1    1
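A summary of this shape can be produced from a list of response times in a few lines. The sketch below is an illustration only: it assumes times are bucketed by rounding to the nearest 0.1 second and uses the nearest-rank method for percentiles, neither of which is confirmed as the method used by the original scripts, and the function name summarize is hypothetical. On the slide, the fraction at mode is simply 280 / 580 ≈ 0.483.

```python
from collections import Counter


def summarize(times):
    """Bucket response times to 0.1 sec and report histogram statistics."""
    # Work in integer tenths of a second to avoid float-key surprises.
    tenths = sorted(int(round(t * 10)) for t in times)
    freq = Counter(tenths)
    # Most frequent bucket; ties go to the smaller time.
    mode, mode_count = max(freq.items(), key=lambda kv: (kv[1], -kv[0]))

    def percentile(p):
        # Nearest-rank percentile for an integer percentage p.
        rank = max(1, -(-p * len(tenths) // 100))
        return tenths[rank - 1] / 10

    return {
        "count": len(tenths),
        "mean": sum(times) / len(times),
        "mode": mode / 10,
        "mode_count": mode_count,
        "mode_fraction": mode_count / len(tenths),
        "percentiles": {p: percentile(p) for p in (50, 75, 90, 95, 99)},
        "histogram": {bucket / 10: n for bucket, n in sorted(freq.items())},
    }


if __name__ == "__main__":
    print(summarize([0.04, 0.12, 0.12, 0.31, 1.42]))
```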

Analysis
Typically the analysis is performed in conjunction with tools that report on system performance during the test runs:
– sar, iostat, netstat, etc.
– Various shell and Perl scripts wrapped around sar
– TeamQuest

Reporting
Reports typically consist of two parts:
– Overall system performance and capability at various workload levels (scenarios).
– Throughput and response time at various load levels (scenarios).

Conclusions
• Labour intensive.
• Complex.
• Fun (if you like breaking systems and software).
• Useful for levelling the playing field in system acquisitions.
• Useful for determining load limits and throughput thresholds and bounds.