
CPE 619 Monitors

Aleksandar Milenković
The LaCASA Laboratory
Electrical and Computer Engineering Department
The University of Alabama in Huntsville
http://www.ece.uah.edu/~milenka
http://www.ece.uah.edu/~lacasa

Part II: Measurement Techniques and Tools

"Measurements are not to provide numbers but insight" - Ingrid Bucher

• Measure computer system performance
  - Monitor the system that is being subjected to a particular workload
  - How to select an appropriate workload
• In general, a performance analyst should know:
  1. What are the different types of workloads?
  2. Which workloads are commonly used by other analysts?
  3. How are the appropriate workload types selected?
  4. How is the measured workload data summarized?
  5. How is the system performance monitored?
  6. How can the desired workload be placed on the system in a controlled manner?
  7. How are the results of the evaluation presented?

Outline

• Introduction
• Terminology
• Software Monitors
• Hardware Monitors
• Monitoring Distributed Systems

Monitors

"That which is monitored improves." - Source unknown

• A monitor is a tool used to observe activities on a system
  - Observe performance
  - Collect performance statistics
  - May analyze the data
  - May display results
  - May even suggest remedies
• Monitors are used not only by performance analysts
  - A systems programmer may profile software
  - A system manager may measure resource utilization to find bottlenecks
  - A system manager may use a monitor to tune the system
  - A system analyst may use a monitor to characterize the workload
  - A system analyst may use a monitor to develop models or inputs for models


Terminology

• Event: a change in the system state
  - E.g., cache miss, page fault, process context switch, beginning of a seek on a disk, arrival of a packet
• Trace: a log of events, usually including the time of the event and other important parameters
• Overhead: most monitors perturb the system operation
  - They use CPU or storage
  - Sometimes called artifact; the goal is to minimize artifact
• Domain: the set of activities observable by the monitor
  - E.g., accounting logs record information about CPU time, number of disks, terminals, networks, paging I/Os, and the number of characters transferred among disks, terminals, networks, and paging devices

Terminology (cont'd)

• Input rate: the maximum frequency of events that the monitor can correctly observe
  - Burst mode: the rate at which events can occur for a short period of time
  - Sustained mode: the rate the monitor can tolerate for long durations
• Resolution: coarseness of the information observed
• Input width: the number of bits recorded for each event
  - Input rate x input width = storage required per unit of time
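The relation between input rate, input width, and storage can be made concrete with a small calculation. The sketch below is only an illustration: the helper names `storage_bytes_per_sec` and `seconds_until_full` are hypothetical, and the numbers in the usage note are made up rather than taken from the slides.

```c
/* Hypothetical helper: bytes of trace data produced per second,
 * given an event rate (events per second) and an input width
 * (bits recorded per event). */
double storage_bytes_per_sec(double events_per_sec, unsigned width_bits)
{
    return events_per_sec * (double)width_bits / 8.0;
}

/* Hypothetical helper: how many seconds a buffer of buffer_bytes
 * lasts at that rate if nothing empties it. Assumes a nonzero rate. */
double seconds_until_full(double buffer_bytes, double events_per_sec,
                          unsigned width_bits)
{
    return buffer_bytes / storage_bytes_per_sec(events_per_sec, width_bits);
}
```

For example, 10,000 events per second at 64 bits per event produce 80,000 bytes per second, so an 8,000,000-byte buffer would fill in 100 seconds.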

Monitor Classification

• Implementation level
  - Software, Hardware, Firmware, Hybrid
• Trigger mechanism
  - Event driven: activated only by the occurrence of certain events
    - Low overhead for rare events, but higher if the event is frequent
  - Sampling (timer driven): activated at fixed time intervals by clock interrupts
    - Ideal for frequent events
• Display
  - On-line: provide data continuously. E.g., tcpdump
  - Batch: collect data for later analysis. E.g., gprof


Software Monitors

• Monitor operating systems and higher-level software, e.g., networks, databases
• At each activation, several instructions are executed
  - In general, only suitable for low-frequency events, or overhead becomes too high
  - Overhead may be OK if timing does not need to be preserved
• Lower input rates, lower resolutions, and higher overhead than hardware monitors
• But they have higher input widths and higher recording capacities
• Easier to develop and modify

Issues in Software Monitor Design: Activation Mechanism

• How to trigger the data collection routine?
• 1) Trap: instrument the system software with trap instructions at appropriate points. Collect data. Like a subroutine.
  - E.g., to measure I/O service time, trap before the I/O service routine and record the time, trap after, and take the difference
• 2) Trace: each instruction is followed by the data collection routine (trace mode). Enormous overhead. Time insensitive.
  - E.g., an instruction-trace monitor to produce a PC histogram
• 3) Timer interrupt: a timer interrupt service provided by the OS is used to transfer control to a data collection routine at fixed intervals.
  - Overhead is independent of the event rate
  - If sampling a counter, beware of overflows
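The warning about counter overflow under timer-driven sampling can be illustrated with a short sketch. It assumes a free-running 32-bit event counter that wraps around at most once between two consecutive samples; the function name `counter_delta` is hypothetical.

```c
#include <stdint.h>

/* Number of events between two samples of a free-running 32-bit counter.
 * Unsigned subtraction in C is performed modulo 2^32, so the result is
 * still correct when the counter wrapped once between the samples.
 * If it can wrap more than once per sampling interval, events are
 * silently undercounted and the interval must be shortened. */
uint32_t counter_delta(uint32_t previous, uint32_t current)
{
    return (uint32_t)(current - previous);
}
```

A sampling routine would keep the previous reading and add each delta to a wider (for example 64-bit) running total.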

Issues in Software Monitor Design: Buffer Size

• Store recorded data in buffers in memory, which are later written to hard disk
• Buffers should be large
  - To minimize the need to write frequently to hard disk
• Buffers should be small
  - So that there is not a lot of overhead when writing to disk
  - So that the monitor does not impact system performance (or the reduced memory availability is not observable)
• Optimal buffer size is a function of the input rate, input width, and emptying rate

Issues in Software Monitor Design: Number of Buffers

• Usually organized in a ring
• Allows the recording (buffer-emptying) process to proceed at a different rate than the monitoring (buffer-filling) process
  - Monitoring may be bursty
• Since a buffer cannot be read while the monitoring process is writing it, a minimum of two buffers is required for concurrent access
• May be circular for writing, so the monitor overwrites the last buffer if the recording process is too slow
• May compress to reduce space, but this adds overhead

Issues in Software Monitor Design: Buffer Overflow

• In spite of a ring, all buffers could become full
• Two options (both result in information loss)
  - Overwrite a previously written buffer
    - Old information is lost
  - Stop monitoring until a buffer becomes available
    - New information is lost
• Trade-off: old vs. new information importance
• Counter overflows
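The two overflow options can be sketched on a small ring. This is a simplified, single-threaded illustration that stores one event per slot instead of whole buffers; the type and function names (`event_ring`, `ring_put`, `ring_get`) and the size `RING_SLOTS` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 4   /* illustrative size only */

enum overflow_policy { OVERWRITE_OLDEST, DROP_NEWEST };

struct event_ring {
    uint32_t slot[RING_SLOTS];
    size_t head;    /* index of the oldest stored event */
    size_t count;   /* number of stored events */
    size_t lost;    /* events lost because the ring was full */
    enum overflow_policy policy;
};

void ring_init(struct event_ring *r, enum overflow_policy policy)
{
    r->head = 0;
    r->count = 0;
    r->lost = 0;
    r->policy = policy;
}

/* Monitoring side: record one event.
 * Returns 1 if the event was stored, 0 if it was dropped. */
int ring_put(struct event_ring *r, uint32_t event)
{
    if (r->count == RING_SLOTS) {
        r->lost++;
        if (r->policy == DROP_NEWEST)
            return 0;                          /* new information is lost */
        r->head = (r->head + 1) % RING_SLOTS;  /* old information is lost */
        r->count--;
    }
    r->slot[(r->head + r->count) % RING_SLOTS] = event;
    r->count++;
    return 1;
}

/* Recording side: remove the oldest event.
 * Returns 1 on success, 0 if the ring is empty. */
int ring_get(struct event_ring *r, uint32_t *event)
{
    if (r->count == 0)
        return 0;
    *event = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;
    r->count--;
    return 1;
}
```

Keeping a `lost` count in either policy lets the analyst see how much information was discarded. A real monitor with concurrent filling and emptying would also need synchronization, which this sketch leaves out.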

Issues in Software Monitor Design: Misc

• Data Compression or Analysis
  - Online compression/processing before storing to reduce storage requirements
• On/Off
  - Most hardware monitors have an on/off switch
  - Software can have "if ... then", but there is still some overhead. Or can "compile out"
    - E.g., remove the "-pg" flag
    - E.g., with #define and #ifdef
• Priority
  - If the monitor runs asynchronously, keep its priority low. If timing matters, the priority must be sufficiently high that the monitor does not cause skew
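The "compile out" option with #define and #ifdef might look like the following. This is a minimal sketch: the macro names `ENABLE_MONITOR`, `MONITOR_EVENT`, and `MONITOR_COUNT` and the workload function `sum_below` are hypothetical and chosen only for illustration.

```c
/* Build with -DENABLE_MONITOR to turn the probes on.
 * Without it, the probe macro expands to a no-op and costs nothing. */
#ifdef ENABLE_MONITOR
static unsigned long monitor_events;             /* number of probe hits */
#define MONITOR_EVENT() ((void)(monitor_events++))
#define MONITOR_COUNT() (monitor_events)
#else
#define MONITOR_EVENT() ((void)0)
#define MONITOR_COUNT() (0UL)
#endif

/* Hypothetical workload: sums 0..n-1 and hits the probe once per iteration. */
unsigned long sum_below(unsigned long n)
{
    unsigned long total = 0;
    unsigned long i;
    for (i = 0; i < n; i++) {
        MONITOR_EVENT();
        total += i;
    }
    return total;
}
```

Compared with a run-time "if ... then" switch, this removes the test itself from the monitored code, at the price of needing a rebuild to switch monitoring on or off.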


Hardware Monitors

• Hardware monitors: separate pieces of equipment attached to the system being monitored via probes
  - No system resources are consumed in monitoring
• Generally, lower overhead, higher input rate, reduced chance of introducing bugs
• Can increment counters, compare values, employ timers, record histograms of observed values ...
  - Range from simple logic elements and counters to sophisticated computer systems
• Usually have gone through several generations and testing, so they are robust

Software vs. Hardware Monitors

• What level of detail to measure?
  - Software is more limited to system layer code (OS, device driver) or application level and above
  - Hardware may not be able to get that higher-level information
• What is the input rate? Hardware tends to be faster
• Expertise?
  - Good knowledge of hardware is needed for a hardware monitor
  - Good knowledge of the software system (programmer) is needed for a software monitor
• Most hardware monitors can work with a variety of systems, but software monitors may be system specific
• Most hardware monitors work when there are bugs, but software monitors are brittle
• Hardware monitors are more expensive

Firmware and Hybrid Monitors

• Firmware monitors fall between hardware and software monitors
• Implemented by modifying the processor microcode
• Hybrid: combines hardware, firmware, and software monitoring
  - E.g., use hardware components to capture events and software modules to compress/analyze collected data


Monitoring Distributed Systems

Distributed system: many hardware and software components working together separately and concurrently

• More difficult than monitoring a single computer system
• Monitor itself must be distributed
• Easiest with a layered view of monitors
  - May be zero or more components of each layer
  - Many-to-many relationship between layers

Layered view of a distributed-system monitor (top to bottom):

• Management
• Console
• Interpretation
• Presentation
• Analysis
• Collection
• Observation

Layered View

• Observation: gathers raw data on individual components of the system; each component may have an observer designed specifically for it
• Collection: collects data from various observers; there may be more than one collector on large systems
• Analysis: analyzes data gathered at various collectors. May include various statistical routines to summarize the data characteristics
• Presentation: deals with the human user interface (reports, displays, alarms)
• Interpretation: intelligent entity (human or expert system) that can make meaningful interpretations of the data (more sophisticated than simple threshold-based rules)
• Console: interface to control the system parameters and states (outside the monitor)
• Management: entity that makes the decision to set or change system parameters or configuration (manager). Implements decisions using consoles.

Components of a Distributed Systems Monitor

[Figure: Subsystems 1-3; Observers 1-3; Collectors 1-2; Analyzers 1-2; Presenters 1-2; Interpreters 1-2; Consoles 1-2; Managers 1-2; Human Beings]

Observation (1 of 2)

• Concerned with data gathering
• Implicit spying: promiscuously observing the activity on the bus or network link
  - Little impact on the existing system
  - Accompany with filters that can ignore some events
  - E.g., tcpdump between two IP addresses
• Explicit instrumentation: incorporating trace points, hooks, ...
  - Adds overhead, but can augment implicit data
  - E.g., may have application hooks logging when data is sent

Observation (2 of 2)

• Probing: making "feeler" requests to see performance
  - E.g., packet pair techniques to gauge capacity (a special packet sent to a given destination and looped back may provide information about queuing at the source, intermediate bridges, the destination, and back)
• There is overlap between the three techniques, but they are not totally redundant; often one shows a part of the system that the others cannot

Collection

• Data gathering component, perhaps from several observers
  - E.g., I/O and network observers on one host could go to one collector for the system
• May have different collectors share the same observers
• Collectors can poll observers for data
• Or observers can advertise when they have data
• Clock synchronization can be an issue
  - Usually aggregate over a large interval to account for skew

Analysis

• More sophisticated than the collector
• Division of labor is unclear, but usually: if the processing is fast and infrequent, put it in the observer; if it takes more processing time, put it in the analyzer
• Or, if it requires aggregate data, put it in the analyzer
  - E.g., if the successful transaction rate depends upon the disk error rate and the network error rate, then the analyzer needs data from multiple observers
• General philosophy: simplify observers and push complexity to analyzers

Presentation (1 of 2)

• User interface, closely tied with monitor function
• Three key functions
• 1) Performance monitoring: helps quantify if the service provided is correct
  - Throughput, response time, utilization of different components
  - Summary statistics
  - Time-stamped traces

Presentation (2 of 2)

• 2) Error monitoring: incorrect performance
  - Error statistics, counts, or traces
  - Maybe sort to help determine what part of the system is unreliable
• 3) Configuration monitoring: non-performance of the system components
  - Tell which are up
  - Show initial configurations
  - May show only incremental configurations
  - Scope to allow zoom or whole system

Interpretation and Console

• Interpreter: uses a set of rules to make judgments about the state of the system
  - Often need an expert system to warn about faults before they occur
  - May suggest configuration changes
• Console functions: allow the system manager to change the system, bring it up and down, and allow remote diagnostics
  - Ideally, one console can get feedback and apply configuration, but some parts may be vendor specific

Real-World Examples


Performance Tuning

• Performance tuning steps
  1) Define the performance problem
  2) Identify the bottlenecks using monitoring and measurement tools
  3) Remove bottlenecks by applying a tuning methodology
  4) Repeat steps 2 and 3 until you find a satisfactory resolution

Measuring Execution Time

• No changes to the program
  - date
  - time
• Added to the program code directly
  - clock
  - gettimeofday
• Program profilers
  - gprof

Using the date Command

• Read ~/docs/performance.measurement.txt
  * To learn more about the date command, type in man date.

    sr4$ date && dsize 12 && date
    Thu Jan 11 16:04:58 CST 2007
    -1473822656 TOT_INS: 490005749
    Thu Jan 11 16:04:59 CST 2007

    sr4$ date && dsize 24 && date
    Thu Jan 11 16:08:16 CST 2007
    1529910656 TOT_INS: 946006155
    Thu Jan 11 16:08:18 CST 2007

    sr4$ date && dsize 36 && date
    Thu Jan 11 16:07:39 CST 2007
    1604971008 TOT_INS: 1402006388
    Thu Jan 11 16:07:42 CST 2007

Using the time Command

• Read ~/docs/performance.measurement.txt
  * To learn more about the time command, type in man time.

    sr4$ time dsize 12
    -1473822656 TOT_INS: 490005733

    real    0m1.217s
    user    0m1.040s
    sys     0m0.090s

    sr4$ time dsize 24
    1529910656 TOT_INS: 946006063

    real    0m2.154s
    user    0m1.980s
    sys     0m0.070s

    sr4$ time dsize 36
    1604971008 TOT_INS: 1402006545

    real    0m3.084s
    user    0m2.930s
    sys     0m0.090s

Using the clock() Function

• The clock() function allows you to measure the processor time spent in a section of a program
  * To learn more about the clock() function, type in man clock
  * A typical program template for using the clock() function:

    #include <time.h>
    ...
    int main(void) {
        clock_t start_time, finish_time;
        ...
        // determine the overhead of the two clock() calls themselves
        start_time = clock();
        finish_time = clock();
        double delay_time = (double) (finish_time - start_time);
        ...
        start_time = clock();
        ... // code you want to determine the execution time for
        finish_time = clock();
        // elapsed clock ticks, with the measurement overhead subtracted
        double elapsed_time = (double) (finish_time - start_time) - delay_time;
        double elapsed_time_sec = elapsed_time / CLOCKS_PER_SEC;
        ...
    }

Using the gettimeofday() function

• To learn more about this function, type in man gettimeofday
• The function gettimeofday fills in a struct timeval with two integers
  - The first one (tv_sec) is the number of seconds since January 1, 1970
  - The second (tv_usec) is the number of microseconds since the most recent second boundary
• A sample program that uses gettimeofday():

    #include <stdio.h>
    #include <unistd.h>     /* sleep() */
    #include <sys/time.h>   /* gettimeofday(), struct timeval */

    struct timeval start, finish;
    long msec;

    int main(void) {
        gettimeofday(&start, NULL);
        sleep(200);   /* wait ~200 seconds */
        gettimeofday(&finish, NULL);
        /* subtract first, then scale, so intermediate values stay small */
        msec = (finish.tv_sec - start.tv_sec) * 1000
             + (finish.tv_usec - start.tv_usec) / 1000;
        printf("Time: %ld milliseconds\n", msec);
        return 0;
    }
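The elapsed-time arithmetic on two gettimeofday() readings can be isolated in a helper, which makes it easy to check without sleeping. This is a sketch under the assumption that both readings come from the same clock and that finish is not earlier than start; the name `elapsed_ms` is hypothetical.

```c
#include <sys/time.h>   /* struct timeval */

/* Milliseconds elapsed between two gettimeofday() readings.
 * The differences are combined in microseconds using long long, so the
 * result does not depend on the width of time_t or int, and a finish
 * reading with a smaller tv_usec than start is handled correctly. */
long long elapsed_ms(const struct timeval *start, const struct timeval *finish)
{
    long long usec = (long long)(finish->tv_sec - start->tv_sec) * 1000000LL
                   + (long long)(finish->tv_usec - start->tv_usec);
    return usec / 1000;
}
```

For instance, a start of 10 s + 900,000 us and a finish of 12 s + 100,000 us are 1,200 ms apart, even though the microsecond field decreased.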

Program Profiling

• Profilers are utility programs used to determine execution profiles; in other words, they tell us how much time is spent in each subroutine or function
• The 10-90 rule of thumb states that 10% of your code is responsible for 90% of the program execution time
• Tuning the most time-consuming subroutines that dominate execution time can be very rewarding (assuming that we do this right)
• The profiler collects the data during the program's execution
• Typical steps in profiling are as follows:
  - enable profiling when compiling and linking the program
  - a profiling data file is generated when the program is executed
  - profiling data are analyzed using gprof

Example: gprof

An excerpt from testsort.report:

    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ...
    granularity: each sample hit covers 4 byte(s) for 0.05% of 21.18 seconds

       %   cumulative     self                   self      total
    time      seconds  seconds      calls     ms/call    ms/call  name
    47.2         9.99     9.99                                    internal_mcount [5]
    36.0        17.61     7.62    5894908        0.00       0.00  partition [4]
    11.7        20.08     2.47   70536890        0.00       0.00  swap [6]
     2.1        20.52     0.44          1      440.00   10530.00  quicksort [3]
     1.6        20.86     0.34   10000000        0.00       0.00  rand [8]
     0.8        21.02     0.16          1      160.00     500.00  fillArray [7]
     0.8        21.18     0.16                                    _mcount (665)
     0.0        21.18     0.00         24        0.00       0.00  _return_zero [329]
     0.0        21.18     0.00         12        0.00       0.00  _mutex_unlock [330]
     0.0        21.18     0.00          3        0.00       0.00  mutex_lock [9]
     0.0        21.18     0.00          2        0.00       0.00  atexit [10]
     0.0        21.18     0.00          1        0.00       0.00  get_mem [11]

The excerpt also lists free_mem [12] and _atexit_init [331], each with 0.00 total ms/call.

PAPI Interface

• Read PAPI documentation at http://www.ece.uah.edu/~milenka/cpe61908S/docs/papi.README.ver2.s07.txt

Tuning Example

• sample1.c: prints the prime numbers up to 50,000
• Optimize it using gprof

    #include <stdlib.h>
    #include <stdio.h>

    int prime(int num);

    int main() {
        int i;
        int colcnt = 0;
        for (i = 2; i <= 50000; i++)
            if (prime(i)) {
                colcnt++;
                if (colcnt % 9 == 0) {
                    printf("%5d\n", i);   /* end the row after 9 columns */
                    colcnt = 0;
                } else
                    printf("%5d ", i);
            }
        putchar('\n');
        return 0;
    }

    int prime(int num) {
        /* check to see if the number is a prime? */
        int i;
        for (i = 2; i < num; i++)
            if (num % i == 0)
                return 0;
        return 1;
    }

Tuning Example (cont'd)

• Compile sample1.c using the -pg option
• Run the program, then run gprof -b ./sample1
• Analyze output => almost all time is spent in the prime routine
• Use gcov to look at the actual number of times each line of the program was executed (hot spots)

Tuning Example (cont'd)

• sample2.c: use sqrt to reduce the number of operations in the hot spot
• Repeat steps, measure performance

    #include <stdlib.h>
    #include <stdio.h>
    #include <math.h>   /* sqrt(); link with -lm */

    int prime(int num);
    int faster(int num);

    int main() {
        int i;
        int colcnt = 0;
        for (i = 2; i <= 50000; i++)
            if (prime(i)) {
                colcnt++;
                if (colcnt % 9 == 0) {
                    printf("%5d\n", i);   /* end the row after 9 columns */
                    colcnt = 0;
                } else
                    printf("%5d ", i);
            }
        putchar('\n');
        return 0;
    }

    int prime(int num) {
        /* check to see if the number is a prime? */
        int i;
        /* only divisors up to sqrt(num) need to be tried */
        for (i = 2; i <= faster(num); i++)
            if (num % i == 0)
                return 0;
        return 1;
    }

    int faster(int num) {
        return (int) sqrt((float) num);
    }
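One possible further step, not shown in the slides, is to avoid calling sqrt() on every loop iteration. The sketch below uses an integer-only bound instead; `prime_fast` and `count_primes_up_to` are hypothetical names, and whether this is actually faster than sample2.c should be checked with gprof in the same way as the earlier versions.

```c
/* Trial division with an integer-only loop bound.
 * For positive ints, i <= num / i holds exactly when i * i <= num,
 * and it cannot overflow the way i * i could for large num. */
int prime_fast(int num)
{
    int i;
    if (num < 2)
        return 0;
    for (i = 2; i <= num / i; i++)
        if (num % i == 0)
            return 0;
    return 1;
}

/* Counts the primes in 2..limit, as a quick consistency check
 * against the other versions of the program. */
int count_primes_up_to(int limit)
{
    int i;
    int count = 0;
    for (i = 2; i <= limit; i++)
        count += prime_fast(i);
    return count;
}
```

Comparing `count_primes_up_to(50000)` across versions is a cheap way to confirm that a tuning change did not alter the program's results.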

Homework #3

• Read chapters 7 (and 8)
• Read documents in /docs directory
  - performance.measurements.txt
  - papi.README.ver2.s07.txt
• Write a program that prints the first N prime numbers (N should be input from the command line)
  - Measure execution time using the time command
  - Measure execution time using the clock() function
  - Measure the number of clock cycles the program takes using PAPI
  - Profile the program using gcov and gprof
• Due: Monday, February 4, 2008, 12:45 PM
• Submit by email to the instructor with subject "CPE619-HW3"
• Name file as: FirstName.SecondName.CPE619.HW3.doc